Towards End-to-End Raw Audio Music Synthesis

Eppe, Manfred; Alpay, Tayfun; Wermter, Stefan

doi:10.1007/978-3-030-01424-7_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11141)

Included in the following conference series:

International Conference on Artificial Neural Networks

Abstract

In this paper, we address the problem of automated music synthesis using deep neural networks and ask whether neural networks are capable of realizing timing, pitch accuracy and pattern generalization for automated music generation when processing raw audio data. To this end, we present a proof of concept and build a recurrent neural network architecture capable of generalizing appropriate musical raw audio tracks.

Acknowledgments

The authors gratefully acknowledge partial support from the German Research Foundation (DFG) under project CML (TRR 169) and from the European Union under project SECURE (No. 642667).

Author information

Authors and Affiliations

Knowledge Technology, Department of Informatics, University of Hamburg, Vogt-Koelln-Str. 30, 22527, Hamburg, Germany
Manfred Eppe, Tayfun Alpay & Stefan Wermter

Corresponding author

Correspondence to Manfred Eppe.

Editor information

Editors and Affiliations

Czech Academy of Sciences, Prague 8, Czech Republic
Věra Kůrková
Open University of Cyprus, Latsia, Cyprus
Yannis Manolopoulos
CITEC Bielefeld University, Bielefeld, Germany
Barbara Hammer
Democritus University of Thrace, Xanthi, Greece
Lazaros Iliadis
University of Piraeus, Piraeus, Greece
Ilias Maglogiannis

About this paper

Cite this paper

Eppe, M., Alpay, T., Wermter, S. (2018). Towards End-to-End Raw Audio Music Synthesis. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science, vol 11141. Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_14

DOI: https://doi.org/10.1007/978-3-030-01424-7_14
Published: 27 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01423-0
Online ISBN: 978-3-030-01424-7
eBook Packages: Computer Science, Computer Science (R0)
