Abstract
In this paper, we address the problem of automated music synthesis using deep neural networks. We ask whether such networks can realize timing, pitch accuracy, and pattern generalization for music generation when processing raw audio data. To this end, we present a proof of concept: a recurrent neural network architecture capable of generalizing appropriate musical raw audio tracks.
Acknowledgments
The authors gratefully acknowledge partial support from the German Research Foundation (DFG) under project CML (TRR 169) and from the European Union under project SECURE (No. 642667).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Eppe, M., Alpay, T., Wermter, S. (2018). Towards End-to-End Raw Audio Music Synthesis. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. ICANN 2018. Lecture Notes in Computer Science, vol 11141. Springer, Cham. https://doi.org/10.1007/978-3-030-01424-7_14
DOI: https://doi.org/10.1007/978-3-030-01424-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-01423-0
Online ISBN: 978-3-030-01424-7
eBook Packages: Computer Science (R0)