
Pitch contours curve frequency domain fitting with vocabulary matching based music generation


Abstract

In this paper, we present a new perspective on music generation. To our knowledge, the proposed method is the first to exploit the frequency-domain characteristics of the pitch contour curve to generate melodies with a controllable long-term structure. The music it generates exhibits a long-term structure that basic music generation methods lack, and the approach can be combined with other generation methods to improve their long-term structure. The method first uses a neural network to fit the pitch contour curve in the frequency domain, then applies vocabulary matching to refine the detailed time-domain characteristics of the generated melody and to control the long-term trend of the generated notes according to label information, and finally produces a melody with a realistic and controllable long-term structure. Extensive experiments show that, compared with music generated by an LSTM baseline, the music generated by the proposed method has a better long-term structure while retaining similar statistical characteristics.
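To make the pipeline described above concrete, the following is a minimal, illustrative sketch rather than the authors' implementation: it treats a melody's pitch contour as a time series of MIDI pitches, keeps only the low-frequency components of its spectrum as a simple stand-in for the paper's neural-network fit of the contour in the frequency domain, and then snaps the reconstructed contour to an allowed note vocabulary as a basic form of vocabulary matching. The function names, the low-pass truncation, and the toy C-major vocabulary are assumptions made for illustration only.

```python
# Minimal sketch of the abstract's pipeline (assumptions, not the paper's code):
# 1) represent a melody's pitch contour as a time series of MIDI pitches,
# 2) move to the frequency domain and keep only low-frequency components
#    (a placeholder for the paper's neural-network fit of the spectrum),
# 3) reconstruct a smooth long-term contour, and
# 4) "vocabulary matching": map each reconstructed value to the nearest
#    pitch in an allowed note vocabulary to recover playable notes.
import numpy as np


def fit_contour_frequency_domain(contour, n_keep=8):
    """Low-pass reconstruction of the pitch contour via the real FFT.

    The paper fits the spectrum with a neural network; truncating
    high-frequency coefficients is used here as a simple placeholder.
    """
    spectrum = np.fft.rfft(contour)
    spectrum[n_keep:] = 0.0            # keep only the long-term trend
    return np.fft.irfft(spectrum, n=len(contour))


def vocabulary_match(contour, vocabulary):
    """Map each contour value to the nearest pitch in the note vocabulary."""
    vocab = np.asarray(sorted(vocabulary))
    idx = np.abs(contour[:, None] - vocab[None, :]).argmin(axis=1)
    return vocab[idx]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy pitch contour: a slow arch (long-term structure) plus local noise.
    t = np.linspace(0, 1, 64)
    contour = 60 + 12 * np.sin(np.pi * t) + rng.normal(0, 1.5, t.size)

    trend = fit_contour_frequency_domain(contour, n_keep=4)
    # C major pitches around middle C as a toy note vocabulary (assumption).
    c_major = [57, 59, 60, 62, 64, 65, 67, 69, 71, 72, 74]
    melody = vocabulary_match(trend, c_major)
    print(melody)
```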




Author information

Corresponding author

Correspondence to Songhao Zhu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Fig. 20 Examples of the spectrum

Appendix 2

Fig. 21 Long-term structure of the label-conditioned generation

Appendix 3

Fig. 22 Frequency-domain spectrum after vocabulary matching

About this article

Cite this article

Lang, R., Zhu, S. & Wang, D. Pitch contours curve frequency domain fitting with vocabulary matching based music generation. Multimed Tools Appl 80, 28463–28486 (2021). https://doi.org/10.1007/s11042-021-11049-x

