Abstract
With the development of artificial intelligence and deep learning, many music generation methods have been proposed, and the Transformer has recently been widely adopted for the task. However, the structural complexity of music places higher demands on generation models. In this paper, we propose a new automatic music generation network that combines a Recursive Skip Connection with Layer Normalization (RSCLN) module, a Transformer-XL model, and a multi-head attention mechanism. Our method not only alleviates the vanishing-gradient problem during training, but also strengthens the model's ability to capture long-range dependencies within a piece, so that the generated works stay closer to the style of the original music. The effectiveness of the RSCLN_Transformer-XL automatic music generation method is verified through similarity evaluation experiments based on musical structure similarity and through a listening test. The experimental results show that the RSCLN_Transformer-XL model generates better music than the Transformer-XL model.
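The abstract does not spell out the RSCLN computation. The following minimal NumPy sketch shows one plausible reading of a recursive skip connection with layer normalization, in which LN(x + f(x)) is applied repeatedly so that every recursion re-exposes a short gradient path around the sublayer. The function names, the recursion depth, and the toy linear sublayer standing in for attention are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize the last dimension to zero mean and unit variance."""
    mu = x.mean(axis=-1, keepdims=True)
    sigma = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def rscln_block(x, sublayer, depth=2):
    """Recursive skip connection with layer normalization (assumed form):
    repeat x <- LN(x + f(x)), so each recursion adds a residual shortcut."""
    for _ in range(depth):
        x = layer_norm(x + sublayer(x))
    return x

# Toy sublayer: a fixed linear map standing in for an attention/FFN block.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) * 0.1
out = rscln_block(rng.standard_normal((4, 8)), lambda h: h @ W)
print(out.shape)
```

Because the output of every recursion passes through layer normalization, each feature vector is re-centered and re-scaled, which is the property usually credited with stabilizing training in deep residual stacks.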
Data Availability
The dataset used during the current study can be obtained from reference [26].
References
LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
Hemalatha, E.: Artificial music generation using LSTM networks. Int. J. Eng. Adv. Technol. 9(2), 4315–4319 (2019)
Ebcioğlu, K.: An expert system for harmonizing chorales in the style of J. S. Bach. J. Logic Program. 8(1), 145–185 (1990)
Salas, H., Gelbukh, A., Calvo, H.: Automatic music composition with simple probabilistic generative grammars. Polibits. 44(9), 59–65 (2011)
Feng, Y., Zhou, C.L.: Advances in algorithmic composition. J. Software. 10(2), 209–215 (2006)
Cao, X.Z., Zhang, A.L., Xu, J.C.: Intelligent music composition technology research based on genetic algorithm. Comput. Eng. Appl. 44(32), 206–209 (2008)
Todd, P.M.: A connectionist approach to algorithmic composition. Comput. Music. J. 13(4), 27–43 (1989)
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Eck, D., Schmidhuber, J.: A first look at music composition using LSTM recurrent neural networks. Technical Report, Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA) (2002)
Li, S., Sung, Y.: INCO-GAN: variable-length music generation method based on inception model-based conditional GAN. Mathematics. 9(4), 102–110 (2021)
Dong, H.W., Hsiao, W.Y., Yang, L.C., Yang, Y.H.: MuseGAN: multi-track sequential generative adversarial networks for symbolic music generation and accompaniment. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence, pp. 212–225 (2018)
Yang, L.C., Chou, S.Y., Yang, Y.H.: MidiNet: a convolutional generative adversarial network for symbolic-domain music generation. In: Proceedings of the 18th International Society for Music Information Retrieval Conference, pp. 324–331 (2017)
Vaswani, A., Shazeer, N., Parmar, N., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems 30, pp. 5998–6008 (2017)
Deng, X., Chen, S.J., Chen, Y.F., Xu, J.: Multi-level convolutional transformer with adaptive ranking for semi-supervised crowd counting. In: Proceedings of the 4th International Conference on Algorithms, Computing and Artificial Intelligence, pp. 28–34 (2021)
Huang, C., Vaswani, A., Uszkoreit, J., Shazeer, N., Simon, I., Hawthorne, C., Dai, A., Hoffman, M., Dinculescu, M., Eck, D.: Music transformer: generating music with long-term structure. In: International Conference on Learning Representations, pp. 364–375 (2019)
Huang, Y.S., Yang, Y.H.: Pop music transformer: beat-based modeling and generation of expressive pop piano compositions. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 1180–1188 (2020)
Choi, K., Hawthorne, C., Simon, I., Dinculescu, M., Engel, J.: Encoding musical style with transformer autoencoders. In: International Conference on Machine Learning, pp. 254–267 (2020)
Wu, S.L., Yang, Y.H.: The Jazz Transformer on the front line: exploring the shortcomings of AI-composed music through quantitative measures. In: Proceedings of the 21st International Society for Music Information Retrieval Conference, pp. 451–463 (2020)
Donahue, C., Mao, H.H., Li, Y.E., Cottrell, G.W., McAuley, J.: LakhNES: improving multi-instrumental music generation with cross-domain pre-training. In: Proceedings of the 20th International Society for Music Information Retrieval Conference, pp. 685–692 (2019)
Oore, S., Simon, I., Dieleman, S., et al.: This time with feeling: learning expressive musical performance. Neural Comput. Appl. 32(4), 955–967 (2020)
Dai, Z., Yang, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.: Transformer-XL: Attentive language models beyond a fixed-length context. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 2978–2988 (2019)
Liu, F., Ren, X., Zhang, Z., Sun, X., Zou, Y.: Rethinking Skip Connection with Layer Normalization in Transformers and ResNets. In: Proceedings of the 28th International Conference on Computational Linguistics, pp. 1324–1332 (2020)
Zhang, B., Sennrich, R.: Root mean square layer normalization. In: Advances in Neural Information Processing Systems 32, pp. 12360–12371 (2019)
Xiong, R., Yang, Y., He, D., Zheng, K., Liu, T.Y.: On layer normalization in the transformer architecture. In: International Conference on Machine Learning, pp. 10524–10533 (2020)
Shaw, P., Uszkoreit, J., Vaswani, A.: Self-attention with relative position representations. In: Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 464–468 (2018)
Huang, Y.S., Yang, Y.H.: Pop music transformer: beat-based modeling and generation of expressive pop piano compositions. In: Proceedings of the 28th ACM International Conference on Multimedia, pp. 1180–1188 (2020)
Ma, N., Zhang, X., Liu, M., et al.: Activate or not: learning customized activation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021)
Kingma, D., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(4), 623–656 (1948)
Levitin, D.J.: This is your brain on music: the science of a human obsession. Plume/Penguin, New York (2006)
Acknowledgements
This work is partially supported by the National Natural Science Foundation of China (No. 62377034, 11872036), the Shaanxi Key Science and Technology Innovation Team Project (No. 2022TD-26), the Fundamental Research Fund for the Central Universities (No. GK202101004, GK202205035), the Science and Technology Plan of Xi’an city (No. 22GXFW0020), Shaanxi Science and Technology Plan Project (No. 2023YBGY158), and the Key Laboratory of the Ministry of Culture and Tourism (No. 2023-02).
Author information
Authors and Affiliations
Contributions
YZ contributed to conceptualization, resources, validation, supervision, and writing—review and editing. XL contributed to methodology, software, visualization, and writing—original draft. QL contributed to methodology and writing—review and editing. XW and HY contributed to supervision, validation, and writing—review and editing. YS was involved in writing—review and editing.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Communicated by J. Gao.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, Y., Lv, X., Li, Q. et al. An automatic music generation method based on RSCLN_Transformer network. Multimedia Systems 30, 4 (2024). https://doi.org/10.1007/s00530-023-01245-0