Abstract
Generating expressive musical performance (EMP) is an active topic in music generation: music played by humans is consistently more expressive than music produced by machines. Closing this gap requires understanding the role human performance plays in the production of music. This paper proposes a performance style transfer model that learns human performance style and implements an EMP system. The model is built on generative adversarial networks (GANs) and takes as input a multi-channel image composed of four elaborated spectrograms, which is used to decompose and reconstruct music audio. To stabilize training, we design a multi-channel consistency loss for the GANs. Furthermore, given the lack of objective evaluation criteria for music generation, we propose a hybrid evaluation method that combines qualitative and quantitative measures of how well human needs are satisfied; three quantitative criteria are defined at the feature and audio levels, respectively. The effectiveness of our method is verified on a public dataset through objective evaluation, which demonstrates that it is comparable to state-of-the-art algorithms. Subjective evaluations are also conducted through visual analyses of both audio content and style. Finally, we conduct a musical Turing test in which subjects score the performance of the generated music. The experimental results show that our method is highly competitive.
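The abstract does not spell out the form of the multi-channel consistency loss. As a hedged illustration only, assuming it is an L1 reconstruction term averaged over the four spectrogram channels (the function name, channel weighting, and tensor layout below are our own assumptions, not the paper's definition), a minimal sketch might look like:

```python
import numpy as np

def multichannel_consistency_loss(x, x_rec, weights=None):
    """Illustrative L1 consistency loss over a stacked spectrogram image.

    x, x_rec : arrays of shape (channels, freq, time) -- e.g. the
               4-channel spectrogram input and its GAN reconstruction.
    weights  : optional per-channel weights; defaults to a uniform average.
    """
    if weights is None:
        weights = np.full(x.shape[0], 1.0 / x.shape[0])
    # Mean absolute error computed per channel, then combined.
    per_channel = np.abs(x - x_rec).mean(axis=(1, 2))
    return float(np.dot(weights, per_channel))

# Toy 4-channel "spectrogram image": reconstruction off by 1.0 everywhere.
x = np.zeros((4, 8, 8))
print(multichannel_consistency_loss(x, x + 1.0))  # → 1.0
```

Penalizing each channel separately before averaging keeps any one spectrogram representation from dominating the reconstruction objective, which is one plausible reason such a loss would stabilize adversarial training.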
Availability of data and materials
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Notes
https://www.piano-e-competition.com/
Funding
This work is supported by the Technical Innovation Major Project of Hubei Province, China, under Grant 2020AEA010 and the Natural Science Foundation of Hubei Province, China, under Grant 2020CFA031 and Grant 2019CFB581.
Author information
Authors and Affiliations
Contributions
ZX and XC wrote the main manuscript text and LZ polished the manuscript. All authors reviewed the manuscript.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests as defined by Springer, nor any other interests that might be perceived to influence the results and/or discussion reported in this paper.
Ethical approval
This declaration is not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Xiao, Z., Chen, X. & Zhou, L. Music performance style transfer for learning expressive musical performance. SIViP 18, 889–898 (2024). https://doi.org/10.1007/s11760-023-02788-5