Polyphonic music generation generative adversarial network with Markov decision process

Huang, Wenkai; Xue, Yihao; Xu, Zefeng; Peng, Guanglong; Wu, Yu

doi:10.1007/s11042-022-12925-w

Published: 05 April 2022

Multimedia Tools and Applications, volume 81, pages 29865–29885 (2022)

Wenkai Huang ORCID: orcid.org/0000-0003-3111-7511¹,
Yihao Xue²,
Zefeng Xu²,
Guanglong Peng² &
…
Yu Wu³

Abstract

Polyphonic music composition combines two or more independent melodies through technical treatment. However, because polyphonic music sequences are diverse and neural networks have limitations, it is difficult to generate chords or melodies beyond those in the training data. As the generated sequence grows longer, the generator becomes more likely to produce the same note repeatedly, which destroys the coherence of the music. This paper therefore proposes a novel polyphonic music generation model that combines ideas from the Markov decision process (MDP) and Monte Carlo tree search (MCTS) with an improved Wasserstein generative adversarial network (WGAN). Through the zero-sum game and conditional constraints between the generator and the discriminator, the model comes closer to unconstrained music creation, and lengthening the sequence does not degrade coherence. Experimental results show that the proposed algorithm performs better on polyphonic music generation than the latest methods.
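The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of treating note generation as an MDP scored by a critic. The state is the sequence of notes so far, the action is the next note, and the value of an action is estimated by completing the sequence at random and averaging a critic score. Everything here is an assumption for illustration: `NOTES`, `toy_critic`, `rollout_value`, and `generate` are hypothetical names, the critic is a hand-written stand-in rather than a trained WGAN critic, and the search uses plain Monte Carlo rollouts rather than a full MCTS with tree statistics.

```python
import random

# Hypothetical action space: one octave of MIDI pitch numbers.
NOTES = list(range(60, 72))


def toy_critic(sequence):
    """Stand-in for a trained critic: a higher score means a better sequence.

    Assumption for illustration only: the score is the fraction of adjacent
    note pairs that differ, so sequences stuck on one note score low.
    """
    if len(sequence) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    return changes / (len(sequence) - 1)


def rollout_value(state, action, length, n_rollouts, rng):
    """Monte Carlo estimate of the value of taking `action` in `state`.

    The partial sequence is completed `n_rollouts` times with a uniform
    random policy, and the critic scores of the completions are averaged.
    """
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(state) + [action]
        while len(seq) < length:
            seq.append(rng.choice(NOTES))
        total += toy_critic(seq)
    return total / n_rollouts


def generate(length=16, n_rollouts=20, seed=0):
    """Greedy policy: at each step pick the note with the best rollout value."""
    rng = random.Random(seed)
    state = [rng.choice(NOTES)]
    while len(state) < length:
        best = max(
            NOTES,
            key=lambda a: rollout_value(state, a, length, n_rollouts, rng),
        )
        state.append(best)
    return state


if __name__ == "__main__":
    melody = generate()
    print(melody, toy_critic(melody))
```

Because each action is judged by the score of whole completed sequences rather than by the next note alone, a choice that repeats the previous note is penalized even early in a long sequence. That is the intuition behind pairing a sequential decision process with a critic, under the assumptions stated above.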

Acknowledgements

This work was supported by the Guangdong Provincial Key Platform and Major Scientific Research Projects under Grant 2018GXJK138.

Author information

Authors and Affiliations

Center for Research on Leading Technology of Special Equipment, School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, 510006, China
Wenkai Huang
School of Mechanical and Electrical Engineering, Guangzhou University, 510006, Guangzhou, China
Yihao Xue, Zefeng Xu & Guanglong Peng
Laboratory Center, Guangzhou University, Guangzhou, 510006, People’s Republic of China
Yu Wu

Contributions

Conceptualization, Huang, W.K. and Xue, Y.H.; methodology, Xue, Y.H. and Xu, Z.F.; software, Xue, Y.H.; validation, Huang, W.K. and Xue, Y.H.; formal analysis, Xue, Y.H. and Xu, Z.F.; investigation, Huang, W.K. and Xue, Y.H.; resources, Huang, W.K. and Xue, Y.H.; data curation, Huang, W.K. and Xue, Y.H.; writing (original draft preparation), Xue, Y.H. and Xu, Z.F.; writing (review and editing), Huang, W.K. and Xu, Z.F.; visualization, Xue, Y.H. and Peng, G.L.; supervision, Huang, W.K. and Xu, Z.F.; project administration, Huang, W.K. and Wu, Y.; funding acquisition, Huang, W.K. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Wenkai Huang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, W., Xue, Y., Xu, Z. et al. Polyphonic music generation generative adversarial network with Markov decision process. Multimed Tools Appl 81, 29865–29885 (2022). https://doi.org/10.1007/s11042-022-12925-w

Received: 07 December 2020
Revised: 17 February 2021
Accepted: 09 March 2022
Published: 05 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s11042-022-12925-w
