Abstract
Polyphonic music creation requires combining two or more independent melodies through technical treatment. However, because of the diversity of polyphonic music sequences and the limitations of neural networks, it is difficult to generate chords or melodies beyond the training data. As the music sequence grows longer, the generator becomes more likely to produce the same note repeatedly, which destroys the coherence of the music. This paper therefore proposes a novel polyphonic music creation model that combines the ideas of the Markov decision process (MDP) and Monte Carlo tree search (MCTS) and improves on Wasserstein Generative Adversarial Network (WGAN) theory. Through the zero-sum game and conditional constraints between the generator and the discriminator, the model comes closer to unconstrained music creation, and lengthening the music sequence does not degrade coherence. Experimental results show that the proposed algorithm outperforms the latest methods on polyphonic music generation.
Acknowledgements
This work was supported by the Guangdong Provincial Key Platform and Major Scientific Research Projects under Grant 2018GXJK138.
Author information
Contributions
Conceptualization, Huang, W.K. and Xue, Y.H.; methodology, Xue, Y.H. and Xu, Z.F.; software, Xue, Y.H.; validation, Huang, W.K. and Xue, Y.H.; formal analysis, Xue, Y.H. and Xu, Z.F.; investigation, Huang, W.K. and Xue, Y.H.; resources, Huang, W.K. and Xue, Y.H.; data curation, Huang, W.K. and Xue, Y.H.; writing (original draft preparation), Xue, Y.H. and Xu, Z.F.; writing (review and editing), Huang, W.K. and Xu, Z.F.; visualization, Xue, Y.H. and Peng, G.L.; supervision, Huang, W.K. and Xu, Z.F.; project administration, Huang, W.K. and Wu, Y.; funding acquisition, Huang, W.K. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
About this article
Cite this article
Huang, W., Xue, Y., Xu, Z. et al. Polyphonic music generation generative adversarial network with Markov decision process. Multimed Tools Appl 81, 29865–29885 (2022). https://doi.org/10.1007/s11042-022-12925-w
DOI: https://doi.org/10.1007/s11042-022-12925-w