
A Variant Model of TGAN for Music Generation

Published: 03 July 2020

ABSTRACT

Over the past five years, generative adversarial networks (GANs) have attracted growing interest, particularly for image generation. Because musical structure is inherently varied and difficult to predict, music generation is well suited to GANs. Numerous studies have applied temporal GANs to music generation; however, few have examined the relationships between melodies and chords, or the effect of the latent space on the time sequence.

In this paper, we propose a new method for implementing latent structure in GANs for music generation. The main innovations of the proposed model are a new discriminator that recognizes the time sequence of the music and a pretrained beat generator that improves the quality of patterned melodies and chords. Results indicate that the pretrained model improved the quality of the generated music.
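The two-stage pipeline described above can be illustrated with a toy sketch. This is not the authors' implementation; every name, shape, and value below (number of bars, steps per bar, pitch range, the discriminator's scoring rule) is an illustrative assumption made only to show how a pretrained beat generator can condition a melody generator before a temporal discriminator scores the whole sequence.

```python
import random

# Hypothetical dimensions, chosen only for illustration.
BARS = 4    # bars per generated sample
STEPS = 16  # time steps per bar

def beat_generator(z, bars=BARS, steps=STEPS):
    """Pretrained stage (assumed): map a latent seed z to a
    binary beat pattern, one on/off flag per time step."""
    rng = random.Random(z)
    return [[rng.random() < 0.5 for _ in range(steps)] for _ in range(bars)]

def melody_generator(z, beats):
    """Second stage (assumed): emit a MIDI-like pitch only where
    the beat pattern is active, so melody follows the beat."""
    rng = random.Random(z + 1)
    return [[rng.randint(60, 72) if on else None for on in bar]
            for bar in beats]

def temporal_discriminator(melody):
    """Placeholder for a discriminator that sees the whole time
    sequence at once; here it just scores note density in [0, 1]."""
    steps = [s for bar in melody for s in bar]
    return sum(s is not None for s in steps) / len(steps)

beats = beat_generator(42)
melody = melody_generator(42, beats)
score = temporal_discriminator(melody)
```

In a real model the three functions would be neural networks and the discriminator's score would drive adversarial training; the sketch only shows the data flow: latent seed to beat pattern, beat pattern to conditioned melody, full sequence to a single temporal score.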


Published in ASSE '20: Proceedings of the 2020 Asia Service Sciences and Software Engineering Conference, May 2020, 163 pages.
ISBN: 9781450377102
DOI: 10.1145/3399871

      Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



Qualifiers: research-article; refereed limited
