Abstract
The modeling of artificial, human-level creativity is becoming increasingly achievable. In recent years, neural networks have been successfully applied to different tasks such as image and music generation, demonstrating their great potential for realizing computational creativity. The fuzzy definition of creativity, combined with the varying goals of the evaluated generative systems, however, makes subjective evaluation appear to be the only viable methodology. We review the evaluation of generative music systems and discuss the inherent challenges of their evaluation. Although subjective evaluation should always be the ultimate choice for the evaluation of creative results, researchers unfamiliar with rigorous subjective experiment design and without the resources to execute a large-scale experiment face challenges regarding the reliability, validity, and replicability of their results. In numerous studies, this leads to the reporting of insignificant and possibly irrelevant results and a lack of comparability with similar and previous generative systems. Therefore, we propose a set of simple, musically informed objective metrics that enable an objective and reproducible way of evaluating and comparing the output of generative music systems. We demonstrate the usefulness of the proposed metrics with several experiments on real-world data.
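As a minimal sketch of what such a musically informed objective metric might look like in practice (not the authors' reference implementation; the feature choice, sample data, and comparison measure here are illustrative assumptions), consider computing a 12-bin pitch-class histogram per generated sample and comparing its average against a reference set:

```python
# Illustrative sketch: a pitch-class histogram as a musically informed feature,
# summarized per set and compared in a simple, reproducible way.
import numpy as np

def pitch_class_histogram(midi_pitches):
    """Normalized 12-bin histogram of pitch classes (C, C#, ..., B)."""
    counts = np.bincount(np.asarray(midi_pitches) % 12, minlength=12)
    return counts / max(counts.sum(), 1)

def mean_and_std(feature_matrix):
    """Per-dimension (element-wise) mean and standard deviation of a feature set."""
    features = np.asarray(feature_matrix)
    return features.mean(axis=0), features.std(axis=0)

# Hypothetical data: each inner list is one sample's MIDI pitch sequence.
generated = [[60, 62, 64, 65, 67], [60, 60, 67, 69, 72]]
reference = [[62, 64, 66, 67, 69], [59, 62, 66, 69, 71]]

gen_features = np.array([pitch_class_histogram(p) for p in generated])
ref_features = np.array([pitch_class_histogram(p) for p in reference])

gen_mean, _ = mean_and_std(gen_features)
ref_mean, _ = mean_and_std(ref_features)

# A simple, reproducible comparison: distance between the two mean histograms.
print("L1 distance between mean pitch-class histograms:",
      np.abs(gen_mean - ref_mean).sum())
```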






Notes
The deviation here refers to an element-wise standard deviation, which retains the dimension of each feature.
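A minimal illustration of this distinction (assuming NumPy-style conventions; the numbers are hypothetical): for an N x D matrix of D-dimensional features, the element-wise standard deviation returns one value per feature dimension rather than a single scalar.

```python
# Element-wise (per-dimension) standard deviation vs. a single scalar value.
import numpy as np

# Three samples, each described by a three-dimensional feature vector.
features = np.array([[0.2, 0.5, 0.3],
                     [0.1, 0.6, 0.3],
                     [0.3, 0.4, 0.3]])

elementwise_std = features.std(axis=0)  # shape (3,): one value per dimension
scalar_std = features.std()             # single scalar over all entries

print(elementwise_std, scalar_std)
```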
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Yang, LC., Lerch, A. On the evaluation of generative models in music. Neural Comput & Applic 32, 4773–4784 (2020). https://doi.org/10.1007/s00521-018-3849-7