Analyzing the Perceptual Salience of Audio Features for Musical Emotion Recognition

  • Conference paper
From Sounds to Music and Emotions (CMMR 2012)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7900)

Included in the following conference series: International Symposium on Computer Music Modeling and Retrieval (CMMR)

Abstract

While the organization of music in terms of emotional affect is a natural process for humans, quantifying it empirically proves to be a very difficult task. Consequently, no acoustic feature (or combination thereof) has emerged as the optimal representation for musical emotion recognition. Due to the subjective nature of emotion, determining whether an acoustic feature domain is informative requires evaluation by human subjects. In this work, we perceptually evaluate two of the most commonly used features in music information retrieval: mel-frequency cepstral coefficients and chroma. Furthermore, to identify emotion-informative feature domains, we explore which musical features are most relevant in determining emotion perceptually, and which acoustic feature domains are most variant or invariant to those changes. Finally, given our collected perceptual data, we conduct an extensive computational experiment on emotion prediction accuracy across a large number of acoustic feature domains, investigating pairwise prediction both on a general corpus and on a corpus constrained to contain only specific musical feature transformations.
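
To ground the two feature domains named above, the sketch below extracts MFCC and chroma features from audio clips and trains a pairwise emotion classifier in the spirit of the computational experiment the abstract describes. This is a minimal illustration, not the authors' pipeline: it assumes librosa and scikit-learn, and the file names, ratings, and SVM settings are hypothetical placeholders.

```python
# Minimal sketch (assumptions: librosa for feature extraction,
# scikit-learn for the classifier; paths and ratings are hypothetical).
import numpy as np
import librosa
from sklearn.svm import SVC

def clip_features(path, n_mfcc=20):
    """Summarize a clip by the mean and std of its frame-level features."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, n_frames)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)        # (12, n_frames)
    frames = np.vstack([mfcc, chroma])
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

# Hypothetical corpus: (audio path, scalar emotion rating from listeners).
corpus = [("clip_a.wav", 0.8), ("clip_b.wav", 0.2), ("clip_c.wav", 0.5)]
feats = np.array([clip_features(path) for path, _ in corpus])
ratings = np.array([rating for _, rating in corpus])

# Pairwise prediction: from the difference of two clips' feature vectors,
# predict which clip received the higher emotion rating.
X, y_pair = [], []
for i in range(len(corpus)):
    for j in range(len(corpus)):
        if i != j:
            X.append(feats[i] - feats[j])
            y_pair.append(int(ratings[i] > ratings[j]))

clf = SVC(kernel="rbf").fit(np.array(X), np.array(y_pair))

# Which of clip_a and clip_c is predicted to receive the higher rating?
print(clf.predict([feats[0] - feats[2]]))
```

The pairwise framing sidesteps the problem of calibrating absolute emotion ratings across listeners: the model only has to decide which of two clips was rated higher, which matches how the abstract poses its prediction-accuracy comparison across feature domains.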




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schmidt, E.M., Prockup, M., Scott, J., Dolhansky, B., Morton, B.G., Kim, Y.E. (2013). Analyzing the Perceptual Salience of Audio Features for Musical Emotion Recognition. In: Aramaki, M., Barthet, M., Kronland-Martinet, R., Ystad, S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41248-6_15

  • DOI: https://doi.org/10.1007/978-3-642-41248-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41247-9

  • Online ISBN: 978-3-642-41248-6

  • eBook Packages: Computer Science, Computer Science (R0)
