
Bi-Modal Deep Boltzmann Machine Based Musical Emotion Classification

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2016 (ICANN 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9887)


Abstract

Music plays an important role in many people’s lives. When listening to music, we usually choose pieces that best suit our current mood. Automating this choice, however attractive, remains a challenge. To this end, approaches in the literature exploit different kinds of information (audio, visual, social, etc.) about individual music pieces. In this work, we study the task of classifying music into mood categories by integrating information from two domains: audio and semantic. We combine features extracted directly from audio with information about the corresponding tracks’ lyrics using a bi-modal Deep Boltzmann Machine architecture, and we show the effectiveness of this approach through empirical experiments on the largest music dataset publicly available for research and benchmarking purposes.
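The core fusion idea described above can be illustrated in miniature: each modality (audio features, lyrics features) feeds a shared hidden layer through its own weight matrix, and the joint hidden representation combines both. The sketch below is a rough illustration of this bi-modal fusion step with randomly initialised weights and toy dimensions, not the authors' implementation; all names and sizes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy dimensions (hypothetical; the paper does not specify these here).
n_audio, n_lyrics, n_hidden = 20, 30, 10

# One weight pathway per modality, plus a shared hidden bias.
W_audio = rng.normal(scale=0.1, size=(n_audio, n_hidden))
W_lyrics = rng.normal(scale=0.1, size=(n_lyrics, n_hidden))
b_hidden = np.zeros(n_hidden)

def joint_hidden(v_audio, v_lyrics):
    """Mean-field activation of the shared hidden layer that fuses
    both modalities, Boltzmann-machine style: each visible layer
    contributes through its own weights to the same hidden units."""
    return sigmoid(v_audio @ W_audio + v_lyrics @ W_lyrics + b_hidden)

# One toy example per modality.
v_a = rng.random(n_audio)   # stand-in for an audio feature vector
v_l = rng.random(n_lyrics)  # stand-in for a lyrics feature vector
h = joint_hidden(v_a, v_l)
print(h.shape)  # (10,)
```

In the full model, this joint representation would then be used as input to a mood classifier; the point of the sketch is only that both modalities project into one shared latent space.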



Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (No. 61332018), the National Department Public Benefit Research Foundation (No. 201510209), and the Fundamental Research Funds for the Central Universities.

Author information

Correspondence to Wenge Rong.


Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Huang, M., Rong, W., Arjannikov, T., Jiang, N., Xiong, Z. (2016). Bi-Modal Deep Boltzmann Machine Based Musical Emotion Classification. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. ICANN 2016. Lecture Notes in Computer Science, vol 9887. Springer, Cham. https://doi.org/10.1007/978-3-319-44781-0_24


  • DOI: https://doi.org/10.1007/978-3-319-44781-0_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44780-3

  • Online ISBN: 978-3-319-44781-0

  • eBook Packages: Computer Science (R0)
