Popular music representation: chorus detection & emotion recognition

Multimedia Tools and Applications

Abstract

This paper proposes a strategy for representing popular music based on the emotion of a song. First, a piece of popular music is decomposed into chorus and verse segments by the proposed chorus detection algorithm. Three descriptive features (intensity, frequency band, and rhythm regularity) are then extracted from the structured segments for emotion detection. A hierarchical AdaBoost classifier recognizes the overall emotion of the piece, which is classified according to Thayer’s model into one of four emotions: happy, angry, depressed, or relaxed. Experiments on a database of 350 popular songs show that the average recall and precision of the proposed chorus detection are approximately 95% and 84%, respectively, and that the average precision of emotion detection is 92%. Additional tests on songs with cover versions in different lyrics and languages yield a precision of 90%. The proposed approaches have been tested and validated by the professional online music company KKBOX Inc. and show promising performance in identifying the emotions of a variety of popular music effectively and efficiently.
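The hierarchical, four-class decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the common two-axis reading of Thayer's model (an arousal axis and a valence axis), with the four emotions placed in the quadrants as happy (high arousal, positive valence), angry (high, negative), relaxed (low, positive), and depressed (low, negative). The feature scales, the 0.5 thresholds, and the choice of which feature drives which decision are hypothetical; simple threshold functions stand in for the trained AdaBoost classifiers.

```python
from typing import Callable, Dict

# Hypothetical feature vector: the keys follow the three features named in
# the abstract; the 0..1 scale is an assumption made for illustration.
Features = Dict[str, float]
BinaryClassifier = Callable[[Features], bool]


def make_hierarchical_classifier(
    is_high_arousal: BinaryClassifier,
    is_positive_when_high: BinaryClassifier,
    is_positive_when_low: BinaryClassifier,
) -> Callable[[Features], str]:
    """Compose three binary decisions into one four-class emotion label."""

    def classify(features: Features) -> str:
        # Stage 1: split on the arousal axis.
        if is_high_arousal(features):
            # Stage 2a: split the high-arousal half on the valence axis.
            return "happy" if is_positive_when_high(features) else "angry"
        # Stage 2b: split the low-arousal half on the valence axis.
        return "relaxed" if is_positive_when_low(features) else "depressed"

    return classify


# Threshold stand-ins (assumed). A real system would plug trained boosted
# classifiers into these three slots instead.
classify = make_hierarchical_classifier(
    is_high_arousal=lambda f: f["intensity"] > 0.5,
    is_positive_when_high=lambda f: f["rhythm_regularity"] > 0.5,
    is_positive_when_low=lambda f: f["rhythm_regularity"] > 0.5,
)

example = {"intensity": 0.8, "frequency_band": 0.4, "rhythm_regularity": 0.9}
print(classify(example))
```

Splitting the problem into one arousal decision followed by a valence decision per branch means each stage only has to solve a binary problem, which is one plausible reason to prefer a hierarchy over a single four-way classifier.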

Acknowledgment

This work was supported in part by KKBOX Inc. Our thanks to Wen-Hung Xu for executing the program on the test data in this work; his timely assistance is greatly appreciated.

Author information

Corresponding author

Correspondence to Chia-Yen Chen.

About this article

Cite this article

Yeh, CH., Tseng, WY., Chen, CY. et al. Popular music representation: chorus detection & emotion recognition. Multimed Tools Appl 73, 2103–2128 (2014). https://doi.org/10.1007/s11042-013-1687-2

  • DOI: https://doi.org/10.1007/s11042-013-1687-2
