Abstract
This paper proposes a popular music representation strategy based on the song’s emotion. First, a piece of popular music is decomposed into chorus and verse segments by the proposed chorus detection algorithm. Three descriptive features (intensity, frequency band and rhythm regularity) are then extracted from the structured segments for emotion detection. A hierarchical Adaboost classifier is employed to recognize the emotion of the piece. Following Thayer’s model, the general emotion of the music is classified into four categories: happy, angry, depressed and relaxed. Experiments conducted on a database of 350 popular songs show that the average recall and precision of the proposed chorus detection are approximately 95 % and 84 %, respectively, and that the average precision of emotion detection is 92 %. Additional tests on songs with cover versions in different lyrics and languages yield a precision of 90 %. The proposed approaches have been tested and proven by the professional online music company KKBOX Inc. and show promising performance for effectively and efficiently identifying the emotions of a variety of popular music.
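The hierarchical decision over Thayer's four emotion classes can be illustrated with a minimal sketch. The abstract does not specify how the hierarchy is organized, so the code below assumes a two-level scheme (an arousal decision first, then a valence decision) and uses fixed thresholds as stand-ins for the trained Adaboost classifiers. The class `SegmentFeatures`, the field `band_brightness`, the pairing of features with levels, and every threshold are hypothetical choices made for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class SegmentFeatures:
    """Hypothetical per-segment features, each normalized to [0, 1]."""
    intensity: float          # loudness of the segment
    band_brightness: float    # assumed summary of the frequency-band feature
    rhythm_regularity: float  # how regular the beat pattern is


def is_high_arousal(f: SegmentFeatures, threshold: float = 0.5) -> bool:
    # Level 1 (assumption): intensity alone separates high from low arousal.
    # A trained boosted classifier would replace this fixed threshold.
    return f.intensity >= threshold


def is_positive_valence(f: SegmentFeatures, threshold: float = 0.5) -> bool:
    # Level 2 (assumption): an equal-weight mix of the remaining two features
    # separates positive from negative valence.
    score = 0.5 * f.band_brightness + 0.5 * f.rhythm_regularity
    return score >= threshold


def classify_emotion(f: SegmentFeatures) -> str:
    """Map a segment to one of the four emotion classes named in the abstract."""
    if is_high_arousal(f):
        return "happy" if is_positive_valence(f) else "angry"
    return "relaxed" if is_positive_valence(f) else "depressed"
```

For example, `classify_emotion(SegmentFeatures(0.9, 0.8, 0.9))` falls in the high-arousal, positive-valence quadrant and is labeled `"happy"` under these assumed thresholds.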
References
Ahalt SC, Krishnamurty AK, Chen P, Melton DE (1990) Competitive learning algorithms for vector quantization. Neural Netw 3:277–291
Bartsch MA, Wakefield GH (2001) To catch a chorus: using chroma-based representations for audio thumbnailing. In: Proc IEEE workshop on the Appl of Signal Process to Audio and Acoust, pp 15–18
Blackburn S, De Roure D (1998) A tool for content based navigation of music. In: Proc the 6th ACM Multimed, pp 361–368
Bolte CE (1984) Secrets of successful song writing. Arco Publishing, New York
Cai R, Zhang C, Zhang L, Ma W (2007) Scalable music recommendation by search. In: Proc the 15th Int Conf on Multimed, pp 1065–1074
Casey MA, Veltkamp R, Goto M, Leman M, Rhodes C, Slaney M (2008) Information retrieval: current directions and future challenges. IEEE Trans Proc IEEE 96(4):668–696
Chang CY, Wu CK, Lo CY, Wang CJ, Chung PC (2011) Music emotion recognition with consideration of personal preference. In: Proc IEEE Int Conf on Multidimensional Systems, pp 1–4
Cheng HT, Yang YH, Lin YC, Chen HH (2009) Multimodal structure segmentation and analysis of music using audio and textual information. In: Proc IEEE International Symposium on Circuits and Systems, pp 1677–1680
Chin YH, Lin CH, Siahaan E, Wang IC, Wang JC (2013) Music emotion classification using double-layer support vector machines. In: Proc IEEE Int Conf on Orange Technologies, pp 193–196
Cooper M, Foote J (2001) Scene boundary detection via video self-similarity analysis. In: Proc Int Conf on Image Process, pp 378–381
De León PP, Iñesta J (2007) Pattern recognition approach for music style identification using shallow statistical descriptors. IEEE Trans Syst Man Cybern Part C Appl Rev 37(2):248–257
Deng JD, Simmermacher C, Cranefield S (2008) A study on feature analysis for musical instrument classification. IEEE Trans Syst Man Cybern B Cybern 38(2):429–438
Foote J (1999) Visualizing music and audio using self-similarity. In: Proc ACM Multimed, pp 77–80
Freund Y, Schapire RE (1997) A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci 55:119–139
Fujihara H, Goto M, Kitahara T, Okuno HG (2009) A modeling of singing voice robust to accompaniment sounds and its application to singer identification and vocal-timbre-similarity-based music information retrieval. IEEE Trans Audio Speech Lang Process 18(3):638–648
Goto M (2006) A chorus section detection method for musical audio signals and its application to a music listening station. IEEE Trans Audio Speech Lang Process 14(5):1783–1794
Grossberg S (1987) Competitive learning: from iterative activation to adaptive resonance. Cogn Sci 11:23–63
Hecht-Nielsen R (1987) Counter propagation networks. Appl Optics 26:4979–4984
Islam M, Lee H, Paul A, Baek J (2007) Content-based music retrieval using beat information. In: Proc of Int Conf on Fuzzy Syst and Knowl Discov (FSKD), pp 317–321
Juslin PN, Sloboda JA (2001) Music and emotion: Theory and research. Oxford University Press, New York
Kim YE, Schmidt EM, Migneco R, Morton BG, Richardson P, Scott J, Speck JA, Turnbull D (2010) Music emotion recognition: a state of the art review. In: Proc of International Society for Music Information Retrieval, pp 255–266
Korhonen MD, Clausi DA, Jernigan ME (2006) Modeling emotional content of music using system identification. IEEE Trans Syst Man Cybern B Cybern 36(8):588–599
Kosugi N, Nishihara Y, Kon’ya S, Yamamuro M, Kushima K (1999) Music retrieval by humming. In: Proc Pac Rim Conf on Commun, Comput and Signal Process, pp 404–407
Kosugi N, Nishihara Y, Sakata T, Yamamuro M, Kushima K (2000) A practical query-by humming system for a large music database. In: Proc the 8th ACM, pp 333–342
Kuo F, Shan M (2009) Music retrieval by melody style. In: Proc Int Symp on Multimed, pp 613–618
Lee SH, Yeh CH, Kuo CCJ (2004) Automatic movie skimming with story units via general tempo analysis. In: Proc SPIE Electron Image Storage and Retr Methods and Appl for Multimed, pp 396–407
Li G, An C, Pang J, Tan M, Tu X (2004) Color image adaptive clustering segmentation. In: Proc Third Int Conf on Image and Graphics, pp 104–107
Li Y, Lee SH, Yeh CH, Kuo CCJ (2006) Techniques for movie content analysis and skimming. IEEE Signal Process Mag 23(2):79–89
Lowrance R, Wagner RA (1975) An extension of the string-to-string correction problem. JACM 22(2):177–183
Lu L, Liu D, Zhang HJ (2005) Automatic mood detection and tracking of music audio signals. IEEE Trans Audio Speech Lang Process 14(1):5–18
Maddage NC, Xu C, Kankanhalli MS, Shao X (2004) Content-based music structure analysis with applications to music semantics understanding. In: Proc ACM Multimedia, pp 112–119
Mathew C, Foote J (2002) Automatic music summarization via similarity analysis. In: Proc of Int Conf on Music Inf Retr, pp 81–85
Matthew C, Foote J (2003) Summarizing popular music via structural similarity analysis. In: Proc IEEE workshop on the Appl of Signal Process to Audio and Acoust, pp 127–130
McNab R, Smith L, Witten I, Henderson C, Cunningham S (1996) Towards the digital music library: tune retrieval from acoustic input. In: Proc ACM Digit Libr’96, pp 11–18
MIREX2009. http://www.music-ir.org/mirex/wiki/2009:Structural_Segmentation
Mulder T, Martens J, Pauws S, Vignoli F, Lesaffre M, Lenman M, Baets B, Meyer H (2006) Factors affecting music retrieval in query by melody. IEEE Trans Multimedia 8(4):728–739
Negus K (1996) Popular music in theory: An introduction. University Press of New England, New Hampshire
Rumelhart DE, Zipser D (1985) Feature discovery by competitive learning. Cogn Sci 9:75–112
Schutz A, Slock D (2009) Periodic signal modeling for the octave problem in music transcription. In: Proc Digit Signal Process, pp 1–6
Shiu Y, Jeong H, Kuo CCJ (2006) Similar segment detection for music structure analysis via Viterbi algorithm. In: Proc IEEE Int Conf on Multimed and Expo, pp 789–792
Shuker R (2007) Understanding popular music culture. Routledge, New York
Stein DJ (2005) Engaging music: essay in music analysis. Oxford University Press, New York
Thayer R (1989) The biopsychology of mood and arousal. Oxford University Press, New York
Tsai T, Hung J (2006) Content-based retrieval of mp3 songs for one singer using quantization tree indexing and melody-line tracking method. In: Proc the Int Conf on Acoust, Speech and Signal Process, pp 505–508
Tzacheva AA, Bell KJ (2010) Music information retrieval with temporal features and timbre. Springer Act Media Technol 6335:212–219
Xu L, Krzyzak A (1993) Rival penalized competitive learning for clustering analysis, RBF Net, and curve detection. IEEE Trans Neural Netw 4(4):636–649
Yang YH, Chen HH (2012) Machine recognition of music emotion: a review. ACM Transactions on Intelligent Systems and Technology (TIST) 3(3)
Yeh CH, Lin HH, Chang HT (2009) An efficient emotion detection scheme for popular music. In: Proc IEEE Int Symp on Circuits & Syst, pp 1799–1802
Yeh CH, Lin YD, Lee MS, Tseng WY (2010) Popular music analysis: chorus and emotion detection. In: Proc APSIPA ASC 2010
Zhu Y, Xu C, Kankanhalli M (2003) Melody curve processing for music retrieval. In: Proc Int Conf on Multimed and Expo, pp 285–288
Acknowledgment
This work was supported in part by KKBOX Inc. Our thanks to Wen-Hung Xu for executing the program on the test data in this work; his timely assistance is greatly appreciated.
About this article
Cite this article
Yeh, CH., Tseng, WY., Chen, CY. et al. Popular music representation: chorus detection & emotion recognition. Multimed Tools Appl 73, 2103–2128 (2014). https://doi.org/10.1007/s11042-013-1687-2
DOI: https://doi.org/10.1007/s11042-013-1687-2