Abstract
Emotion recognition for music objects is a promising and important research issue in the field of music information retrieval. Music emotion recognition is usually formulated as a training/classification problem. However, even given a benchmark (training data with ground truth) and effective classification algorithms, music emotion recognition remains challenging. Most previous work focuses only on the acoustic content of music without considering individual differences (i.e., personalization issues). In addition, emotions are usually assessed by self-report (e.g., emotion tags), which may introduce inaccuracy and inconsistency. Electroencephalography (EEG) is a non-invasive brain-machine interface that allows external machines to sense neurophysiological signals from the brain without surgery. Such unintrusive EEG signals, captured from the central nervous system, have been utilized for exploring emotions. This paper proposes an evidence-based and personalized model for music emotion recognition. In the training phase, for model construction and personalized adaptation, we use the IADS (the International Affective Digitized Sound system, a set of acoustic emotional stimuli for experimental investigations of emotion and attention) to construct two generic predictive models, \(AN\!N_1\) (“EEG recordings of a standardized group vs. emotions”) and \(AN\!N_2\) (“music audio content vs. emotions”). Both models are trained as artificial neural networks. We then collect a subject’s EEG recordings while the subject listens to selected IADS samples, and apply \(AN\!N_1\) to determine the subject’s emotion vectors. From the generic model and the corresponding individual differences, we construct the personalized model by a projective transformation, represented by the matrix H. In the testing phase, given a music object, the processing steps are: (1) extract features from the music audio content; (2) apply \(AN\!N_2\) to compute a vector in the arousal-valence emotion space; and (3) apply the transformation matrix H to determine the personalized emotion vector. Moreover, for a music object of moderate length, we apply a sliding window to obtain a sequence of personalized emotion vectors; the predicted vectors are then fitted and organized as an emotion trail that reveals the dynamics of the music object’s affective content. Experimental results suggest that the proposed approach is effective.
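To make the personalization step concrete, the following is a minimal sketch in Python/NumPy under stated assumptions: H is taken to be a 3×3 planar homography on the arousal-valence plane, estimated by the direct linear transform (DLT) in the style of Hartley and Zisserman from correspondences between the generic model’s predictions and the subject’s EEG-derived emotion vectors. The function names, the DLT estimation routine, and the toy data are illustrative and should not be read as the authors’ published implementation.

```python
import numpy as np

def fit_homography(generic_pts, personal_pts):
    """Estimate a 3x3 projective transformation H that maps generic
    arousal-valence points to a subject's personalized points, using
    the direct linear transform (DLT); needs >= 4 correspondences in
    general position (no three collinear)."""
    A = []
    for (x, y), (u, v) in zip(generic_pts, personal_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The right singular vector of A with the smallest singular value
    # spans the null space and gives H up to scale.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize (assumes H[2,2] != 0)

def personalize(H, av):
    """Apply H to one (arousal, valence) vector via homogeneous coords."""
    p = H @ np.array([av[0], av[1], 1.0])
    return p[:2] / p[2]

def emotion_trail(H, generic_sequence):
    """Map a sliding-window sequence of generic emotion vectors (e.g. the
    per-window outputs of ANN2) to a personalized emotion trail."""
    return np.array([personalize(H, av) for av in generic_sequence])

# Toy usage: four correspondences between generic predictions and the
# subject's EEG-derived vectors (values are made up for illustration).
generic  = [(0.2, 0.8), (0.7, 0.6), (-0.5, 0.1), (0.3, -0.4)]
personal = [(0.3, 0.7), (0.8, 0.5), (-0.4, 0.2), (0.4, -0.3)]
H = fit_homography(generic, personal)
print(personalize(H, (0.1, 0.2)))
```

With H fixed, the same mapping is applied to every window of the sliding-window analysis, so the personalized emotion trail is obtained at no extra training cost per song.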
References
Bradley, M.M., Lang, P.J.: Measuring emotion: the self-assessment manikin and the semantic differential. J. Behav. Therapy Exp. Psychiatry 25(1), 49–59 (1994)
Grimm, M., Kroschel, K.: Evaluation of natural emotions using self assessment manikins. In: Proceedings of IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 381–385 (2005)
Ye, R.-C.: Exploring EEG spectral dynamics of music-induced emotions. Master’s thesis, Master Program of Sound and Music Innovative Technologies, National Chiao Tung University (2010)
Siegert, I., Böck, R., Vlasenko, B., Philippou-Hübner, D., Wendemuth, A.: Appropriate emotional labelling of non-acted speech using basic emotions, Geneva Emotion Wheel and Self Assessment Manikins. In: Proceedings of IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6 (2011)
Picard, R.W.: Affective Computing. The MIT Press, Cambridge, MA (1995)
Wu, D., Parsons, T.D., Mower, E., Narayanan, S.: Speech emotion estimation in 3D space. In: Proceedings of IEEE International Conference on Multimedia and Expo (ICME), pp. 737–742 (2010)
Yang, Y.-H., Lin, Y.-C., Su, Y.-F., Chen, H.H.: A regression approach to music emotion recognition. IEEE Trans. Audio Speech Lang. Process. 16(2), 448–457 (2008)
Li, H., Ren, F.: The study on text emotional orientation based on a three-dimensional emotion space model. In: Proceedings of International Conference on Natural Language Processing and Knowledge Engineering, pp. 1–6 (2009)
Sun, K., Yu, J., Huang, Y., Hu, X.: An improved valence-arousal emotion space for video affective content representation and recognition. In: Proceedings of IEEE International Conference on Multimedia and Expo, pp. 566–569 (2009)
Morishima, S., Harashima, H.: Emotion space for analysis and synthesis of facial expression. In: Proceedings of IEEE International Workshop on Robot and Human Communication, pp. 188–193 (1993)
Hsu, J.-L., Zhen, Y.-L., Lin, T.-C., Chiu, Y.-S.: Personalized music emotion recognition using electroencephalography (EEG). In: Proceedings of IEEE International Symposium on Multimedia (ISM), pp. 277–278 (2014). doi:10.1109/ISM.2014.19
Cabredo, R., Legaspi, R., Numao, M.: Identifying emotion segments in music by discovering motifs in physiological data. In: Proceedings of International Society for Music Information Retrieval (ISMIR), pp. 753–758 (2011)
Cabredo, R., Legaspi, R., Inventado, P.S., Numao, M.: An emotion model for music using brain waves. In: Proceedings of International Society for Music Information Retrieval (ISMIR), pp. 265–270 (2012)
Takahashi, K.: Remarks on computational emotion recognition from vital information. In: Proceedings of International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 299–304 (2009)
Tseng, K.C., Wang, Y.-T., Lin, B.-S., Hsieh, P.H.: Brain computer interface-based multimedia controller. In: Proceedings of International Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 277–280 (2012)
Wu, M.-H., Wang, C.-J., Yang, Y.-K., Wang, J.-S., Chung, P.-C.: Emotional quality level recognition based on HRV. In: Proceedings of International Joint Conference on Neural Networks (IJCNN), pp. 1–6 (2010)
Yang, Y., Zhou, J.: Recognition and analyses of EEG and ERP signals related to emotion: from the perspective of psychology. In: Proceedings of International Conference on Neural Interface and Control, pp. 96–99 (2005)
Lee, C.K., Yoo, S.K., Park, Y.J., Kim, N.H., Jeong, K.S., Lee, B.: Using neural network to recognize human emotions from heart rate variability and skin resistance. In: Proceedings of the 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (2005)
Sharma, T., Bhardwaj, S., Maringanti, H.B.: Emotion estimation using physiological signals. In: Proceedings of TENCON, pp. 1–5 (2008)
Lindquist, K.A., Wager, T.D., Kober, H., Bliss-Moreau, E., Barrett, L.F.: The brain basis of emotion: a meta-analytic review. Behav. Brain Sci. 35(3), 121–202 (2012)
Stemmler, G., Wacker, J.: Personality, emotion, and individual differences in physiological responses. Biol. Psychol. 84(3), 541–551 (2010). doi:10.1016/j.biopsycho.2009.09.012
MacKinnon, D.P., Lockwood, C.M., Hoffman, J.M., West, S.G., Sheets, V.: A comparison of methods to test mediation and other intervening variable effects. Psychol. Methods 7(1), 83–104 (2002)
McIntosh, A.R., Lobaugh, N.J.: Partial least squares analysis of neuroimaging data: applications and advances. Neuroimage 23(Suppl 1), 250–263 (2004). doi:10.1016/j.neuroimage.2004.07.020
Coan, J., Allen, J., McKnight, P.: A capability model of individual differences in frontal EEG asymmetry. Biol. Psychol. 72(2), 198–207 (2006). doi:10.1016/j.biopsycho.2005.10.003
Eerola, T., Vuoskoski, J.K.: A comparison of the discrete and dimensional models of emotion in music. Psychol. Music 39(1), 18–49 (2011)
Russell, J.A.: A circumplex model of affect. J. Person. Soc. Psychol. 39(6), 1161–1178 (1980)
Thayer, R.E.: The Biopsychology of Mood and Arousal. Oxford University Press, New York (1989)
Tsai, C.-G.: The Cognitive Psychology of Music. National Taiwan University Press, Taipei (2013). (in Chinese)
Hunter, P.G., Schellenberg, E.G., Schimmack, U.: Feelings and perceptions of happiness and sadness induced by music: similarities, differences, and mixed emotions. Psychol. Aesth. Creat. Arts 4(1), 47–56 (2010)
Koelstra, S., Mühl, C., Soleymani, M., Lee, J.-S., Yazdani, A., Ebrahimi, T., Pun, T., Nijholt, A., Patras, I.: DEAP: a database for emotion analysis using physiological signals. IEEE Trans. Affect. Comput. 3(1), 18–31 (2012)
Wu, W., Lee, J.: Improvement of HRV methodology for positive/negative emotion assessment. In: Proceedings of International ICST Conference on Collaborative Computing: Networking, Applications and Worksharing, pp. 1–6 (2009). doi:10.4108/ICST.COLLABORATECOM2009.8296
Oude Bos, D.: EEG-based emotion recognition—the influence of visual and auditory stimuli. In: Capita Selecta (MSc Course). University of Twente (2006)
Jenke, R., Peer, A., Buss, M.: Feature extraction and selection for emotion recognition from EEG. IEEE Trans. Affect. Comput. 5(3), 327–339 (2014). doi:10.1109/TAFFC.2014.2339834
Ishino, K., Hagiwara, M.: A feeling estimation system using a simple electroencephalograph. Proc. IEEE Int. Conf. Syst. Man Cybern. 5, 4204–4209 (2003)
Lin, Y.-P., Wang, C.-H., Wu, T.-L., Jeng, S.-K., Chen, J.-H.: Support vector machine for EEG signal classification during listening to emotional music. In: Proceedings of IEEE 10th Workshop on Multimedia Signal Processing, pp. 127–130 (2008). doi:10.1109/MMSP.2008.4665061
Lin, Y.-P., Wang, C.-H., Wu, T.-L., Jeng, S.-K., Chen, J.-H.: EEG-based emotion recognition in music listening: A comparison of schemes for multiclass support vector machine. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. ICASSP ’09, pp. 489–492. IEEE Computer Society, Washington, DC, USA (2009). doi:10.1109/ICASSP.2009.4959627
Lin, Y.-P., Wang, C.-H., Jung, T.-P., Wu, T.-L., Jeng, S.-K., Duann, J.-R., Chen, J.-H.: EEG-based emotion recognition in music listening. IEEE Trans. Biomed. Eng. 57(7), 1798–1806 (2010)
Wang, J.-C., Yang, Y.-H., Wang, H.-M., Jeng, S.-K.: Personalized music emotion recognition via model adaptation. In: Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1–7 (2012)
Wang, S., Zhu, Y., Wu, G., Ji, Q.: Hybrid video emotional tagging using users’ EEG and video content. Multimed. Tools Appl. 72(2), 1257–1283 (2014). doi:10.1007/s11042-013-1450-8
Yazdani, A., Skodras, E., Fakotakis, N., Ebrahimi, T.: Multimedia content analysis for emotional characterization of music video clips. EURASIP J. Image Video Process. (26) (2013). doi:10.1186/1687-5281-2013-26
Wu, Z.-A.: Diagnostic Studies in Neurology. Yang-Chih Book Co., Ltd., Taipei (1998). (in Chinese)
Levy, M., Sandler, M.: Music information retrieval using social tags and audio. IEEE Trans. Multimed. 11(3), 383–395 (2009)
Hsu, J.-L., Li, Y.-F.: A cross-modal method of labeling music tags. Multimed. Tools Appl. 58(3), 521–541 (2012). doi:10.1007/s11042-011-0729-x
Hsu, J.-L., Huang, C.-C.: Designing a graph-based framework to support a multi-modal approach for music information retrieval. Multimed. Tools Appl. 1–27 (2014). doi:10.1007/s11042-014-1860-2
Lartillot, O., Toiviainen, P., Eerola, T.: The MIRtoolbox. http://www.jyu.fi/hum/laitokset/musiikki/en/research/coe/materials/mirtoolbox. Accessed 16 Dec 2015
Lerch, A.: An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics. Wiley, Hoboken, New Jersey (2012)
Lerch, A.: Audio Content Analysis Library. http://www.audiocontentanalysis.org/code/. Accessed 16 Dec 2015
Bradley, M.M., Lang, P.J.: International affective digitized sounds (IADS): Stimuli, instruction manual and affective ratings (Tech. Rep. No. B-2). Technical report, The Center for Research in Psychophysiology, University of Florida, Gainesville, FL (1999)
Hartley, R., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, New York (2003)
Wong, T., Kovesi, P., Datta, A.: Projective transformations for image transition animations. In: Proceedings of International Conference on Image Analysis and Processing, pp. 493–500 (2007)
Hagan, M.T., Demuth, H.B., Beale, M.H.: Neural Network Design. PWS Publishing, Boston, MA (1996)
Delorme, A., Makeig, S.: EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics. J. Neurosci. Methods 134, 9–21 (2004)
Kahn, E., Kahn, J.: Harmony and Invention—Antonio Vivaldi: The Four Seasons. http://soaugusta.org/_composers/_bak/January21/. Accessed 25 Mar 2015
Acknowledgements
This research was supported by the National Science Council under Contract Nos. MOST-104-2221-E-030-015-MY2 and NSC-102-2218-E-030-002. The authors would like to thank Dr. Yuan-Pin Lin (Swartz Center for Computational Neuroscience, Institute for Neural Computation, University of California, San Diego) for sharing neuroscience insights related to emotion recognition, inspiring research ideas, and shaping our future directions.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Communicated by B. Prabhakaran.
Appendix
The Four Seasons is an example of program music (i.e., music with a narrative element). Each of the four sonnets is expressed in a concerto, which in turn is divided into three phrases or ideas, reflected in the three movements (fast-slow-fast) of each concerto. The arrangement is as follows:
- Vivaldi Concerto No. 1 in E major, Op. 8, RV 269, “La primavera” (Spring):
Setting the mood of the opening movement, the opening ritornello (recurrent phrase) is marked in the score “The spring has returned.” The first violin solo is marked “Song of the birds”; after a return of the ritornello comes a soft murmuring on the violins. After the next ritornello come the lightning and thunder, followed by an extensive return to the singing birds and gaiety.
The slow movement is a musical description of the snoozing goatherd, watched over by his dog, whose bark is imitated throughout the movement on the violas with repeated notes to be played “very loud and abruptly”.
The third movement, a rustic dance, opens with a suggestion of rustic bagpipes, complete with an imitation of their drones by sustained notes on the low strings [53].
- Vivaldi Concerto No. 4 in F minor, Op. 8, RV 297, “L’inverno” (Winter):
The strings, with trills in the violins, describe the shivering in the winter cold. Swift arpeggios and scales by the solo violin describe the whipping of the wind, while a series of abrupt chords suggest stamping feet and running to get warm. But rapid tremolos show that all this activity is useless, since the teeth continue to chatter. Violin pizzicati depict the falling raindrops, after which a warm melody on the solo violin describes the pleasant indoors with its roaring fire. The Finale opens with sliding phrases by the violin—walking and slipping on thin ice. The orchestra joins with a slower rhythm to indicate the hesitant steps and fear of falling. But then we are back indoors, enjoying the warmth while the winds howl outside [53].
Cite this article
Hsu, J.-L., Zhen, Y.-L., Lin, T.-C., et al.: Affective content analysis of music emotion through EEG. Multimedia Systems 24, 195–210 (2018). https://doi.org/10.1007/s00530-017-0542-0