Abstract
Digital multimedia can elicit a mixture of human emotions, yet most current emotional tagging research assigns a single emotion to each item, ignoring the phenomenon of multi-emotion coexistence. To address this problem, we propose a novel multi-emotion tagging approach that explicitly models the dependencies among emotions. First, several audio or visual features are extracted from the multimedia data. Second, four traditional multi-label learning methods (Binary Relevance, Random k-Labelsets, Binary Relevance k-Nearest Neighbours, and Multi-Label k-Nearest Neighbours) are used as classifiers to obtain measurements of the emotional tags. A Bayesian network is then automatically constructed to capture the relationships among emotional tags. Finally, the Bayesian network infers the data's multi-emotion tags by combining the measurements obtained from those traditional methods with the learned dependencies among emotions. Experiments on two multi-label media data sets demonstrate the superiority of our approach over existing methods.
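To make the first stage of the pipeline concrete, the sketch below shows Binary Relevance, the simplest of the four multi-label methods named above: one independent binary classifier is trained per emotion label, which is exactly the per-label independence that the paper's Bayesian network later corrects by modeling label dependencies. The nearest-centroid base classifier, the toy features, and the two emotion labels are hypothetical stand-ins, not the features or classifiers used in the paper.

```python
class CentroidClassifier:
    """Toy binary classifier: predict the class whose mean feature
    value (centroid) is closest to the sample's mean feature value."""

    def fit(self, X, y):
        pos = [sum(x) / len(x) for x, t in zip(X, y) if t == 1]
        neg = [sum(x) / len(x) for x, t in zip(X, y) if t == 0]
        # assumes both classes occur in the training labels
        self.pos_mean = sum(pos) / len(pos)
        self.neg_mean = sum(neg) / len(neg)
        return self

    def predict(self, x):
        m = sum(x) / len(x)
        return 1 if abs(m - self.pos_mean) <= abs(m - self.neg_mean) else 0


def binary_relevance_fit(X, Y, n_labels):
    """Binary Relevance: one independent classifier per label.
    Label dependencies are ignored at this stage."""
    models = []
    for j in range(n_labels):
        yj = [row[j] for row in Y]  # j-th column of the label matrix
        models.append(CentroidClassifier().fit(X, yj))
    return models


def binary_relevance_predict(models, x):
    """Per-label measurements; in the paper these would be refined
    by a Bayesian network over the emotion labels."""
    return [m.predict(x) for m in models]


# Toy data: 2 features per clip, 2 emotion labels (e.g. "sad", "calm")
# that happen to co-occur on low-energy clips.
X = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
Y = [[1, 1], [1, 1], [0, 0], [0, 0]]
models = binary_relevance_fit(X, Y, n_labels=2)
print(binary_relevance_predict(models, [0.15, 0.15]))  # -> [1, 1]
```

Because Binary Relevance treats each label in isolation, it can output combinations that rarely co-occur in real data; the paper's contribution is to post-process such per-label measurements with a Bayesian network learned over the emotion tags.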
Acknowledgements
This paper is supported by the NSFC (61175037, 61228304), the Special Innovation Project on Speech of Anhui Province (11010202192), a project from the Anhui Science and Technology Agency (1106c0805008), and the Fundamental Research Funds for the Central Universities. We also acknowledge partial support from the US National Science Foundation under grant #1205664.
Cite this article
Wang, S., Wang, Z. & Ji, Q. Multiple emotional tagging of multimedia data by exploiting dependencies among emotions. Multimed Tools Appl 74, 1863–1883 (2015). https://doi.org/10.1007/s11042-013-1722-3