Abstract
Digital multimedia can elicit a mixture of human emotions, yet most current emotional tagging research assigns a single emotion to each item, ignoring the phenomenon of multi-emotion coexistence. To address this problem, we propose a novel multi-emotion tagging approach that explicitly models the dependencies among emotions. First, several audio or visual features are extracted from the multimedia data. Second, four traditional multi-label learning methods (Binary Relevance, Random k-Labelsets, Binary Relevance k-Nearest Neighbours, and Multi-Label k-Nearest Neighbours) are used as classifiers to obtain measurements of the emotional tags. A Bayesian network is then automatically constructed to capture the relationships among emotional tags. Finally, the Bayesian network infers the data's multi-emotion tags by combining the measurements obtained from those traditional methods with the learned dependencies among emotions. Experiments on two multi-label media data sets demonstrate the superiority of our approach over existing methods.
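To make the first stage of the pipeline concrete, the sketch below shows Binary Relevance, the simplest of the four multi-label methods named above: one independent binary classifier is trained per emotion label, which is exactly the per-label independence that the paper's Bayesian network later corrects by modeling label dependencies. The nearest-centroid base classifier, the toy features, and the two emotion labels are hypothetical stand-ins, not the features or classifiers used in the paper.

```python
class CentroidClassifier:
    """Toy binary classifier: predict the class whose mean feature
    value (centroid) is closest to the sample's mean feature value."""

    def fit(self, X, y):
        pos = [sum(x) / len(x) for x, t in zip(X, y) if t == 1]
        neg = [sum(x) / len(x) for x, t in zip(X, y) if t == 0]
        # assumes both classes occur in the training labels
        self.pos_mean = sum(pos) / len(pos)
        self.neg_mean = sum(neg) / len(neg)
        return self

    def predict(self, x):
        m = sum(x) / len(x)
        return 1 if abs(m - self.pos_mean) <= abs(m - self.neg_mean) else 0


def binary_relevance_fit(X, Y, n_labels):
    """Binary Relevance: one independent classifier per label.
    Label dependencies are ignored at this stage."""
    models = []
    for j in range(n_labels):
        yj = [row[j] for row in Y]  # j-th column of the label matrix
        models.append(CentroidClassifier().fit(X, yj))
    return models


def binary_relevance_predict(models, x):
    """Per-label measurements; in the paper these would be refined
    by a Bayesian network over the emotion labels."""
    return [m.predict(x) for m in models]


# Toy data: 2 features per clip, 2 emotion labels (e.g. "sad", "calm")
# that happen to co-occur on low-energy clips.
X = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
Y = [[1, 1], [1, 1], [0, 0], [0, 0]]
models = binary_relevance_fit(X, Y, n_labels=2)
print(binary_relevance_predict(models, [0.15, 0.15]))  # -> [1, 1]
```

Because Binary Relevance treats each label in isolation, it can output combinations that rarely co-occur in real data; the paper's contribution is to post-process such per-label measurements with a Bayesian network learned over the emotion tags.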
Acknowledgements
This paper is supported by the NSFC (61175037, 61228304), the Special Innovation Project on Speech of Anhui Province (11010202192), a project from the Anhui Science and Technology Agency (1106c0805008), and the Fundamental Research Funds for the Central Universities. We also acknowledge partial support from the US National Science Foundation under grant #1205664.
Cite this article
Wang, S., Wang, Z. & Ji, Q. Multiple emotional tagging of multimedia data by exploiting dependencies among emotions. Multimed Tools Appl 74, 1863–1883 (2015). https://doi.org/10.1007/s11042-013-1722-3