
Implicit video emotion tagging from audiences’ facial expression


Abstract

In this paper, we propose a novel implicit video emotion tagging approach that explores the relationships among a video's common emotions, subjects' individualized emotions, and subjects' outer facial expressions. First, head motion and face appearance features are extracted. Then, the spontaneous facial expressions of subjects are recognized by Bayesian networks. After that, the relationships among the outer facial expressions, the inner individualized emotions, and the video's common emotions are captured by another Bayesian network, which can be used to infer the emotional tags of videos. To validate the effectiveness of our approach, an emotion tagging experiment is conducted on the NVIE database. The experimental results show that head motion features improve the performance of both facial expression recognition and emotion tagging, and that the captured relations among the outer facial expressions, the inner individualized emotions, and the common emotions improve the performance of both common and individualized emotion tagging.
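The tagging step can be illustrated with a toy Bayesian network. The sketch below, built on the pgmpy library, is a hypothetical illustration rather than the authors' implementation: it assumes two emotion states instead of six, hand-set conditional probabilities instead of parameters learned from data, and an assumed chain structure (the video's common emotion influences the subject's individualized emotion, which in turn drives the outer facial expression). Tagging then reduces to querying the common-emotion node given the recognized expression.

```python
# Hypothetical sketch (not the authors' code) of implicit tagging with a
# Bayesian network, using pgmpy. Node names and all CPD values are
# illustrative assumptions; the paper uses six basic emotion categories
# and learns the relations from data.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Assumed chain: video's common emotion -> subject's individualized
# (felt) emotion -> subject's outer facial expression.
model = BayesianNetwork([
    ("Common", "Individual"),
    ("Individual", "Expression"),
])

model.add_cpds(
    # Prior over the video's common emotion (state 0 = "happy", 1 = "other").
    TabularCPD("Common", 2, [[0.5], [0.5]]),
    # P(Individual | Common): the felt emotion usually follows the video.
    TabularCPD("Individual", 2,
               [[0.8, 0.3],    # P(Individual = happy | Common = happy/other)
                [0.2, 0.7]],
               evidence=["Common"], evidence_card=[2]),
    # P(Expression | Individual): expressions mostly mirror the felt emotion.
    TabularCPD("Expression", 2,
               [[0.9, 0.2],    # P(Expression = happy | Individual = happy/other)
                [0.1, 0.8]],
               evidence=["Individual"], evidence_card=[2]),
)
assert model.check_model()

# Implicit tagging: observe the recognized expression (state 0 = "happy")
# and infer the posterior over the video's common emotion tag.
infer = VariableElimination(model)
print(infer.query(["Common"], evidence={"Expression": 0}))
```

With these toy numbers, observing a "happy" expression lifts the posterior on the "happy" common tag to about 0.65 from its 0.5 prior, which is the intuition behind inferring a video's tag from audience reactions.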




Acknowledgments

This work is supported by the NSFC (61175037, 61228304), the Special Innovation Project on Speech of Anhui Province (11010202192), a project from the Anhui Science and Technology Agency (1106c0805008), and the Fundamental Research Funds for the Central Universities.

Author information


Corresponding author

Correspondence to Shangfei Wang.

Appendix

A total of 32 playlists, each consisting of six basic-emotion video clips, were prepared for the subjects. The numbers of subjects assigned to the different playlists under the three illuminations are shown in Table 6.

Table 6 The number of subjects for each playlist under the three illuminations

Playlists 31 and 32 were reserved for supplementary experiments: when a subject's emotions were not elicited successfully, another experiment was conducted using these two playlists. Playlist 13 was used only as a complement to induce three emotions: disgust, anger, and happiness. Table 6 shows that 56 subjects used playlist 11; however, their last three emotions (disgust, anger, and happiness) were induced again using playlist 13. This was necessary because, in some of the early experiments, the playlists were too long to record in full with the camera; a supplementary experiment was therefore carried out for these subjects using a playlist containing the last three emotion-eliciting videos. The contents of these playlists are shown in Table 7. All of the video clips were segmented from movies or TV shows obtained from the internet. A brief description of each video's content is provided below.

  • Happy-1.flv:  A video containing several funny video snippets.

  • Happy-2.flv:  A police officer playing jokes on passers-by.

  • Happy-3.flv:  An American man playing jokes on passers-by with paper money attached to his foot. He pretends to be in a deep sleep to test who will take away the money.

  • Happy-4.flv:  An old woman playing tricks on a man.

  • Happy-5.flv:  A news vendor playing tricks on passers-by by hiding his head when people come to ask for help.

  • Happy-6.flv:  An American playing tricks on passers-by. He puts glue on a chair and waits for people to sit; when they stand up, their pants tear.

  • Happy-7.flv:  Two Americans playing tricks on a fitness instructor at a fitness club. They place a mirror in front of a wall; when someone poses in front of the mirror, they slowly push it down toward the person.

  • Disgust-1.wmv:  A video showing the process of creating a crocodile tattoo in Africa, which contains some scenes that may induce a feeling of disgust.

  • Disgust-2.flv:  A bloody cartoon movie containing some unsettling scenes.

  • Disgust-3.flv:  Nauseating film snippets containing some disturbing scenes.

  • Disgust-4:  A bloody cartoon movie that may cause discomfort.

  • Disgust-5.flv:  A disturbing video showing a man taking his heart out.

  • Disgust-6.flv:  A cartoon movie, Happy Tree Friends, containing many bloody and disgusting scenes.

  • Disgust-7.avi:  A puppet show containing some bloody and disgusting scenes.

  • Disgust-8.flv:  A video showing a man eating a large worm.

  • Disgust-9.flv:  A bloody cartoon movie that may cause discomfort.

  • Fear-1.flv:  A daughter scaring her father with a dreadful face.

  • Fear-2.flv:  A short video relating a ghost story about visiting a friend.

  • Fear-3.flv:  A video of a dreadful head appearing suddenly after two scenery images are displayed.

  • Fear-4.flv:  A video of a dreadful head appearing suddenly out of a calm scene.

  • Fear-5.flv:  A short video relating a ghost story that takes place in an elevator.

  • Fear-6.flv:  A short video relating a ghost story that takes place when visiting a friend.

  • Fear-7.flv:  A video of a dreadful head appearing suddenly in a messy room.

  • Surprise-1.flv:  A Chinese magician performing a surprising magic trick: passing through a wall without a door.

  • Surprise-2.flv:  A magician removing food from a picture of ads on the wall.

  • Surprise-3.flv:  A video of a magic trick performed on America’s Got Talent: a man is sawed with a chainsaw.

  • Surprise-4.flv:  A collection of amazing video snippets.

  • Surprise-5.flv:  A video clip showing amazing stunts, segmented from a TV show.

  • Surprise-6.flv:  A video of a man creating a world in an inconceivable way; the video appears to be a clip from a science-fiction film.

  • Surprise-7.avi:  A video showing an amazing motorcycle performance.

  • Sad-1.avi:  A video showing pictures of the China 512 Earthquake.

  • Sad-2.avi:  A video showing sad pictures of the China 512 Earthquake.

  • Sad-3.flv:  A video showing some heart-warming video snippets of the China 512 Earthquake.

  • Sad-4.flv:  A video showing 100 sad scenes of the China 512 Earthquake.

  • Sad-5.flv:  A video relating the facts of the Japanese invasion of China during the Second World War.

  • Sad-6.flv:  A video showing touching words spoken by children when the China 512 Earthquake occurred.

  • Sad-7.flv:  A video showing touching words spoken by Wen Jiabao, premier of China, when the China 512 Earthquake occurred.

  • Anger-1.flv:  A video of a brutal man killing his dog.

  • Anger-2.flv:  A video of students bullying their old geography teacher.

  • Anger-3.flv:  A video showing a disobedient son beating and scolding his mother in the street.

  • Anger-4.flv:  A video showing a Japanese massacre in Nanjing during the Second World War.

  • Anger-5.flv:  A clip from the film The Tokyo Trial in which Hideki Tojo is on trial.

Table 7 Content of the playlists

The subject numbers in each illumination directory in the released NVIE database are shown in Table 8.

Table 8 The subject numbers in each illumination directory


Cite this article

Wang, S., Liu, Z., Zhu, Y. et al. Implicit video emotion tagging from audiences’ facial expression. Multimed Tools Appl 74, 4679–4706 (2015). https://doi.org/10.1007/s11042-013-1830-0
