Research article
DOI: 10.1145/3136755.3143010

Group emotion recognition in the wild by combining deep neural networks for facial expression classification and scene-context analysis

Asad Abbas and Stephan K. Chalup

Published: 03 November 2017

Abstract

This paper presents the implementation details of a proposed solution to the Emotion Recognition in the Wild 2017 Challenge, in the category of group-level emotion recognition. The objective of this sub-challenge is to classify a group's emotion as Positive, Neutral or Negative. Our proposed approach incorporates both image context and facial information extracted from an image for classification. We use Convolutional Neural Networks (CNNs) to predict facial emotions from detected faces present in an image. Predicted facial emotions are combined with scene-context information extracted by another CNN using fully connected neural network layers. Various techniques are explored by combining and training these two Deep Neural Network models in order to perform group-level emotion recognition. We evaluate our approach on the Group Affective Database 2.0 provided with the challenge. Experimental evaluations show promising performance improvements, resulting in approximately 37% improvement over the competition's baseline model on the validation dataset.
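The architecture the abstract describes (per-face emotion CNN, scene-context CNN, fusion through fully connected layers) can be sketched as follows. This is an illustrative PyTorch sketch, not the authors' implementation: the tiny stand-in CNNs, the 48×48 face crops, the feature dimensions, and the averaging of per-face predictions into a fixed-size descriptor are all assumptions made so the example runs end-to-end.

```python
import torch
import torch.nn as nn

class FaceEmotionCNN(nn.Module):
    """Stand-in for the per-face emotion classifier (hypothetical, untrained)."""
    def __init__(self, n_emotions: int = 7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(16 * 4 * 4, n_emotions)

    def forward(self, x):  # x: (n_faces, 1, 48, 48) grayscale face crops
        return self.head(self.features(x).flatten(1))  # per-face emotion logits

class SceneContextCNN(nn.Module):
    """Stand-in for the scene-context feature extractor (hypothetical)."""
    def __init__(self, feat_dim: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.head = nn.Linear(16 * 4 * 4, feat_dim)

    def forward(self, x):  # x: (1, 3, H, W) whole image
        return self.head(self.features(x).flatten(1))  # scene feature vector

class GroupEmotionFusion(nn.Module):
    """Fuse pooled face-emotion predictions with scene features via FC layers."""
    def __init__(self, n_emotions: int = 7, scene_dim: int = 32, n_classes: int = 3):
        super().__init__()
        self.face_cnn = FaceEmotionCNN(n_emotions)
        self.scene_cnn = SceneContextCNN(scene_dim)
        self.fuse = nn.Sequential(
            nn.Linear(n_emotions + scene_dim, 64), nn.ReLU(),
            nn.Linear(64, n_classes),  # Positive / Neutral / Negative
        )

    def forward(self, faces, scene):
        # Average per-face emotion probabilities so the face descriptor has a
        # fixed size regardless of how many faces were detected in the image.
        face_probs = self.face_cnn(faces).softmax(dim=1).mean(dim=0, keepdim=True)
        scene_feat = self.scene_cnn(scene)
        return self.fuse(torch.cat([face_probs, scene_feat], dim=1))

model = GroupEmotionFusion()
faces = torch.randn(5, 1, 48, 48)    # 5 detected face crops
scene = torch.randn(1, 3, 128, 128)  # the whole group image
logits = model(faces, scene)
print(logits.shape)  # torch.Size([1, 3])
```

Mean-pooling the face predictions is one simple way to handle a variable number of faces per image; the paper explores several ways of combining and jointly training the two networks.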




Published In

ICMI '17: Proceedings of the 19th ACM International Conference on Multimodal Interaction
November 2017, 676 pages
ISBN: 9781450355438
DOI: 10.1145/3136755

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

Affective Computing; Deep Neural Networks; Facial Emotion Recognition; Group Emotion Recognition

Conference

ICMI '17

Acceptance Rates

ICMI '17 paper acceptance rate: 65 of 149 submissions (44%)
Overall acceptance rate: 453 of 1,080 submissions (42%)


Cited By

• (2024) Implementing the Affective Mechanism for Group Emotion Recognition With a New Graph Convolutional Network Architecture. IEEE Transactions on Affective Computing 15(3), 1104–1115. DOI: 10.1109/TAFFC.2023.3320101
• (2023) EmotiW 2023: Emotion Recognition in the Wild Challenge. Proceedings of the 25th International Conference on Multimodal Interaction, 746–749. DOI: 10.1145/3577190.3616545
• (2023) A Self-Fusion Network Based on Contrastive Learning for Group Emotion Recognition. IEEE Transactions on Computational Social Systems 10(2), 458–469. DOI: 10.1109/TCSS.2022.3202249
• (2023) Audio-Visual Automatic Group Affect Analysis. IEEE Transactions on Affective Computing 14(2), 1056–1069. DOI: 10.1109/TAFFC.2021.3104170
• (2023) Automatic Emotion Recognition for Groups: A Review. IEEE Transactions on Affective Computing 14(1), 89–107. DOI: 10.1109/TAFFC.2021.3065726
• (2023) Cohesive Group Emotion Recognition using Deep Learning. 26th ACIS International Winter Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD-Winter), 264–269. DOI: 10.1109/SNPD-Winter57765.2023.10466291
• (2023) Cohesive Group Emotion Recognition using Deep Learning. IEEE/ACIS 8th International Conference on Big Data, Cloud Computing, and Data Science (BCD), 264–269. DOI: 10.1109/BCD57833.2023.10466291
• (2023) Social Event Context and Affect Prediction in Group Videos. 11th International Conference on Affective Computing and Intelligent Interaction Workshops and Demos (ACIIW), 1–8. DOI: 10.1109/ACIIW59127.2023.10388162
• (2023) A recent survey on perceived group sentiment analysis. Journal of Visual Communication and Image Representation 97, 103988. DOI: 10.1016/j.jvcir.2023.103988
• (2023) Facial Emotion Recognition in-the-Wild Using Deep Neural Networks: A Comprehensive Review. SN Computer Science 5(1). DOI: 10.1007/s42979-023-02423-7
