A Dynamic Emotion Recognition System Based on Convolutional Feature Extraction and Recurrent Neural Network

Yin, Yida; Ayoub, Misbah; Abel, Andrew; Zhang, Haiyang

doi:10.1007/978-3-031-16078-3_8

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 543)

Included in the following conference series:

Proceedings of SAI Intelligent Systems Conference

Abstract

Over the past three decades, there has been sustained research activity in emotion recognition from faces, powered by the popularity of smart devices and the development of improved machine learning, resulting in the creation of recognition systems with high accuracy. While research has commonly focused on single images, recent research has also made use of dynamic video data. This paper presents CNN-RNN (Convolutional Neural Network - Recurrent Neural Network) based emotion recognition using videos from the ADFES database, with results presented in the arousal-valence space rather than as a discrete emotion label. In addition to traditional performance metrics, we design a new performance metric, PN accuracy, to distinguish between positive and negative emotions. We demonstrate improved performance with a smaller RNN than the initial pre-trained model, and report a peak accuracy of 0.58 and a peak PN accuracy of 0.76, which shows that our approach is very capable of distinguishing between positive and negative emotions. We also present a detailed analysis of system performance, using new valence-arousal domain temporal visualisations to show transitions in recognition over time, demonstrating the importance of context-based information in emotion recognition.
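The abstract names PN accuracy but does not define it. As a minimal sketch of one plausible reading, the Python below scores a prediction as correct when its valence has the same polarity as the label. The function name `pn_accuracy`, the use of the sign of valence as the polarity test, the zero threshold, and the sample values are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def pn_accuracy(true_valence: Sequence[float], pred_valence: Sequence[float]) -> float:
    """Fraction of samples whose predicted valence polarity matches the label.

    Assumption (not from the paper): polarity is the sign of the valence
    value, with values >= 0 treated as positive.
    """
    if len(true_valence) != len(pred_valence):
        raise ValueError("inputs must have the same length")
    if len(true_valence) == 0:
        raise ValueError("inputs must be non-empty")
    # Count samples where label and prediction fall on the same side of zero.
    matches = sum(
        (t >= 0) == (p >= 0) for t, p in zip(true_valence, pred_valence)
    )
    return matches / len(true_valence)


# Made-up values for illustration only; not data from the paper.
labels = [0.8, -0.6, 0.3, -0.2]
preds = [0.5, -0.1, -0.4, -0.9]
print(pn_accuracy(labels, preds))  # 3 of 4 polarities agree -> 0.75
```

Under this reading, a prediction that lands far from the labelled point in the arousal-valence space still counts as correct if it falls on the same side of the valence axis, which would explain why a polarity score can exceed an exact-match accuracy.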



Author information

Authors and Affiliations

School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, China
Yida Yin, Misbah Ayoub & Haiyang Zhang
Computer and Information Sciences, University of Strathclyde, Glasgow, Scotland
Andrew Abel

Corresponding author

Correspondence to Andrew Abel.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai

About this paper

Cite this paper

Yin, Y., Ayoub, M., Abel, A., Zhang, H. (2023). A Dynamic Emotion Recognition System Based on Convolutional Feature Extraction and Recurrent Neural Network. In: Arai, K. (eds) Intelligent Systems and Applications. IntelliSys 2022. Lecture Notes in Networks and Systems, vol 543. Springer, Cham. https://doi.org/10.1007/978-3-031-16078-3_8

DOI: https://doi.org/10.1007/978-3-031-16078-3_8
Published: 01 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16077-6
Online ISBN: 978-3-031-16078-3
eBook Packages: Intelligent Technologies and Robotics (R0)
