Abstract
Automatic emotion recognition has attracted considerable interest in recent years but remains a challenging task. A single modality rarely carries all the information needed to convey and perceive human emotions, and it is sometimes difficult to discriminate between several affective states from one modality alone. To resolve these ambiguities, we propose a deep learning-based decision-level fusion approach for Facial Textual Emotion Recognition (FTxER) that classifies emotions into discrete classes. Our approach combines a Deep Convolutional Neural Network (DCNN) with a Bidirectional Long Short-Term Memory (BiLSTM) network, using the latter to capture temporal correlations in the sequence of DCNN face features. Experiments on the CK+ dataset show that the FTxER model achieves a weighted-average F1-score of about \(79\%\).
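The abstract describes the architecture only at a high level. As a minimal sketch of how such a decision-level fusion could be wired up, assuming a Keras/TensorFlow implementation, 48×48 grayscale face frames, a token-based text branch, and simple averaging of the two softmax outputs (all of which are assumptions, not the authors' exact configuration), consider:

```python
# Hedged sketch of DCNN + BiLSTM decision-level fusion for face + text.
# Layer sizes, input shapes, and the averaging rule are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model

NUM_CLASSES = 7  # discrete emotion classes, as in CK+ (assumed)

# --- Face branch: a DCNN feature extractor applied per frame,
# --- then a BiLSTM over the frame sequence for temporal correlation.
frames = layers.Input(shape=(10, 48, 48, 1), name="face_frames")  # assumed shape
cnn = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
])
x = layers.TimeDistributed(cnn)(frames)        # one DCNN feature vector per frame
x = layers.Bidirectional(layers.LSTM(64))(x)   # temporal modeling of face features
face_probs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

# --- Text branch: embedding + BiLSTM over token ids (assumed design).
tokens = layers.Input(shape=(50,), dtype="int32", name="text_tokens")
t = layers.Embedding(input_dim=10000, output_dim=128)(tokens)
t = layers.Bidirectional(layers.LSTM(64))(t)
text_probs = layers.Dense(NUM_CLASSES, activation="softmax")(t)

# --- Decision-level fusion: combine the two class-probability vectors.
fused = layers.Average(name="fused_probs")([face_probs, text_probs])

model = Model(inputs=[frames, tokens], outputs=fused)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

In this sketch the `Average` layer is what makes the fusion decision-level: each branch first produces its own class-probability vector, and the final prediction combines those outputs rather than intermediate features.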
Cite this paper
Khediri, N., Ben Ammar, M., Kherallah, M. (2022). A New Deep Learning Fusion Approach for Emotion Recognition Based on Face and Text. In: Nguyen, N.T., Manolopoulos, Y., Chbeir, R., Kozierkiewicz, A., Trawiński, B. (eds.) Computational Collective Intelligence. ICCCI 2022. Lecture Notes in Computer Science, vol. 13501. Springer, Cham. https://doi.org/10.1007/978-3-031-16014-1_7