Abstract
Emotion is an important aspect of effective human communication, and hence facial emotion recognition (FER) has become essential in human–computer interaction systems. Many researchers have automated FER using machine learning and deep learning techniques, and among these approaches, models built on convolutional neural networks (CNNs) have achieved the highest recognition accuracies. Despite this strong performance, CNNs fail to encode orientation information, because the pooling operations they use for feature extraction discard vital spatial details. As a consequence, recognition performance degrades on facial images captured at different orientations. Capsule Networks (CapsNet) were developed to address these limitations of CNNs, namely the inability to encode orientation features and the long training time. CapsNet encodes features as eight-dimensional capsule vectors and, by using dynamic routing and a squashing non-linearity in place of pooling, mitigates the problem of rotational invariance. Hence, in this paper we propose a CapsNet-based model for FER to enhance recognition accuracy. However, the facial images used for training contain unwanted information that is not essential for FER, which delays convergence and increases the number of training iterations. We therefore incorporate face localization (FL) with CapsNet in our model to eliminate background noise and other irrelevant information from the facial images, enabling a more effective training process. The proposed FL-CapsNet is rigorously evaluated on the benchmark datasets JAFFE, CK+, and FER2013 to assess its generalization, and the results show that FL-CapsNet outperforms existing CapsNet-based FER models.
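As context for the abstract, the squashing non-linearity that CapsNet uses in place of pooling can be sketched as follows. This is a minimal NumPy illustration of the standard squash function from the dynamic-routing formulation, not the authors' implementation; the function name, batch shape, and epsilon are illustrative assumptions.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squashing non-linearity: shrinks short vectors toward zero and
    long vectors toward unit length, so a capsule vector's length can
    be interpreted as the probability that its feature is present."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)          # length factor in [0, 1)
    return scale * s / np.sqrt(sq_norm + eps)  # rescale, keep direction

# A batch of three 8-dimensional capsule outputs, as in the abstract.
caps = np.random.randn(3, 8)
v = squash(caps)
print(np.linalg.norm(v, axis=-1))  # every length lies in [0, 1)
```

Because the direction of each capsule vector is preserved and only its length is rescaled, orientation information survives this step, unlike max pooling, which discards the spatial arrangement of activations.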
Data availability
The datasets generated during and/or analyzed during the current study are publicly available.
Author information
Authors and Affiliations
Contributions
Sivaiah Bellamkonda: Conceptualization, Methodology, Software, Investigation, Writing - original draft, Writing - review & editing, Visualization. N P Gopalan: Conceptualization, Methodology, Writing - review & editing, Supervision, Resources. C Mala: Conceptualization, Methodology, Writing - review & editing, Supervision, Resources. Lavanya Settipalli: Methodology, Writing - review & editing, Resources, Visualization.
Corresponding author
Ethics declarations
Conflict of interest
There are no competing interests related to this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sivaiah, B., Gopalan, N.P., Mala, C. et al. FL-CapsNet: facial localization augmented capsule network for human emotion recognition. SIViP 17, 1705–1713 (2023). https://doi.org/10.1007/s11760-022-02381-2