
FL-CapsNet: facial localization augmented capsule network for human emotion recognition

  • Original Paper

Signal, Image and Video Processing

Abstract

Emotion is an important aspect of effective human communication, and facial emotion recognition (FER) has therefore become essential in human–computer interaction systems. Many researchers have automated FER using ML/DL techniques, and among these approaches, models built on convolutional neural networks (CNNs) have achieved the highest recognition accuracies. Despite this high performance, CNNs fail to encode orientation features, because the pooling operations used for feature extraction discard vital spatial information. As a result, performance degrades when recognizing emotions from facial images captured at varied orientations. Capsule Networks (CapsNet) were developed to address these shortcomings of CNNs, namely the inability to encode orientation features and the increased training time. CapsNet encodes features as eight-dimensional vectors and replaces pooling with dynamic routing and a squashing non-linearity, mitigating the rotational-invariance problem. Hence, in this paper we propose a CapsNet-based FER model to enhance recognition accuracy. However, the facial images used for training contain unwanted information that is not essential for FER, which delays convergence and requires more training iterations. We therefore incorporate face localization (FL) with CapsNet in our model to eliminate background noise and other unwanted information from the facial images, enabling an effective training process. The proposed FL-CapsNet is rigorously tested on benchmark datasets such as JAFFE, CK+, and FER2013 to evaluate its generalization, and the results show that FL-CapsNet outperforms existing CapsNet-based FER models.
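The squashing non-linearity mentioned in the abstract replaces pooling in CapsNet: it rescales each capsule's pose vector so that its length falls in [0, 1) while its direction is preserved, letting the vector's norm act as an existence probability. A minimal NumPy sketch of the standard squash function from the dynamic-routing formulation (Sabour et al., 2017) is shown below; the function name and the `eps` stabilizer are illustrative choices, not taken from this paper.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Capsule squashing non-linearity.

    Maps a capsule's raw pose vector s to
        v = (||s||^2 / (1 + ||s||^2)) * s / ||s||,
    shrinking short vectors toward zero and long vectors toward unit
    length while preserving their direction.
    """
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

# A long pose vector (norm 5) keeps its direction, but its output
# norm is pulled just below 1, so it reads as a high probability.
v = squash(np.array([3.0, 4.0]))
```

For the input `[3.0, 4.0]` the squared norm is 25, so the output is `(25/26) * [3, 4] / 5`, with norm about 0.96; a short vector such as `[0.1, 0.0]` would instead be squashed close to zero, which is how a capsule signals absence of its feature.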



Data availability

The datasets generated and/or analyzed during the current study are publicly available.


Author information

Authors and Affiliations

Authors

Contributions

Sivaiah Bellamkonda: Conceptualization, Methodology, Software, Investigation, Writing - original draft, Writing - review & editing, Visualization. N P Gopalan: Conceptualization, Methodology, Writing - review & editing, Supervision, Resources. C Mala: Conceptualization, Methodology, Writing - review & editing, Supervision, Resources. Lavanya Settipalli: Methodology, Writing - review & editing, Resources, Visualization.

Corresponding author

Correspondence to Bellamkonda Sivaiah.

Ethics declarations

Conflict of interest

The authors declare no competing interests related to this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sivaiah, B., Gopalan, N.P., Mala, C. et al. FL-CapsNet: facial localization augmented capsule network for human emotion recognition. SIViP 17, 1705–1713 (2023). https://doi.org/10.1007/s11760-022-02381-2

