Robustness comparison between the capsule network and the convolutional network for facial expression recognition

Applied Intelligence

Abstract

As an important part of human-computer interaction, facial expression recognition has become a popular research topic in computer vision, pattern recognition, artificial intelligence and other fields. With the development of deep learning and convolutional neural networks, research on facial expression recognition has made considerable progress. Because facial images captured in real environments vary in rotation, shifting, brightness, partial occlusion and noise of different intensities, the robustness of facial expression recognition is an important research question. A capsule network consists of capsules, which are groups of neurons, and these capsules can learn posture information through the dynamic routing mechanism. The length of a capsule represents the probability that an entity exists, and each neuron in a capsule represents posture information (e.g., position, size, orientation or a combination of these properties). In this study, the robustness of the emerging capsule network (CapsNet) is therefore comprehensively compared with that of the traditional convolutional neural network (CNN) and the fully convolutional network (FCN) in facial expression recognition tasks. The simulation results based on the Extended Cohn-Kanade (CK+) database show that the capsule network is more robust than the other networks. The capsule network therefore has significant advantages over the other networks for facial expression recognition tasks in complex real-world environments.
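
To make the capsule description above concrete, the following is a minimal NumPy sketch of the squashing nonlinearity and routing-by-agreement loop in the style of Sabour et al. (Dynamic routing between capsules). It is an illustration under stated assumptions, not the implementation evaluated in this article: the function names, the three routing iterations, the capsule counts and the 16-dimensional capsules are all hypothetical choices made for the example.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Rescale a vector so its length lies in [0, 1) and its direction is kept."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def dynamic_routing(u_hat, num_iters=3):
    """Route prediction vectors u_hat of shape (num_in, num_out, dim).

    Returns the output capsules, shape (num_out, dim).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        # Each input capsule distributes its vote over the output capsules.
        c = softmax(b, axis=1)
        # Weighted sum of predictions for each output capsule.
        s = np.sum(c[:, :, None] * u_hat, axis=0)
        v = squash(s)
        # Increase logits where a prediction agrees with the output capsule.
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)
    return v

# Hypothetical sizes: 8 input capsules, 7 expression classes, 16-D capsules.
rng = np.random.default_rng(0)
u_hat = rng.normal(size=(8, 7, 16))
v = dynamic_routing(u_hat)

# The length of each output capsule is read as an existence probability.
probs = np.linalg.norm(v, axis=-1)
predicted_class = int(np.argmax(probs))
```

Because `squash` bounds every capsule length below 1, `probs` can be read directly as per-class existence scores, while the direction of each row of `v` carries the posture information.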

Acknowledgments

We thank all the subjects and experimenters who participated in the experiment. This work was supported in part by the National Natural Science Foundation of China (no. 61472330 and no. 61872301).

Author information

Corresponding author

Correspondence to Guangyuan Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Li, D., Zhao, X., Yuan, G. et al. Robustness comparison between the capsule network and the convolutional network for facial expression recognition. Appl Intell 51, 2269–2278 (2021). https://doi.org/10.1007/s10489-020-01895-x

  • DOI: https://doi.org/10.1007/s10489-020-01895-x
