Abstract
Facial expression recognition (FER) faces many challenges, such as face occlusion, head pose, and variation in illumination angle and intensity. Thanks to advances in deep learning and the availability of large FER datasets in recent years, most methods have achieved notable success. This paper addresses the problem that general classification models struggle to distinguish some easily confused expressions (such as anger and surprise). To this end, we make two contributions: (1) The model extracts weighted local key regions from the final feature maps as local information and fuses them with the global information for multi-task recognition. (2) A triplet loss function is used to make the intra-class feature distance significantly smaller than the inter-class feature distance, which enhances the discriminability of the features while fitting the sample distribution. Experiments confirm that combining the two contributions yields a further performance gain, and the results on the CK+ and FER2013 datasets demonstrate the superiority of the proposed method.
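The abstract does not give the exact form of the triplet loss, so the following is only a minimal sketch of the standard formulation (as popularized by FaceNet), not the authors' implementation. The function names, the use of squared Euclidean distance, and the margin value of 1.0 are assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor and positive share a class label; negative has a different one.
    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyperparameter).
    """
    d_pos = squared_distance(anchor, positive)  # intra-class distance
    d_neg = squared_distance(anchor, negative)  # inter-class distance
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D features (hypothetical values, not taken from the paper).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
easy_negative = [3.0, 0.0]  # already far from the anchor
hard_negative = [1.0, 0.0]  # as close to the anchor as the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)
```

In this toy case the easy negative contributes no loss, while the hard negative does. This is how the loss concentrates training on confusable pairs such as anger and surprise.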
Cite this article
Zhu, D., Tian, G., Zhu, L. et al. LKRNet: a dual-branch network based on local key regions for facial expression recognition. SIViP 15, 263–270 (2021). https://doi.org/10.1007/s11760-020-01753-w
DOI: https://doi.org/10.1007/s11760-020-01753-w