Abstract
Facial expression recognition is widely used in fields such as health care and intelligent robot systems. However, recognizing facial expressions in the wild remains very challenging due to pose variations, changing light intensity, occlusions and the ambiguity of human emotion. When the training samples cannot cover all of these conditions, classification errors arise easily. This paper therefore proposes a new heuristic objective function, grounded in domain knowledge, to better optimize deep neural networks for facial expression recognition. Specifically, we take the relationship between facial expressions and facial action units as the domain knowledge. By analyzing the confusion between expression categories and then enlarging the distance between easily confused categories, the proposed heuristic objective function guides the deep neural network to learn better features and thereby improves recognition accuracy. The experimental results verify the effectiveness, universality and superior performance of our method.
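The idea of enlarging the decision margin between easily confused expression categories can be illustrated with a minimal sketch. The code below is not the paper's exact formulation: the confusability matrix, the margin value and the function names are illustrative assumptions. It shows one plausible realization, a cross-entropy loss whose rival-class logits are raised in proportion to a prior confusability score, so the network must separate confusable categories by a wider margin.

```python
import numpy as np

NUM_CLASSES = 7  # basic expressions, e.g. anger, disgust, fear, happy, sad, surprise, neutral

# Illustrative prior (NOT from the paper): confusability[i, j] in [0, 1]
# encodes how easily class i is mistaken for class j, e.g. derived from
# shared facial action units. The diagonal stays zero.
confusability = np.zeros((NUM_CLASSES, NUM_CLASSES))
confusability[0, 1] = confusability[1, 0] = 0.8  # e.g. anger vs. disgust share AUs

def heuristic_ce_loss(logits, labels, margin=1.0):
    """Cross-entropy with confusability-weighted additive margins.

    For a sample of true class y, the logit of every rival class j is
    raised by margin * confusability[y, j] before the softmax, which
    forces a larger decision margin against easily confused categories.
    The true class is unaffected because the diagonal is zero.
    """
    logits = logits.astype(float).copy()
    for i, y in enumerate(labels):
        logits[i] += margin * confusability[y]  # penalize confusable rivals
    # numerically stable softmax cross-entropy
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# usage: four random samples, one per expression label 0..3
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, NUM_CLASSES))
labels = np.array([0, 1, 2, 3])
print(heuristic_ce_loss(logits, labels))
```

Setting `margin=0` recovers the plain cross-entropy loss, so the heuristic term can be viewed as an add-on penalty that only activates for class pairs the domain knowledge flags as confusable.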
Acknowledgements
This study was supported by National Natural Science Foundation of China (Grant Nos. 62006049, 62176095, 62172113 and 62072123), Guangdong Province Key Area R&D Plan Project (Grant No. 2020B1111120001), Guangzhou Science and Technology Planning Project (Grant No. 201803010088), Ministry of Education Humanities and Social Science project (Grant No. 18JDGC012).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, H., Xiao, X., Liu, X. et al. Heuristic objective for facial expression recognition. Vis Comput 39, 4709–4720 (2023). https://doi.org/10.1007/s00371-022-02619-7