Abstract
Facial expression recognition (FER) in the wild is a challenging classification task because of problems such as occlusion and pose variation. This paper proposes a global multiscale and local attention network (GL-VGG) based on the VGG structure, which consists of four modules: a VGG base module, a dropblock module, a global multiscale module, and a local attention module. The base module pre-extracts features, and the dropblock module prevents overfitting in the convolutional layers. The global multiscale module learns features with different receptive fields in the global perception domain, which reduces the susceptibility of deeper convolutions to occlusion and pose variation. The local attention module guides the network to focus on locally rich features, which mitigates the interference of occlusion on FER in the wild. Experiments on two public in-the-wild FER datasets show that GL-VGG outperforms the baseline and other state-of-the-art methods, reaching 88.33% on RAF-DB and 74.17% on FER2013.
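To make the role of the dropblock module concrete, the following is a minimal sketch of the general DropBlock idea (Ghiasi et al., 2018) on a single 2-D feature map, written in plain Python. It is not the authors' implementation: the function names `dropblock_mask` and `apply_dropblock`, the top-left indexing of blocks, and all parameter values are assumptions made for illustration only.

```python
import random


def dropblock_mask(height, width, block_size, drop_prob, rng):
    """Return a 0/1 keep-mask in which square blocks are zeroed out.

    Hypothetical helper: blocks are indexed by their top-left corner,
    restricted to positions where the whole block fits in the map.
    """
    valid_h = height - block_size + 1
    valid_w = width - block_size + 1
    # Seed rate (gamma in the DropBlock paper), chosen so that roughly
    # `drop_prob` of all units end up dropped once seeds grow into blocks.
    gamma = (drop_prob / block_size ** 2) * (height * width) / (valid_h * valid_w)
    mask = [[1] * width for _ in range(height)]
    for top in range(valid_h):
        for left in range(valid_w):
            if rng.random() < gamma:
                # Zero the whole block_size x block_size region.
                for r in range(top, top + block_size):
                    for c in range(left, left + block_size):
                        mask[r][c] = 0
    return mask


def apply_dropblock(feature_map, mask):
    """Apply the mask and rescale so the expected activation sum is kept."""
    kept = sum(sum(row) for row in mask)
    total = len(mask) * len(mask[0])
    scale = total / kept if kept else 0.0
    return [
        [value * keep * scale for value, keep in zip(f_row, m_row)]
        for f_row, m_row in zip(feature_map, mask)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for repeatability
    mask = dropblock_mask(8, 8, block_size=3, drop_prob=0.3, rng=rng)
    ones = [[1.0] * 8 for _ in range(8)]
    dropped = apply_dropblock(ones, mask)
    for row in mask:
        print("".join(str(bit) for bit in row))
```

Dropping contiguous blocks rather than isolated units removes spatially correlated activations together, which is why this style of regularization is commonly applied to convolutional layers. The mask would be used only during training; at inference the feature map is passed through unchanged.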
Acknowledgements
This work is supported by the 2022 Guangzhou education scientific research project 202214086 (Research on evaluation of children’s development based on artificial intelligence technology) and by the Joint Project of University and City of the Guangzhou Science and Technology Bureau under Grant No. SL2022A03J00903.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zheng, S., Liu, M., Zheng, L., Chen, W. (2024). Facial Expression Recognition with Global Multiscale and Local Attention Network. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14495. Springer, Cham. https://doi.org/10.1007/978-3-031-50069-5_33
DOI: https://doi.org/10.1007/978-3-031-50069-5_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50068-8
Online ISBN: 978-3-031-50069-5
eBook Packages: Computer Science, Computer Science (R0)