Facial Expression Recognition with Global Multiscale and Local Attention Network

Zheng, Shukai; Liu, Miao; Zheng, Ligang; Chen, Wenbin

doi:10.1007/978-3-031-50069-5_33

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14495)

Included in the following conference series:

Computer Graphics International Conference

Abstract

Facial expression recognition (FER) in the wild is a challenging classification task because of problems such as occlusion and pose variation. This paper proposes a global multiscale and local attention network (GL-VGG) based on the VGG structure, which consists of four modules: a VGG base module, a dropblock module, a global multiscale module, and a local attention module. The base module pre-extracts features. The dropblock module prevents overfitting in the convolutional layers. The global multiscale module learns features with different receptive fields in the global perception domain, which reduces the susceptibility of deeper convolutions to occlusion and pose variation. The local attention module guides the network to focus on rich local features, which reduces the interference of occlusion on FER in the wild. Experiments on two public in-the-wild FER datasets show that the GL-VGG approach outperforms the baseline and other state-of-the-art methods, with 88.33% on RAF-DB and 74.17% on FER2013.
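
The abstract names a dropblock module but does not describe its internals. The sketch below illustrates the general DropBlock idea (Ghiasi et al., 2018) on a single 2D feature map: contiguous square regions are zeroed instead of independent units, and the surviving activations are rescaled. It is a minimal illustration under stated assumptions, not the authors' implementation. The function names `dropblock_mask` and `apply_dropblock`, the parameter values, and the use of top-left corners as block seeds are all hypothetical choices made for this example.

```python
import random


def dropblock_mask(height, width, block_size, drop_prob, rng):
    """Build a height x width 0/1 mask in which dropped square blocks are 0.

    Assumption: block seeds are sampled only where a full block fits, and
    each seed is treated as the top-left corner of its block.
    """
    valid_h = height - block_size + 1
    valid_w = width - block_size + 1
    # Seed rate chosen so that roughly drop_prob of all units end up dropped
    # (the gamma formula from the DropBlock paper).
    gamma = (drop_prob / block_size ** 2) * (height * width) / (valid_h * valid_w)
    mask = [[1] * width for _ in range(height)]
    for top in range(valid_h):
        for left in range(valid_w):
            if rng.random() < gamma:
                # Zero the whole block anchored at this seed.
                for r in range(top, top + block_size):
                    for c in range(left, left + block_size):
                        mask[r][c] = 0
    return mask


def apply_dropblock(feature_map, mask):
    """Zero the masked units and rescale the rest to keep the mean activation."""
    kept = sum(sum(row) for row in mask)
    total = len(mask) * len(mask[0])
    scale = total / kept if kept else 0.0
    return [
        [value * m * scale for value, m in zip(f_row, m_row)]
        for f_row, m_row in zip(feature_map, mask)
    ]


# Illustrative usage on a constant 8 x 8 feature map (values are arbitrary).
rng = random.Random(0)
features = [[1.0] * 8 for _ in range(8)]
mask = dropblock_mask(8, 8, block_size=3, drop_prob=0.3, rng=rng)
regularized = apply_dropblock(features, mask)
```

Dropping whole blocks rather than isolated units matters for convolutional features because neighbouring activations are strongly correlated, so removing a single unit leaves most of its information available to the next layer. Such masking is applied only during training.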

Acknowledgements

This work is supported by the 2022 Guangzhou education scientific research project 202214086 (Research on evaluation of children’s development based on artificial intelligence technology) and by the Joint Project of University and City of the Guangzhou Science and Technology Bureau under Grant No. SL2022A03J00903.

Author information

Authors and Affiliations

School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou Higher Education Mega Center, Guangzhou, 510006, China
Shukai Zheng, Miao Liu, Ligang Zheng & Wenbin Chen

Corresponding author

Correspondence to Miao Liu.

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, Shanghai, China
Bin Sheng
Shanghai Jiao Tong University, Shanghai, China
Lei Bi
University of Sydney, Sydney, NSW, Australia
Jinman Kim
MIRALab-CUI, University of Geneve, Carouge, Geneve, Switzerland
Nadia Magnenat-Thalmann
Swiss Federal Institute of Technology, Lausanne, Switzerland
Daniel Thalmann

About this paper

Cite this paper

Zheng, S., Liu, M., Zheng, L., Chen, W. (2024). Facial Expression Recognition with Global Multiscale and Local Attention Network. In: Sheng, B., Bi, L., Kim, J., Magnenat-Thalmann, N., Thalmann, D. (eds) Advances in Computer Graphics. CGI 2023. Lecture Notes in Computer Science, vol 14495. Springer, Cham. https://doi.org/10.1007/978-3-031-50069-5_33

DOI: https://doi.org/10.1007/978-3-031-50069-5_33
Published: 20 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50068-8
Online ISBN: 978-3-031-50069-5
eBook Packages: Computer Science, Computer Science (R0)