A Deep Attention Transformer Network for Pain Estimation with Facial Expression Video

Xu, Haochen; Liu, Manhua

doi:10.1007/978-3-030-86608-2_13


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12878)

Included in the following conference series:

Chinese Conference on Biometric Recognition

Abstract

Because pain often deforms the facial structure, the analysis of facial expressions has received considerable attention for automatic pain estimation in recent years. This study proposes a deep attention transformer network for pain estimation, called Pain Estimate Transformer (PET), which consists of two subnetworks: an image encoding subnetwork and a video transformer subnetwork. In the image encoding subnetwork, ResNet is combined with a bottleneck attention block to learn the features of facial images. In the transformer subnetwork, a transformer encoder captures the temporal relationships among frames. The spatial-temporal features are combined with a Multi-Layer Perceptron (MLP) for pain intensity regression. Experimental results on the UNBC-McMaster Shoulder Pain dataset show that the proposed PET achieves compelling performance for pain intensity estimation.
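
The data flow described in the abstract (per-frame features, temporal attention across frames, then an MLP regressor) can be illustrated with a small toy sketch. This is not the authors' implementation. It assumes a single attention head, mean pooling over frames, randomly initialised weights, and random vectors standing in for the ResNet and bottleneck attention features. All function and parameter names (`temporal_self_attention`, `mlp_head`, `estimate_pain`, `wq`, `wk`, `wv`, `w1`, `w2`) are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def temporal_self_attention(frames, wq, wk, wv):
    """Single-head scaled dot-product attention across frames."""
    q = [matvec(wq, f) for f in frames]
    k = [matvec(wk, f) for f in frames]
    v = [matvec(wv, f) for f in frames]
    d = len(q[0])
    outputs, weights = [], []
    for qi in q:
        # Similarity of this frame's query to every frame's key.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted sum of value vectors over all frames.
        outputs.append([sum(a * vj[c] for a, vj in zip(attn, v)) for c in range(d)])
    return outputs, weights


def mlp_head(x, w1, w2):
    """Two-layer MLP with ReLU that regresses one scalar."""
    hidden = [max(0.0, h) for h in matvec(w1, x)]
    return matvec(w2, hidden)[0]


def estimate_pain(frame_features, params):
    """Attend over frames, add a residual, mean-pool, then regress."""
    attended, weights = temporal_self_attention(
        frame_features, params["wq"], params["wk"], params["wv"]
    )
    mixed = [[a + f for a, f in zip(att, feat)]
             for att, feat in zip(attended, frame_features)]
    # Mean pooling over time is an assumption made for this sketch.
    pooled = [sum(col) / len(mixed) for col in zip(*mixed)]
    return mlp_head(pooled, params["w1"], params["w2"]), weights


rng = random.Random(0)
dim, hidden_dim, num_frames = 8, 16, 5
params = {
    "wq": random_matrix(dim, dim, rng),
    "wk": random_matrix(dim, dim, rng),
    "wv": random_matrix(dim, dim, rng),
    "w1": random_matrix(hidden_dim, dim, rng),
    "w2": random_matrix(1, hidden_dim, rng),
}
# Random vectors stand in for per-frame image-encoder features.
frame_features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_frames)]
score, weights = estimate_pain(frame_features, params)
print(f"untrained pain score: {score:.4f}")
```

Each row of `weights` shows how strongly one frame attends to every other frame in the clip. With untrained weights the printed score is meaningless; the sketch only shows how frame-level features could be mixed over time before regression.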

Acknowledgment

This study was supported in part by the National Natural Science Foundation of China (61773263), the Shanghai Jiao Tong University Scientific and Technological Innovation Funds (2019QYB02), and the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102).

Author information

Authors and Affiliations

Department of Instrument Science and Engineering, School of EIEE, Shanghai Jiao Tong University, Shanghai, 200240, China
Haochen Xu & Manhua Liu
The MoE Key Lab of Artificial Intelligence, Artificial Intelligence Institute, Shanghai Jiao Tong University, Shanghai, 200240, China
Haochen Xu & Manhua Liu

Corresponding author

Correspondence to Manhua Liu.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Jianjiang Feng
Fudan University, Shanghai, China
Junping Zhang
Shanghai Jiao Tong University, Shanghai, China
Manhua Liu
Shanghai University, Shanghai, China
Yuchun Fang

Cite this paper

Xu, H., Liu, M. (2021). A Deep Attention Transformer Network for Pain Estimation with Facial Expression Video. In: Feng, J., Zhang, J., Liu, M., Fang, Y. (eds) Biometric Recognition. CCBR 2021. Lecture Notes in Computer Science, vol 12878. Springer, Cham. https://doi.org/10.1007/978-3-030-86608-2_13

DOI: https://doi.org/10.1007/978-3-030-86608-2_13
Published: 08 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86607-5
Online ISBN: 978-3-030-86608-2
eBook Packages: Computer Science; Computer Science (R0)
