Abstract
Since pain often causes deformations in the facial structure, the analysis of facial expressions has received considerable attention for automatic pain estimation in recent years. This study proposes a deep attention transformer network for pain estimation called Pain Estimate Transformer (PET), which consists of two subnetworks: an image encoding subnetwork and a video transformer subnetwork. In the image encoding subnetwork, ResNet is combined with a bottleneck attention block to learn the features of facial images. In the transformer subnetwork, a transformer encoder is used to capture the temporal relationships among frames. The spatial-temporal features are combined with a Multi-Layer Perceptron (MLP) for pain intensity regression. Experimental results on the UNBC-McMaster Shoulder Pain dataset show that the proposed PET achieves compelling performance for pain intensity estimation.
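The temporal stage described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each frame has already been reduced to a feature vector (random placeholders stand in for the ResNet and bottleneck attention features), it uses a single attention head with untrained random weights, and it omits the positional encoding, residual connections, layer normalization, and feed-forward sublayers of a full transformer encoder. All names (`self_attention`, `mlp_head`, `estimate_pain`) and all sizes are hypothetical.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights (assumption: a trained model would learn these)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def self_attention(frames, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention across frames.

    Returns the attended per-frame features and the T x T attention weights.
    """
    Q = [matvec(Wq, f) for f in frames]
    K = [matvec(Wk, f) for f in frames]
    V = [matvec(Wv, f) for f in frames]
    d = len(Q[0])
    outputs, weights = [], []
    for q in Q:
        # Similarity of this frame's query to every frame's key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of value vectors: each frame attends to all frames.
        outputs.append([sum(w[t] * V[t][j] for t in range(len(V))) for j in range(d)])
    return outputs, weights


def mlp_head(x, W1, W2):
    """One hidden ReLU layer followed by a linear output, giving a scalar."""
    hidden = [max(0.0, h) for h in matvec(W1, x)]
    return matvec(W2, hidden)[0]


def estimate_pain(frames, params):
    """Attend over frames, mean-pool over time, regress one intensity score."""
    attended, weights = self_attention(frames, params["Wq"], params["Wk"], params["Wv"])
    T = len(attended)
    pooled = [sum(f[j] for f in attended) / T for j in range(len(attended[0]))]
    return mlp_head(pooled, params["W1"], params["W2"]), weights


rng = random.Random(0)
T, d, hidden = 8, 16, 32  # hypothetical: 8 frames, 16-dim features, 32 hidden units
params = {
    "Wq": rand_matrix(d, d, rng),
    "Wk": rand_matrix(d, d, rng),
    "Wv": rand_matrix(d, d, rng),
    "W1": rand_matrix(hidden, d, rng),
    "W2": rand_matrix(1, hidden, rng),
}
# Placeholder frame features standing in for the image encoder's output.
frames = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(T)]
score, attn = estimate_pain(frames, params)
print("untrained intensity score:", score)
print("attention matrix shape:", len(attn), "x", len(attn[0]))
```

Each row of `attn` sums to 1 and shows how strongly one frame draws on every other frame. Because the weights are random, the printed score carries no meaning; the sketch only shows how per-frame features flow through temporal attention into a regression head.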
Acknowledgment
This study was supported in part by the National Natural Science Foundation of China (61773263), the Shanghai Jiao Tong University Scientific and Technological Innovation Funds (2019QYB02), and the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Xu, H., Liu, M. (2021). A Deep Attention Transformer Network for Pain Estimation with Facial Expression Video. In: Feng, J., Zhang, J., Liu, M., Fang, Y. (eds) Biometric Recognition. CCBR 2021. Lecture Notes in Computer Science, vol 12878. Springer, Cham. https://doi.org/10.1007/978-3-030-86608-2_13
DOI: https://doi.org/10.1007/978-3-030-86608-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86607-5
Online ISBN: 978-3-030-86608-2
eBook Packages: Computer Science, Computer Science (R0)