A Deep Attention Transformer Network for Pain Estimation with Facial Expression Video

  • Conference paper
Biometric Recognition (CCBR 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12878)

Abstract

Since pain often causes deformations in the facial structure, the analysis of facial expressions has received considerable attention for automatic pain estimation in recent years. This study proposes a deep attention transformer network for pain estimation called Pain Estimate Transformer (PET), which consists of two subnetworks: an image encoding subnetwork and a video transformer subnetwork. In the image encoding subnetwork, ResNet is combined with a bottleneck attention block to learn the features of facial images. In the video transformer subnetwork, a transformer encoder is used to capture the temporal relationships among frames. The spatial-temporal features are combined with a Multi-Layer Perceptron (MLP) for pain intensity regression. Experimental results on the UNBC-McMaster Shoulder Pain dataset show that the proposed PET achieves compelling performance for pain intensity estimation.
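
The data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give layer counts, dimensions, or pooling strategy, so everything below is an assumption made for illustration. Random vectors stand in for the per-frame features that the ResNet and bottleneck attention encoder would produce, a single head of scaled dot-product self-attention stands in for the transformer encoder (no positional encoding, residual connections, or layer normalization), and mean pooling over frames is an assumed way to feed the MLP. All function names and sizes are hypothetical, and the weights are untrained.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention across frames.

    Returns the attended features and the T x T attention weights.
    """
    Q = [matvec(Wq, f) for f in frames]
    K = [matvec(Wk, f) for f in frames]
    V = [matvec(Wv, f) for f in frames]
    scale = math.sqrt(len(Q[0]))
    attended, weights = [], []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of value vectors: each frame mixes in the others.
        attended.append([sum(w * v[j] for w, v in zip(row, V))
                         for j in range(len(V[0]))])
    return attended, weights


def mlp_head(x, W1, b1, w2, b2):
    """One hidden ReLU layer followed by a scalar regression output."""
    hidden = [max(0.0, h + b) for h, b in zip(matvec(W1, x), b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def init_params(d_model, d_hidden, rng):
    """Random, untrained weights (assumption); a real model learns these."""
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "Wq": mat(d_model, d_model), "Wk": mat(d_model, d_model),
        "Wv": mat(d_model, d_model), "W1": mat(d_hidden, d_model),
        "b1": [0.0] * d_hidden,
        "w2": [rng.gauss(0.0, 0.1) for _ in range(d_hidden)],
        "b2": 0.0,
    }


def estimate_pain(frames, p):
    """Frame features -> temporal attention -> mean pool -> MLP score."""
    attended, weights = self_attention(frames, p["Wq"], p["Wk"], p["Wv"])
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    return mlp_head(pooled, p["W1"], p["b1"], p["w2"], p["b2"]), weights


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for image-encoder output: 5 frames, 8 features each.
    frames = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]
    score, weights = estimate_pain(frames, init_params(8, 4, rng))
    print("predicted intensity (untrained):", score)
    print("attention row for frame 0:", weights[0])
```

Each row of the returned attention matrix sums to one and indicates how strongly a given frame draws on every other frame in the clip, which is the sense in which an encoder of this kind captures temporal relationships. The printed score is meaningless until the weights are trained against pain intensity labels.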

Acknowledgment

This study was supported in part by the National Natural Science Foundation of China (61773263), the Shanghai Jiao Tong University Scientific and Technological Innovation Funds (2019QYB02), and the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102).

Author information

Corresponding author

Correspondence to Manhua Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Xu, H., Liu, M. (2021). A Deep Attention Transformer Network for Pain Estimation with Facial Expression Video. In: Feng, J., Zhang, J., Liu, M., Fang, Y. (eds) Biometric Recognition. CCBR 2021. Lecture Notes in Computer Science, vol 12878. Springer, Cham. https://doi.org/10.1007/978-3-030-86608-2_13

  • DOI: https://doi.org/10.1007/978-3-030-86608-2_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86607-5

  • Online ISBN: 978-3-030-86608-2

  • eBook Packages: Computer Science, Computer Science (R0)
