Your heart rate betrays you: multimodal learning with spatio-temporal fusion networks for micro-expression recognition

  • REGULAR PAPER
International Journal of Multimedia Information Retrieval

Abstract

Micro-expressions can reveal feelings that people are trying to hide. Most existing studies recognize micro-expressions using only the temporal or the spatial information in the image and neglect the image’s intrinsic features. To address this problem, we detect the subject’s heart rate in long micro-expression videos, extract the spatio-temporal features of the image through a spatio-temporal network, and extract the heart rate feature using a heart rate network. A multimodal learning method then combines the heart rate and spatio-temporal features to recognize micro-expressions. Experimental results on CASMEII, SAMM, and SMIC show that the proposed method achieves an unweighted F1-score of 0.8867 and an unweighted average recall of 0.8962. The spatio-temporal fusion network combined with heart rate information provides an essential reference for multimodal approaches to micro-expression recognition.
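The two metrics reported in the abstract give every emotion class equal weight, which matters because micro-expression datasets are typically class-imbalanced. The following is a minimal sketch of how they can be computed, assuming the usual definitions (unweighted F1-score as the mean of per-class F1, unweighted average recall as the mean of per-class recall). The function name `uf1_uar` and the toy labels are illustrative assumptions and do not come from the paper.

```python
def uf1_uar(y_true, y_pred):
    """Return (UF1, UAR): per-class F1 and recall averaged with equal class weight.

    Assumes the common definitions; classes are taken from the ground truth,
    so every class has at least one true sample.
    """
    classes = sorted(set(y_true))
    f1_scores, recalls = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
        recalls.append(tp / (tp + fn))
    return sum(f1_scores) / len(classes), sum(recalls) / len(classes)


# Toy labels (hypothetical): three classes with unequal sample counts.
y_true = ["neg", "neg", "neg", "neg", "pos", "pos", "sur"]
y_pred = ["neg", "neg", "neg", "pos", "pos", "sur", "sur"]
uf1, uar = uf1_uar(y_true, y_pred)
```

Because each class contributes equally to the average, a model that performs well only on the majority class scores poorly on both metrics.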

Data Availability

The data that support the findings of this study are available from the corresponding author, Ning He, upon reasonable request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61872042, 61972375, 62172045), the Key Project of Beijing Municipal Commission of Education (KZ201911417048), the Major Project of Technological Innovation 2030-“New Generation Artificial Intelligence” (2018AAA0100800), Premium Funding Project for Academic Human Resources Development in Beijing Union University (BPHR2020AZ01, BPH2020EZ01), and the Science and Technology Project of Beijing Municipal Commission of Education (KM202111417009, KM201811417005).

Author information

Corresponding author

Correspondence to Ning He.

Ethics declarations

Conflict of interest

No potential conflict of interest was reported by the authors.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, R., He, N., Liu, S. et al. Your heart rate betrays you: multimodal learning with spatio-temporal fusion networks for micro-expression recognition. Int J Multimed Info Retr 11, 553–566 (2022). https://doi.org/10.1007/s13735-022-00250-9

  • DOI: https://doi.org/10.1007/s13735-022-00250-9
