Spontaneous Expression Recognition Based on Visual Attention Mechanism and Co-salient Features

  • Conference paper
  • First Online:
Machine Learning for Cyber Security (ML4CS 2020)

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12488)

Included in the following conference series:

  • ML4CS: International Conference on Machine Learning for Cyber Security

Abstract

Spontaneous facial expression recognition has gained much attention from researchers in recent years; however, most existing algorithms still encounter performance bottlenecks because of the large amount of redundant image data in video. In this paper, we propose a novel co-salient facial feature extraction algorithm that combines the human visual attention mechanism with group-wise co-processing of image data, which largely reduces the redundant information in the original images and effectively improves the recognition accuracy of facial expressions. Firstly, based on the human visual mechanism, key expression frames are dynamically extracted from the original videos to capture the temporal dynamics of facial expressions. Secondly, salient regions are obtained from the key frame sequence by a multiplicative fusion algorithm operating on multiple images cooperatively. Thirdly, we discard the salient regions that show little deformation and low correlation to facial expressions, which reduces the amount of facial feature data. Finally, we extract Local Binary Pattern (LBP) features from the remaining facial regions and classify them with a Support Vector Machine (SVM) classifier. Experimental results on the Extended Cohn-Kanade (CK+) and MMI datasets show that the proposed method effectively improves the recognition accuracy of spontaneous expression sequences.
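
The final stage of the pipeline, LBP feature extraction followed by SVM classification, can be illustrated with the minimal sketch below. This is not the authors' implementation: it assumes the earlier key-frame and co-saliency steps have already produced grayscale face-region crops as NumPy arrays, and it uses scikit-image's local_binary_pattern together with scikit-learn's SVC; the LBP parameters (8 neighbours, radius 1, uniform patterns) and the RBF kernel are common defaults rather than the settings reported in the paper.

```python
# Minimal sketch of the LBP + SVM classification stage described in the abstract.
# Assumes grayscale face-region crops (uint8 NumPy arrays) and integer expression
# labels are already available; all parameters are illustrative, not the paper's.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

P, R = 8, 1  # LBP neighbours and radius (assumed values)


def lbp_histogram(region: np.ndarray) -> np.ndarray:
    """Uniform-LBP histogram of one facial region, normalised to unit sum."""
    codes = local_binary_pattern(region, P, R, method="uniform")
    n_bins = P + 2  # "uniform" LBP produces P + 2 distinct codes
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist


def features_from_regions(regions):
    """Concatenate per-region LBP histograms into one feature vector."""
    return np.concatenate([lbp_histogram(r) for r in regions])


def train_classifier(train_regions, train_labels):
    """Fit an SVM on LBP features; each sample is a list of retained regions."""
    X = np.stack([features_from_regions(sample) for sample in train_regions])
    y = np.asarray(train_labels)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    clf.fit(X, y)
    return clf
```

In this sketch each sample contributes one histogram per retained salient region, so every sample must keep the same number of regions in the same order for the concatenated feature vectors to be comparable across the dataset.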



Acknowledgements

This work was funded by the Scientific Project of the Guangdong Provincial Transport Department (No. Sci & Tec-2016-02-30) and the Surface Projects of the Natural Science Foundation of Guangdong Province (Nos. 2016A030313703 and 2016A030313713).

Author information


Corresponding author

Correspondence to Wenchao Jiang.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, L., Ji, Q., Jiang, W., Ning, D. (2020). Spontaneous Expression Recognition Based on Visual Attention Mechanism and Co-salient Features. In: Chen, X., Yan, H., Yan, Q., Zhang, X. (eds) Machine Learning for Cyber Security. ML4CS 2020. Lecture Notes in Computer Science, vol 12488. Springer, Cham. https://doi.org/10.1007/978-3-030-62463-7_26


  • DOI: https://doi.org/10.1007/978-3-030-62463-7_26

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-62462-0

  • Online ISBN: 978-3-030-62463-7

  • eBook Packages: Computer Science, Computer Science (R0)
