Abstract
Spontaneous facial expression recognition has attracted considerable attention in recent years, yet most existing algorithms still face performance bottlenecks caused by the large amount of redundant image data in video. In this paper, we propose a novel co-salient facial feature extraction algorithm that combines the human visual attention mechanism with group-based co-processing of image data, greatly reducing the redundant information in the original images and effectively improving facial expression recognition accuracy. First, based on the human visual mechanism, expression key frames are dynamically extracted from the original videos to capture the temporal dynamics of facial expressions. Second, salient regions are obtained from the key-frame sequence by a multiplicative fusion algorithm in a multi-image cooperative manner. Third, we discard those salient regions that show little deformation and low correlation with facial expressions, reducing the amount of facial feature data. Finally, we extract Local Binary Pattern (LBP) features from the remaining facial regions and classify them with a Support Vector Machine (SVM). Experimental results on the Extended Cohn-Kanade (CK+) and MMI datasets show that the proposed method effectively improves the recognition accuracy of spontaneous expression sequences.
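The final stage of the pipeline (LBP features fed to an SVM classifier) can be sketched roughly as follows. This is a minimal illustration only: the key-frame selection and co-saliency stages are omitted, the image patches are synthetic stand-ins for extracted facial regions, and the parameter choices (8 neighbors, radius 1, uniform patterns, RBF kernel) are common defaults, not the paper's reported settings.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray, P=8, R=1.0):
    """Uniform LBP codes for a grayscale patch, summarized as a
    normalized histogram that serves as the feature vector."""
    codes = local_binary_pattern(gray, P, R, method="uniform")
    n_bins = P + 2  # uniform LBP with P neighbors yields P + 2 distinct codes
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins), density=True)
    return hist

# Toy stand-in data: random 32x32 "facial region" patches, two expression classes.
rng = np.random.default_rng(0)
patches = [(rng.random((32, 32)) * 255).astype(np.uint8) for _ in range(40)]
X = np.stack([lbp_histogram(p) for p in patches])
y = np.array([0] * 20 + [1] * 20)

# One SVM over the LBP histograms; the paper classifies each region's
# features "respectively", which would mean one such classifier per region.
clf = SVC(kernel="rbf").fit(X, y)
pred = clf.predict(X)
```

In practice each retained co-salient region would contribute its own LBP histogram, and the histograms could either be concatenated or classified separately and fused.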
Acknowledgements
This work was funded by the Scientific Project of the Guangdong Provincial Transport Department (No. Sci & Tec-2016-02-30) and the Surface Project of the Natural Science Foundation of Guangdong Province (Nos. 2016A030313703 and 2016A030313713).
© 2020 Springer Nature Switzerland AG
Cite this paper
Zhang, L., Ji, Q., Jiang, W., Ning, D. (2020). Spontaneous Expression Recognition Based on Visual Attention Mechanism and Co-salient Features. In: Chen, X., Yan, H., Yan, Q., Zhang, X. (eds) Machine Learning for Cyber Security. ML4CS 2020. Lecture Notes in Computer Science(), vol 12488. Springer, Cham. https://doi.org/10.1007/978-3-030-62463-7_26
Print ISBN: 978-3-030-62462-0
Online ISBN: 978-3-030-62463-7