Abstract
Gait is a distinctive human trait that can be recognized from a distance and has been widely used for emotion recognition. In this study, we propose a novel dual-stream global-local model (GLM) for gait emotion recognition that combines the strengths of global and local features. We extract skeletal keypoint gait data from walking videos and process them into inputs for two feature-extraction streams, which capture global and local characteristics, respectively. To enhance these features and improve recognition accuracy, we further introduce an attention-based feature-fusion module. Experiments on benchmark datasets show that the proposed model achieves high accuracy in recognizing emotions from gait data.
This work was supported by the National Key R&D Program of China (2022YFC3803202), the Major Project of Anhui Province under Grant 202203a05020011, and the General Program of the National Natural Science Foundation of China (61976078).
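To make the described pipeline concrete, below is a minimal PyTorch sketch of a dual-stream network with attention-based feature fusion, in the spirit of the abstract. It is not the authors' implementation: the stream architectures, feature dimensions, joint counts, and the assumption of four emotion classes are all illustrative choices.

```python
# A minimal sketch of the dual-stream idea described in the abstract:
# one stream sees the full skeleton (global), the other a joint subset
# (local), and an attention module fuses the two feature vectors before
# classification. All details (layer sizes, joint counts, four emotion
# classes) are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class StreamEncoder(nn.Module):
    """Temporal-conv encoder over a skeleton sequence: (B, C, T, V) -> (B, D)."""

    def __init__(self, in_channels: int = 3, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=(9, 1), padding=(4, 0)),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, dim, kernel_size=(9, 1), padding=(4, 0)),
            nn.BatchNorm2d(dim), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool over frames (T) and joints (V)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).flatten(1)


class AttentionFusion(nn.Module):
    """Learn softmax weights over the global and local feature vectors."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, f_global: torch.Tensor, f_local: torch.Tensor) -> torch.Tensor:
        feats = torch.stack([f_global, f_local], dim=1)  # (B, 2, D)
        attn = torch.softmax(self.score(feats), dim=1)   # (B, 2, 1)
        return (attn * feats).sum(dim=1)                 # (B, D)


class DualStreamGaitNet(nn.Module):
    def __init__(self, dim: int = 128, num_classes: int = 4):
        super().__init__()
        self.global_stream = StreamEncoder(dim=dim)
        self.local_stream = StreamEncoder(dim=dim)
        self.fusion = AttentionFusion(dim)
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, x_global: torch.Tensor, x_local: torch.Tensor) -> torch.Tensor:
        fused = self.fusion(self.global_stream(x_global), self.local_stream(x_local))
        return self.classifier(fused)


# Toy usage: 2 clips, 3-D joint coordinates, 48 frames; the "local" input is
# a hypothetical subset of 8 limb joints taken from the 16-joint skeleton.
model = DualStreamGaitNet()
logits = model(torch.randn(2, 3, 48, 16), torch.randn(2, 3, 48, 8))
print(logits.shape)  # torch.Size([2, 4])
```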
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhang, F., Sun, X. (2023). GLM: A Model Based on Global-Local Joint Learning for Emotion Recognition from Gaits Using Dual-Stream Network. In: Lu, H., et al. Image and Graphics. ICIG 2023. Lecture Notes in Computer Science, vol 14355. Springer, Cham. https://doi.org/10.1007/978-3-031-46305-1_16
Print ISBN: 978-3-031-46304-4
Online ISBN: 978-3-031-46305-1