Abstract
Human gait is the manner in which an individual walks, and observers can infer useful information from everyday walking activities. Recently, emotion recognition based on gait skeletons has attracted much attention, and many methods have been proposed. Skeleton-based representations offer several advantages for recognition tasks: they are extremely lightweight and can be extracted directly from video data using off-the-shelf algorithms. Moreover, skeleton data is not tied to any specific cultural or ethnic context, which has made it increasingly popular in recent years for cross-cultural studies and related applications. To process this type of data effectively, many researchers have turned to Graph Convolutional Networks (GCNs), which leverage the topological structure of the data and improve performance by modeling the relationships between joints and body parts as a graph, allowing them to capture complex spatial and temporal patterns. In this work, we construct an efficient multi-stream GCN framework for the emotion recognition task. We exploit the complementary effect among streams using a multi-thread attention (MTA) method, which improves emotion recognition performance. In addition, the proposed MTA graph convolution layer extracts effective features from the topology of the graph to further improve recognition performance. The proposed method outperforms state-of-the-art methods on a challenging benchmark dataset.
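The abstract's core building block, modeling skeleton joints as graph nodes and aggregating neighbor features with a graph convolution, can be illustrated with a minimal sketch. This is not the paper's actual MTA layer; it is a generic single spatial graph convolution in the spirit of ST-GCN-style models, with an illustrative 5-joint toy skeleton and feature sizes chosen arbitrarily for the example.

```python
import numpy as np

def normalize_adjacency(A):
    """Symmetrically normalize adjacency with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def graph_conv(X, A_norm, W):
    """One spatial graph-convolution step: aggregate neighbor
    features along the skeleton graph, then apply a learnable
    linear map and a ReLU.
    X: (num_joints, in_channels), W: (in_channels, out_channels)."""
    return np.maximum(A_norm @ X @ W, 0.0)

# Toy 5-joint skeleton (a small chain with one branch); edges and
# joint count are illustrative assumptions, not the paper's graph.
edges = [(0, 1), (1, 2), (2, 3), (2, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A_norm = normalize_adjacency(A)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))   # 3-D joint coordinates as input features
W = rng.standard_normal((3, 8))   # project to 8 feature channels
out = graph_conv(X, A_norm, W)
print(out.shape)                  # (5, 8): one 8-d feature per joint
```

A multi-stream framework like the one described above would run several such stacks in parallel (e.g. on joint positions, velocities, and bone vectors) and fuse their features, which is where an attention mechanism over streams can weigh their complementary contributions.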
Acknowledgements
This research has been supported by the National Key Research and Development Project of China (Grant No. 2021ZD0110505), the Natural Key Research and Development Project of Zhejiang Province (Grant No. 2023C01043), and the Ningbo Natural Science Foundation (Grant Nos. 2022Z072 and 2023Z236).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Lu, J., Wang, Z., Zhang, Z., Du, Y., Zhou, Y., Wang, Z. (2024). Emotion Recognition via 3D Skeleton Based Gait Analysis Using Multi-thread Attention Graph Convolutional Networks. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14429. Springer, Singapore. https://doi.org/10.1007/978-981-99-8469-5_6
Print ISBN: 978-981-99-8468-8
Online ISBN: 978-981-99-8469-5