Lightweight Multispectral Skeleton and Multi-stream Graph Attention Networks for Enhanced Action Prediction with Multiple Modalities

  • Conference paper
Pattern Recognition and Computer Vision (PRCV 2023)

Abstract

Human action recognition methods often focus on extracting structural and temporal information from skeleton-based graphs. However, these approaches struggle to capture and process the extensive information present during action transitions. To overcome this limitation, we propose LMS-GAT, a novel approach that facilitates information exchange through node concentration and diffusion across the structural and temporal dimensions. By selectively suppressing and reinstating the representations of structural nodes for each specific action, and by using hierarchical shifted temporal windows to assess temporal information, LMS-GAT addresses the challenge of dynamic changes in action recognition. Experimental evaluation on the NTU RGB+D 60 and NTU RGB+D 120 datasets shows that LMS-GAT outperforms state-of-the-art methods in prediction accuracy.
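The abstract does not say how the hierarchical shifted temporal windows are built, so the following is only an illustrative sketch and not the authors' implementation. It assumes a Swin-Transformer-style scheme: frames are split into fixed-size windows, a second partition is formed after a cyclic shift of half a window, and the window size doubles at each level. The function names, the doubling rule, and the half-window shift are all hypothetical choices made for this example.

```python
def shifted_windows(num_frames, window, shift=0):
    """Split frame indices 0..num_frames-1 into consecutive windows of
    `window` frames after cyclically shifting the sequence by `shift`.
    (Assumed scheme; the paper's exact partitioning is not given here.)"""
    order = [(i + shift) % num_frames for i in range(num_frames)]
    return [order[s:s + window] for s in range(0, num_frames, window)]


def hierarchical_shifted_windows(num_frames, base_window, levels):
    """Build one regular and one half-window-shifted partition per level,
    doubling the window size from level to level (hypothetical rule)."""
    plan = []
    window = base_window
    for _ in range(levels):
        window = min(window, num_frames)  # never exceed the clip length
        plan.append({
            "window": window,
            "regular": shifted_windows(num_frames, window, 0),
            "shifted": shifted_windows(num_frames, window, window // 2),
        })
        window *= 2
    return plan


# Example: an 8-frame clip, windows of 2 frames, then 4 frames.
plan = hierarchical_shifted_windows(num_frames=8, base_window=2, levels=2)
for level in plan:
    print(level["window"], level["regular"], level["shifted"])
```

In a scheme like this, the shifted partition lets frames on either side of a regular window boundary fall into the same window, which is one plausible way for temporal information to cross boundaries during an action transition.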

Supported by the National Natural Science Foundation of China under Grants 62002074 and 62072452, by the Shenzhen Science and Technology Program under Grant JCYJ20200109115627045, and in part by the Regional Joint Fund of Guangdong under Grant 2021B1515120011.

Author information

Corresponding authors

Correspondence to Teng Huang or Xi Zhang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Huang, T., Kong, W., Liang, J., Ding, Z., Li, H., Zhang, X. (2024). Lightweight Multispectral Skeleton and Multi-stream Graph Attention Networks for Enhanced Action Prediction with Multiple Modalities. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14425. Springer, Singapore. https://doi.org/10.1007/978-981-99-8429-9_6

  • DOI: https://doi.org/10.1007/978-981-99-8429-9_6

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8428-2

  • Online ISBN: 978-981-99-8429-9

  • eBook Packages: Computer Science, Computer Science (R0)
