Action Tree Convolutional Networks: Skeleton-Based Human Action Recognition

Liu, Wenjie; Zhang, Ziyi; Han, Bing; Zhu, Chenhui

doi:10.1007/978-3-030-00764-5_72


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11166)

Included in the following conference series:

Pacific Rim Conference on Multimedia

Abstract

This paper addresses three problems in skeleton-based human action recognition: existing methods ignore the structure of, and relationships between, skeleton joints and body parts; activity data contain a large amount of useless information; and generalization ability is poor. To overcome these shortcomings of existing mainstream methods, we propose a novel method named Action Tree Convolutional Networks (ATCNs). The method uses a data-driven, automatically designed Action Tree network to dynamically generate a tree of nodes/body parts and a semantic attention center, emphasizing the relations and semantics of nodes/body parts. It remedies the neglect of node/body-part relations in previous algorithms and improves generalization ability. In experiments on the Kinetics and NTU-RGB+D datasets, our method achieves better performance than other state-of-the-art methods.
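
The abstract does not describe how the tree is generated, so the following is only a minimal sketch of the general idea of convolving joint features over a tree of body parts. It assumes a hand-written five-joint tree (the names `JOINTS` and `PARENT` are hypothetical) and uses the standard normalized graph convolution of Kipf and Welling rather than the authors' Action Tree network, which builds its tree from data and adds a semantic attention center.

```python
import numpy as np

# Hypothetical 5-joint skeleton written as a tree: PARENT[i] is the parent of
# joint i, and -1 marks the root. ATCNs generate their tree from data; this
# fixed layout is only an illustration.
JOINTS = ["torso", "head", "left_hand", "right_hand", "hip"]
PARENT = [-1, 0, 0, 0, 0]


def tree_adjacency(parent):
    """Return the symmetric adjacency matrix of a tree, with self-loops."""
    n = len(parent)
    adj = np.eye(n)
    for child, par in enumerate(parent):
        if par >= 0:
            adj[child, par] = 1.0
            adj[par, child] = 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of an adjacency matrix."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(features, adj_norm, weight):
    """One graph-convolution step followed by ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


rng = np.random.default_rng(0)
features = rng.normal(size=(len(JOINTS), 3))  # e.g. 3D coordinates per joint
weight = rng.normal(size=(3, 8))              # random stand-in for learned weights
adj_norm = normalize(tree_adjacency(PARENT))
out = graph_conv(features, adj_norm, weight)
print(out.shape)
```

Each output row mixes a joint's features with those of its tree neighbors, which is the sense in which the tree structure shapes the learned representation. A data-driven method would replace the fixed `PARENT` list with a learned structure.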

Z. Zhang, B. Han and C. Zhu—These authors contributed equally to this work.

Author information

Authors and Affiliations

University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China
Wenjie Liu, Ziyi Zhang, Bing Han & Chenhui Zhu

Corresponding author

Correspondence to Wenjie Liu.

Editor information

Editors and Affiliations

Hefei University of Technology, Hefei, China
Richang Hong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Huang Cheng
University of Tokyo, Tokyo, Japan
Toshihiko Yamasaki
Hefei University of Technology, Hefei, China
Meng Wang
City University of Hong Kong, Hong Kong, Hong Kong
Chong-Wah Ngo

About this paper

Cite this paper

Liu, W., Zhang, Z., Han, B., Zhu, C. (2018). Action Tree Convolutional Networks: Skeleton-Based Human Action Recognition. In: Hong, R., Cheng, WH., Yamasaki, T., Wang, M., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2018. PCM 2018. Lecture Notes in Computer Science, vol 11166. Springer, Cham. https://doi.org/10.1007/978-3-030-00764-5_72

DOI: https://doi.org/10.1007/978-3-030-00764-5_72
Published: 18 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00763-8
Online ISBN: 978-3-030-00764-5
eBook Packages: Computer Science, Computer Science (R0)
