Human Action Recognition from 3D Landmark Points of the Performer

  • Conference paper
Computer Vision and Image Processing (CVIP 2020)

Abstract

Recognizing human actions is an active research area in which the pose of the performer is an important cue. However, using the 3D landmark points of the performer for action recognition is relatively less explored, because extracting 3D landmark points from a single view of the performer is challenging. Recent advances in 3D landmark point detection make it practical to exploit these points for recognizing human action. We propose a technique for human action recognition that learns the 3D landmark points of the human pose obtained from a single image. We apply an autoencoder architecture followed by a regression layer to estimate pose parameters such as shape, gesture and camera position, which are later mapped to the 3D landmark points by the Skinned Multi-Person Linear (SMPL) model. The proposed method is a novel attempt to apply a CNN-based 3D pose reconstruction model (autoencoder) to action recognition. Further, instead of using the autoencoder as a classifier of 3D poses, we replace the decoder part with a regressor to obtain the landmark points. The 3D landmark points of the human performer(s) at each frame are then fed into a neural network classifier as features for recognizing the action.
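The final stage described in the abstract, per-frame 3D landmark points fed to a neural network classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the landmark count, class count, hidden width, mean-centering step and the averaging of per-frame probabilities into one video label are all assumptions made for illustration, and the weights are random rather than trained. The landmarks themselves are synthetic stand-ins for the output of the pose regressor and SMPL mapping.

```python
import numpy as np

N_LANDMARKS = 24   # hypothetical number of 3D landmark points per performer
N_CLASSES = 5      # hypothetical number of action classes
HIDDEN = 32        # hypothetical hidden-layer width


def frame_features(landmarks):
    """Flatten one frame's (N_LANDMARKS, 3) landmarks into a feature vector.

    Centering on the mean point is an assumption made here so that the
    features do not depend on where the performer stands.
    """
    landmarks = np.asarray(landmarks, dtype=np.float64)
    centered = landmarks - landmarks.mean(axis=0, keepdims=True)
    return centered.reshape(-1)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def init_params(rng):
    """Random (untrained) weights for a one-hidden-layer classifier."""
    d = N_LANDMARKS * 3
    return {
        "W1": rng.normal(0.0, 0.1, size=(d, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, size=(HIDDEN, N_CLASSES)),
        "b2": np.zeros(N_CLASSES),
    }


def classify_frames(params, video_landmarks):
    """Return per-frame class probabilities, shape (T, N_CLASSES)."""
    x = np.stack([frame_features(f) for f in video_landmarks])
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])  # ReLU
    return softmax(h @ params["W2"] + params["b2"])


def classify_video(params, video_landmarks):
    """Average per-frame probabilities and pick the most likely class.

    Averaging over frames is an assumed aggregation rule, not one taken
    from the paper.
    """
    probs = classify_frames(params, video_landmarks)
    return int(probs.mean(axis=0).argmax())


rng = np.random.default_rng(0)
params = init_params(rng)
video = rng.normal(size=(8, N_LANDMARKS, 3))  # 8 frames of synthetic landmarks
frame_probs = classify_frames(params, video)
label = classify_video(params, video)
```

In a real pipeline the synthetic `video` array would be replaced by landmarks recovered from each frame, and the weights would be learned from labelled action clips.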

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Mukherjee, S., Nagalakshmi, C. (2021). Human Action Recognition from 3D Landmark Points of the Performer. In: Singh, S.K., Roy, P., Raman, B., Nagabhushan, P. (eds) Computer Vision and Image Processing. CVIP 2020. Communications in Computer and Information Science, vol 1377. Springer, Singapore. https://doi.org/10.1007/978-981-16-1092-9_4

  • DOI: https://doi.org/10.1007/978-981-16-1092-9_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-1091-2

  • Online ISBN: 978-981-16-1092-9

  • eBook Packages: Computer Science (R0)
