MCTD: Motion-Coordinate-Time Descriptor for 3D Skeleton-Based Action Recognition

  • Conference paper
  • Proceedings: Advances in Multimedia Information Processing – PCM 2017 (PCM 2017)
  • Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10735)

Abstract

During the past few years, 3D skeleton-based action recognition has received increasing research attention, and numerous approaches have been proposed. Most existing approaches extract pose features from each frame of the video sequence to recognize different actions; however, the motion information between adjacent poses is lost. In this paper, we propose a new descriptor that employs motion for action recognition. In our approach, the Lie algebra is used to extract the motion between neighboring poses, and the spatial coordinates and timestamps are used to describe the spatio-temporal distribution of motion across the video sequence. For classification, we modify the SVM kernel to measure the distance between different action instances. Experiments on three common datasets show that the proposed descriptor outperforms state-of-the-art approaches.
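To make the descriptor construction concrete, below is a minimal sketch (not the authors' implementation) of how a per-joint feature could combine a Lie-algebra motion term between adjacent poses with the joint's spatial coordinate and a normalized timestamp. The rotation-matrix input, the helper names (relative_motion, mctd_frame_descriptor), and the exact feature layout are illustrative assumptions; the paper's actual descriptor and its modified SVM kernel are defined in the full text.

    # Minimal sketch of the Motion-Coordinate-Time idea, under the assumption
    # that each frame provides a rotation matrix per body part (e.g. estimated
    # from bone directions) plus 3D joint coordinates.  Helper names and the
    # feature layout are illustrative, not the authors' code.
    import numpy as np
    from scipy.spatial.transform import Rotation


    def relative_motion(R_prev: np.ndarray, R_next: np.ndarray) -> np.ndarray:
        """Map the relative rotation between two adjacent poses to the Lie
        algebra so(3) as a 3-vector (axis-angle) via the matrix logarithm."""
        R_rel = R_prev.T @ R_next                       # rotation taking pose t to pose t+1
        return Rotation.from_matrix(R_rel).as_rotvec()  # log map: SO(3) -> so(3)


    def mctd_frame_descriptor(R_prev, R_next, joint_xyz, t, num_frames):
        """Concatenate motion (Lie-algebra vector), spatial coordinate, and a
        normalized timestamp into one per-joint, per-frame feature."""
        motion = relative_motion(R_prev, R_next)
        time = np.array([t / max(num_frames - 1, 1)])   # timestamp scaled to [0, 1]
        return np.concatenate([motion, joint_xyz, time])


    if __name__ == "__main__":
        # Toy example: a body part rotating 10 degrees about the z-axis between frames.
        R_t = np.eye(3)
        R_t1 = Rotation.from_euler("z", 10, degrees=True).as_matrix()
        joint = np.array([0.1, 0.5, 1.2])               # 3D coordinate of the joint
        print(mctd_frame_descriptor(R_t, R_t1, joint, t=3, num_frames=30))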

Acknowledgments

The work described in this paper was supported by the National Natural Science Foundation of China (No. 61103127 and No. 61375016).

Author information

Corresponding author

Correspondence to Feng Wang.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Liang, Q., Wang, F. (2018). MCTD: Motion-Coordinate-Time Descriptor for 3D Skeleton-Based Action Recognition. In: Zeng, B., Huang, Q., El Saddik, A., Li, H., Jiang, S., Fan, X. (eds) Advances in Multimedia Information Processing – PCM 2017. PCM 2017. Lecture Notes in Computer Science, vol. 10735. Springer, Cham. https://doi.org/10.1007/978-3-319-77380-3_55

  • DOI: https://doi.org/10.1007/978-3-319-77380-3_55

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77379-7

  • Online ISBN: 978-3-319-77380-3

  • eBook Packages: Computer Science, Computer Science (R0)
