A Deep Model Combining Structural Features and Context Cues for Action Recognition in Static Images

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10639)

Abstract

In this paper, we present a deep model for action recognition in static images that combines body structural information with context cues to build a more accurate classifier. To construct more semantic and robust body structural features, we propose a new body descriptor, the limb angle descriptor (LAD), which uses the relative angles between the limbs in the 2D skeleton. We evaluate our method on the PASCAL VOC 2012 Action dataset and compare it with published results. Our method achieves 90.6% mean AP, outperforming previous state-of-the-art approaches.
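The abstract does not give the exact definition of the limb angle descriptor, so the following is only a minimal sketch of the general idea of "relative angles between the limbs in a 2D skeleton". The joint names, the `LIMBS` list, and the choice of a signed angle for every pair of limbs are assumptions made for illustration and are not taken from the paper.

```python
import math
from itertools import combinations

# Hypothetical limb list as (start joint, end joint) pairs.
# The torso is included as a "limb", following the paper's footnote.
LIMBS = [
    ("neck", "hip"),        # torso
    ("shoulder", "elbow"),  # upper arm
    ("elbow", "wrist"),     # forearm
]


def limb_vector(joints, limb):
    """Return the 2D vector pointing from the limb's start joint to its end joint."""
    (x0, y0), (x1, y1) = joints[limb[0]], joints[limb[1]]
    return (x1 - x0, y1 - y0)


def relative_angle(u, v):
    """Signed angle from vector u to vector v, in radians within [-pi, pi]."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return math.atan2(cross, dot)


def limb_angle_features(joints, limbs=LIMBS):
    """Assumed descriptor: the relative angle for every unordered pair of limbs."""
    vectors = [limb_vector(joints, limb) for limb in limbs]
    return [relative_angle(u, v) for u, v in combinations(vectors, 2)]


# Toy skeleton with made-up 2D joint coordinates.
joints = {
    "neck": (0.0, 0.0),
    "hip": (0.0, 2.0),
    "shoulder": (0.0, 0.0),
    "elbow": (1.0, 0.0),
    "wrist": (1.0, 1.0),
}
features = limb_angle_features(joints)
```

One reason angle-based features of this kind are attractive is that relative angles do not change when the skeleton is translated, uniformly scaled, or rotated in the image plane, so the features describe the pose rather than the person's position or size in the image.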


Notes

  1. We refer to part pairs as limbs for clarity, despite the fact that some pairs are not human limbs (e.g., the torso).


Acknowledgements

The work is partly supported by the Beijing Natural Science Foundation (4172054).

Author information

Corresponding author

Correspondence to Kan Li.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wang, X., Li, K., Li, Y. (2017). A Deep Model Combining Structural Features and Context Cues for Action Recognition in Static Images. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_66

  • DOI: https://doi.org/10.1007/978-3-319-70136-3_66

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70135-6

  • Online ISBN: 978-3-319-70136-3

  • eBook Packages: Computer Science, Computer Science (R0)
