Learning 3D Compact Binary Descriptor for Human Action Recognition in Video

  • Conference paper
Biometric Recognition (CCBR 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9428)

Abstract

Hand-crafted descriptors are currently widely used for human action recognition in video. However, they are not optimized and may lack discriminative information. To compensate for this drawback, this paper presents a learning-based 3D compact binary descriptor (3D-CBD) for representing human action videos. The proposed descriptor is a 3D extension of the compact binary face descriptor (CBFD). Given a video sequence, we first extract pixel difference vectors (PDVs) in local volumes and then learn a feature mapping that projects these PDVs into low-dimensional binary vectors. Finally, we cluster these binary codes and pool them into a histogram feature that represents the video sequence. Experimental results on two action datasets (KTH and WEIZMANN) demonstrate the effectiveness of the proposed descriptor.
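The three stages named in the abstract (PDV extraction, binary projection, clustering and pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not state the learning objective, so PCA followed by sign thresholding is used here purely as a placeholder for the learned feature mapping, and a plain k-means stands in for the clustering step. All function names and parameters (`extract_pdvs`, `learn_projection`, `binarize`, `kmeans`, `video_histogram`, `radius`, `n_bits`, `k`) are hypothetical.

```python
import numpy as np


def extract_pdvs(video, radius=1):
    """One pixel difference vector (PDV) per interior voxel of a (T, H, W) video.

    Each PDV holds neighbour-minus-centre differences over a cubic local
    volume of side 2*radius + 1 (26 neighbours when radius is 1).
    """
    video = np.asarray(video, dtype=np.float64)
    r = radius
    T, H, W = video.shape
    centre = video[r:T - r, r:H - r, r:W - r]
    diffs = []
    for dt in range(-r, r + 1):
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dt == 0 and dy == 0 and dx == 0:
                    continue  # skip the centre voxel itself
                neighbour = video[r + dt:T - r + dt,
                                  r + dy:H - r + dy,
                                  r + dx:W - r + dx]
                diffs.append((neighbour - centre).ravel())
    return np.stack(diffs, axis=1)  # shape: (num_voxels, num_neighbours)


def learn_projection(pdvs, n_bits=8):
    """Placeholder mapping (assumption): top PCA directions of the PDVs."""
    mean = pdvs.mean(axis=0)
    _, _, vt = np.linalg.svd(pdvs - mean, full_matrices=False)
    return mean, vt[:n_bits].T  # projection shape: (num_neighbours, n_bits)


def binarize(pdvs, mean, proj):
    """Project PDVs and threshold at zero to obtain binary codes."""
    return ((pdvs - mean) @ proj > 0).astype(np.uint8)


def assign(x, centres):
    """Index of the nearest centre (squared Euclidean distance) per row."""
    d = ((x[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(codes, k, n_iter=10, seed=0):
    """Plain k-means on the binary codes, used here to build a codebook."""
    rng = np.random.default_rng(seed)
    x = codes.astype(np.float64)
    centres = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(x, centres)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centre if a cluster is empty
                centres[j] = members.mean(axis=0)
    return centres


def video_histogram(codes, centres):
    """Pool codes into a normalized histogram over the codebook entries."""
    labels = assign(codes.astype(np.float64), centres)
    hist = np.bincount(labels, minlength=len(centres)).astype(np.float64)
    return hist / hist.sum()


# Toy usage on a random clip; real input would be grayscale video frames.
video = np.random.default_rng(0).integers(0, 256, size=(6, 10, 10))
pdvs = extract_pdvs(video)
mean, proj = learn_projection(pdvs, n_bits=8)
codes = binarize(pdvs, mean, proj)
centres = kmeans(codes, k=4)
hist = video_histogram(codes, centres)
```

In this sketch the projection and the codebook are fitted on the same clip for brevity. In a recognition setting they would presumably be learned on training videos and then applied unchanged to each test video before a classifier is trained on the histograms.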

Author information

Corresponding author

Correspondence to Dongcheng Huang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Huang, D., Li, X., Li, H., Zheng, W. (2015). Learning 3D Compact Binary Descriptor for Human Action Recognition in Video. In: Yang, J., Yang, J., Sun, Z., Shan, S., Zheng, W., Feng, J. (eds) Biometric Recognition. CCBR 2015. Lecture Notes in Computer Science, vol 9428. Springer, Cham. https://doi.org/10.1007/978-3-319-25417-3_72

  • DOI: https://doi.org/10.1007/978-3-319-25417-3_72

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25416-6

  • Online ISBN: 978-3-319-25417-3

  • eBook Packages: Computer Science, Computer Science (R0)
