Abstract
Hand-crafted descriptors are currently widely used for human action recognition in video. However, they are not optimized and may lack discriminative information. To compensate for this drawback, this paper presents a learning-based 3D compact binary descriptor (3D-CBD) for representing human action videos. The proposed descriptor is a 3D extension of the compact binary face descriptor (CBFD). Given a video sequence, we first extract pixel difference vectors (PDVs) in local volumes and then learn a feature mapping that projects these PDVs into low-dimensional binary vectors. Finally, we cluster these binary codes and pool them into a histogram feature that represents the video sequence. Experimental results on two action datasets (KTH and WEIZMANN) demonstrate the effectiveness of the proposed descriptor.
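The three steps outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: PCA followed by sign thresholding stands in for the learned feature mapping, a plain k-means stands in for the clustering step, and all function names, the 3x3x3 neighborhood, the number of bits, and the codebook size are assumptions chosen for illustration.

```python
import numpy as np


def extract_pdvs(video, radius=1):
    """Return one pixel difference vector per interior voxel of a (T, H, W) array."""
    video = np.asarray(video, dtype=np.float64)
    r = radius
    T, H, W = video.shape
    center = video[r:T - r, r:H - r, r:W - r]
    diffs = []
    for dt in range(-r, r + 1):
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dt == 0 and dy == 0 and dx == 0:
                    continue  # skip the center voxel itself
                shifted = video[r + dt:T - r + dt,
                                r + dy:H - r + dy,
                                r + dx:W - r + dx]
                diffs.append((shifted - center).ravel())
    return np.stack(diffs, axis=1)  # shape (N, (2r+1)**3 - 1)


def learn_projection(pdvs, n_bits=8):
    """PCA stand-in (assumption) for the learned mapping; returns (mean, W)."""
    mean = pdvs.mean(axis=0)
    # Rows of vt are the principal directions of the centered PDVs.
    _, _, vt = np.linalg.svd(pdvs - mean, full_matrices=False)
    return mean, vt[:n_bits].T


def binarize(pdvs, mean, W):
    """Project PDVs and threshold at zero to obtain binary codes."""
    return ((pdvs - mean) @ W > 0).astype(np.uint8)


def nearest_center(codes, centers):
    """Index of the closest center (squared Euclidean distance) for each code."""
    X = codes.astype(np.float64)
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def kmeans(codes, k, n_iter=10, seed=0):
    """Plain Lloyd's k-means on the binary codes (assumed clustering step)."""
    rng = np.random.default_rng(seed)
    X = codes.astype(np.float64)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest_center(X, centers)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave a center unchanged if its cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def pool_histogram(codes, centers):
    """Normalized histogram of codeword assignments for one video."""
    labels = nearest_center(codes, centers)
    hist = np.bincount(labels, minlength=len(centers)).astype(np.float64)
    return hist / hist.sum()


# Toy usage on a random "video" of 6 frames of 8x8 pixels.
video = np.random.default_rng(0).random((6, 8, 8))
pdvs = extract_pdvs(video)
mean, W = learn_projection(pdvs, n_bits=8)
codes = binarize(pdvs, mean, W)
centers = kmeans(codes, k=4)
hist = pool_histogram(codes, centers)
```

In practice the projection and the codebook would be learned on training videos and then reused to encode each test video, with the resulting histograms passed to a classifier.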
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Huang, D., Li, X., Li, H., Zheng, W. (2015). Learning 3D Compact Binary Descriptor for Human Action Recognition in Video. In: Yang, J., Yang, J., Sun, Z., Shan, S., Zheng, W., Feng, J. (eds) Biometric Recognition. CCBR 2015. Lecture Notes in Computer Science, vol 9428. Springer, Cham. https://doi.org/10.1007/978-3-319-25417-3_72
DOI: https://doi.org/10.1007/978-3-319-25417-3_72
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25416-6
Online ISBN: 978-3-319-25417-3
eBook Packages: Computer Science, Computer Science (R0)