Action recognition using 3D DAISY descriptor

Cao, Xiaochun; Zhang, Hua; Deng, Chao; Liu, Qiguang; Liu, Hanyu

doi:10.1007/s00138-013-0545-6

Original Paper
Published: 18 October 2013

Volume 25, pages 159–171, (2014)
Machine Vision and Applications

Xiaochun Cao¹,²,
Hua Zhang¹,
Chao Deng¹,
Qiguang Liu³ &
…
Hanyu Liu⁴

Abstract

In this paper we propose a novel spatial-temporal descriptor for action recognition. We extend a recent image local descriptor, DAISY, to three dimensions to deal with the information in the additional temporal domain in videos. The new 3D DAISY descriptor is both functionally discriminative and computationally efficient. We use the bag-of-words framework and non-linear SVM for classification. The experiments on public action datasets, KTH, WEIZMANN, YouTube, and UT-Interaction, demonstrate the promising results of our method.
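
The bag-of-words step named in the abstract can be illustrated with a minimal sketch. It assumes that local descriptors (for example, 3D DAISY vectors) have already been extracted and that a codebook has already been learned. The function names, the toy two-dimensional descriptors, and the exponential chi-square kernel are illustrative assumptions; the abstract states only that a bag-of-words framework and a non-linear SVM are used, not which kernel or codebook size.

```python
import math


def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to `descriptor` (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors, codebook):
    """Quantize each descriptor to a codeword and return an L1-normalized histogram."""
    counts = [0.0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(counts)
    # An empty video yields an all-zero histogram instead of dividing by zero.
    return [c / total for c in counts] if total else counts


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-square kernel (an assumed choice of non-linear kernel)."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * dist)


# Hypothetical toy data: a two-word codebook and four 2-D descriptors from one video.
codebook = [[0.0, 0.0], [1.0, 1.0]]
video_descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 1.2], [0.0, 0.2]]
histogram = bow_histogram(video_descriptors, codebook)
```

Each video is then represented by one histogram, and a kernel matrix built from `chi2_kernel` over all pairs of videos could be passed to an SVM implementation that accepts precomputed kernels.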

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61332012), the National Basic Research Program of China (2013CB329305), the 100 Talents Programme of the Chinese Academy of Sciences, and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030601).

Author information

Authors and Affiliations

School of Computer Science and Technology, Tianjin University, Tianjin, 300072, China
Xiaochun Cao, Hua Zhang & Chao Deng
State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China
Xiaochun Cao
University of Rochester, Rochester, NY, 14620, USA
Qiguang Liu
State University of New York, Stony Brook, NY, 11790, USA
Hanyu Liu

Corresponding author

Correspondence to Xiaochun Cao.

About this article

Cite this article

Cao, X., Zhang, H., Deng, C. et al. Action recognition using 3D DAISY descriptor. Machine Vision and Applications 25, 159–171 (2014). https://doi.org/10.1007/s00138-013-0545-6

Received: 03 December 2011
Revised: 17 July 2013
Accepted: 09 September 2013
Published: 18 October 2013
Issue Date: January 2014
DOI: https://doi.org/10.1007/s00138-013-0545-6
