
Action recognition new framework with robust 3D-TCCHOGAC and 3D-HOOFGAC

Published in: Multimedia Tools and Applications

Abstract

Action recognition is of considerable academic interest in computer vision and has substantial commercial value and broad application prospects. To improve recognition accuracy, this paper proposes two dynamic descriptors based on dense trajectories. First, to capture the local positions where actions occur, dense sampling is restricted to motion regions obtained by constraining and clustering the optical flow. Second, motion corners of the object are selected as feature points and tracked to obtain motion trajectories. Finally, gradient information and optical flow gradient information are extracted from the video cube centered on each trajectory, then auto-correlation and normalization are applied to both to obtain two dynamic descriptors, named 3D histograms of oriented gradients in trajectory-centered cube auto-correlation (3D-TCCHOGAC) and 3D histograms of oriented optical flow gradients auto-correlation (3D-HOOFGAC), which resist a certain degree of interference caused by camera motion and complex backgrounds. However, the diversity of realistic videos means that neither dynamic nor static descriptors alone can achieve accurate classification, so a new framework is proposed in which dynamic and static descriptors are fused and complement each other to further improve recognition accuracy. With leave-one-out cross validation, the framework achieves recognition accuracies of 100% on Weizmann and 96.00% on UCF-Sports; with four-fold cross validation, it achieves 97.17% on KTH and 88.23% on YouTube, outperforming the referenced methods.
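
As a concrete illustration of the trajectory-centered descriptor idea, the sketch below uses off-the-shelf components: Farnebäck dense optical flow and Shi-Tomasi corners stand in for the paper's motion-region sampling and motion-corner selection, and a simplified 3D gradient-orientation histogram with auto-correlation stands in for 3D-TCCHOGAC. The cube shape, bin count, and normalization are illustrative assumptions, not the authors' settings.

```python
# Minimal sketch of a trajectory-centered cube descriptor (not the authors' implementation).
import cv2
import numpy as np

def track_corners(frames, track_len=15):
    """Track corner points through `frames` with dense optical flow; return trajectories."""
    gray = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    pts = cv2.goodFeaturesToTrack(gray[0], maxCorners=200, qualityLevel=0.01, minDistance=5)
    trajs = [[tuple(p.ravel())] for p in pts] if pts is not None else []
    for t in range(min(track_len, len(gray) - 1)):
        flow = cv2.calcOpticalFlowFarneback(gray[t], gray[t + 1], None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        for traj in trajs:
            x, y = traj[-1]
            xi, yi = int(round(x)), int(round(y))
            if 0 <= yi < flow.shape[0] and 0 <= xi < flow.shape[1]:
                dx, dy = flow[yi, xi]          # displace the point by the local flow vector
                traj.append((x + dx, y + dy))
    return trajs

def cube_descriptor(volume, n_bins=8):
    """Simplified trajectory-centered cube descriptor: orientation histogram of 3D
    gradients, followed by auto-correlation and L2 normalization."""
    gt, gy, gx = np.gradient(volume.astype(np.float32))   # volume shape: (T, H, W)
    mag = np.sqrt(gx ** 2 + gy ** 2 + gt ** 2)
    ang = np.arctan2(gy, gx)                               # spatial orientation only, for brevity
    hist, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi), weights=mag)
    ac = np.correlate(hist, hist, mode='full')             # auto-correlation of the histogram
    return ac / (np.linalg.norm(ac) + 1e-8)                # normalization
```

Applying the same histogram and auto-correlation step to gradients of the optical flow field, rather than of the image intensities, would mirror the 3D-HOOFGAC idea.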
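
The evaluation protocol can be sketched in the same hedged spirit: the snippet below fuses dynamic and static descriptors by simple concatenation and scores the result with leave-one-out and four-fold cross validation using an SVM, as in the experiments above. The random feature matrices, dimensionalities, and SVM parameters are placeholders, not the paper's configuration.

```python
# Illustrative fusion and cross-validation sketch (placeholder data, not the paper's pipeline).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
n_videos, n_classes = 90, 9                         # toy problem roughly Weizmann-sized
dynamic_feats = rng.normal(size=(n_videos, 128))    # stand-in for 3D-TCCHOGAC / 3D-HOOFGAC codes
static_feats = rng.normal(size=(n_videos, 64))      # stand-in for static appearance codes
labels = np.repeat(np.arange(n_classes), n_videos // n_classes)

fused = np.hstack([dynamic_feats, static_feats])    # simple early fusion by concatenation
clf = SVC(kernel='rbf', C=10.0, gamma='scale')

loo_acc = cross_val_score(clf, fused, labels, cv=LeaveOneOut()).mean()
kfold_acc = cross_val_score(clf, fused, labels,
                            cv=StratifiedKFold(n_splits=4, shuffle=True, random_state=0)).mean()
print(f"leave-one-out accuracy: {loo_acc:.3f}, 4-fold accuracy: {kfold_acc:.3f}")
```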

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61072110) and the Science and Technology Overall Innovation Project of Shaanxi Province (Grant No. 2013KTZB03-03-03).

Author information

Corresponding author

Correspondence to Ming Tong.


About this article


Cite this article

Tong, M., Wang, H., Tian, W. et al. Action recognition new framework with robust 3D-TCCHOGAC and 3D-HOOFGAC. Multimed Tools Appl 76, 3011–3030 (2017). https://doi.org/10.1007/s11042-016-3279-4
