
One example based action detection in Hough space

Multimedia Tools and Applications

Abstract

Given a short query video of an action, detecting actions of the same category in a target video is an important research problem. We propose a fast action detection method motivated by the Hough transform. First, we extract HOG features at the corner points of the query video; these corner points serve as interest points. Then, video clips are formed by sliding a window over the query video. For each clip of T frames, the interest points in every frame are matched with the interest points in the first frame, and each matched pair casts a vote in a displacement Hough space. Counting the matched pairs that fall into each cell of this space yields a 2D displacement histogram, so the query video is represented by a sequence of 2D displacement histograms. Next, we divide the regions of the target video that contain motion into video cubes and represent these cubes by displacement histogram sequences in the same way. The matrix cosine similarity is used to compute the similarity between the query video and each video cube; we refer to this step as action matching. Finally, using the action matching results, we precisely localize the action from the locations of the matched interest points. Our key contribution is a simple and fast algorithm that represents actions as displacement histogram sequences. Experiments on challenging datasets with both simple and realistic backgrounds confirm the effectiveness and efficiency of our method.
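To make the representation concrete, below is a minimal NumPy sketch of the two core steps the abstract describes: voting matched interest-point displacements into a 2D Hough-space histogram, and comparing two histogram sequences with the matrix cosine similarity (the Frobenius-inner-product cosine, as in Seo and Milanfar's work). The grid size, displacement range, and brute-force nearest-neighbour HOG matching are illustrative assumptions for this sketch, not the paper's exact settings.

import numpy as np

def displacement_histogram(first_pts, frame_pts, first_desc, frame_desc,
                           grid=(16, 16), max_disp=64.0):
    """Vote matched interest-point displacements into a 2D Hough-space grid.

    first_pts / frame_pts: (N, 2) and (M, 2) arrays of (x, y) corner locations.
    first_desc / frame_desc: corresponding HOG descriptors, one row per point.
    Returns a grid[0] x grid[1] histogram of (dx, dy) displacement votes.
    """
    hist = np.zeros(grid)
    if len(first_pts) == 0 or len(frame_pts) == 0:
        return hist
    # Match each point in the current frame to its nearest first-frame point
    # by HOG descriptor distance (brute-force matching, an assumed choice).
    dists = np.linalg.norm(frame_desc[:, None, :] - first_desc[None, :, :], axis=2)
    nearest = dists.argmin(axis=1)
    disp = frame_pts - first_pts[nearest]            # (dx, dy) per matched pair
    # Quantize displacements in [-max_disp, max_disp] into Hough-space cells.
    cells = ((disp + max_disp) / (2 * max_disp) * np.array(grid)).astype(int)
    cells = np.clip(cells, 0, np.array(grid) - 1)
    for cx, cy in cells:
        hist[cy, cx] += 1                            # one vote per matched pair
    return hist

def matrix_cosine_similarity(A, B):
    """Frobenius-inner-product cosine between two histogram sequences.

    A, B: (T-1, H*W) matrices, one flattened displacement histogram per frame
    of the clip (each frame matched against the first).
    """
    den = np.linalg.norm(A) * np.linalg.norm(B)
    return float(np.sum(A * B) / den) if den > 0 else 0.0

In this sketch, a T-frame clip yields T-1 histograms, which are flattened and stacked row-wise; action matching then reduces to evaluating matrix_cosine_similarity between the query's stack and each video cube's stack and keeping the high-scoring cubes.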



Acknowledgements

This work is supported in part by the 973 National Basic Research Program of China (2010CB732501), the Foundation of Sichuan Excellent Young Talents (09ZQ026-035), and the Fundamental Research Funds for the Central Universities.

Author information

Corresponding author

Correspondence to Lishen Pei.


Cite this article

Pei, L., Ye, M., Xu, P. et al. One example based action detection in Hough space. Multimed Tools Appl 72, 1751–1772 (2014). https://doi.org/10.1007/s11042-013-1478-9
