A new invariant descriptor for action recognition based on spherical harmonics

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

The aim of this paper is to introduce a new descriptor for the spatio-temporal volume (STV). Human motion is fully represented by the STV (action volume), which is constructed by stacking human silhouettes from consecutive frames, so that it contains both the spatial and the temporal information about an action. The main contribution of this paper is a new affine-invariant action volume descriptor based on a function of spherical harmonic coefficients; that is, the descriptor is invariant under rotation, non-uniform scaling and translation. In the 3D shape analysis literature, there have been a few attempts to use spherical harmonic coefficients to describe a 3D shape. However, those descriptors are only rotation invariant, not affine invariant. In addition, the proposed approach employs a parametric form of spherical harmonics that handles genus-zero surfaces regardless of whether they are stellar or not. Another contribution of this paper is the way the action volume is constructed. We applied the proposed descriptor to the KTH, Weizmann, IXMAS and Robust datasets and compared the performance of our algorithm with competing methods in the literature. The results of our experiments show that our method performs comparably to the most successful recent algorithms.
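The basic idea of stacking silhouettes into an action volume can be illustrated with a minimal sketch. It assumes that each frame is already a binary silhouette mask of identical size, and it performs plain stacking only; the refined volume construction that the paper claims as a contribution is not reproduced here. The names `build_action_volume` and `occupied_voxels` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # one binary silhouette: rows of 0/1 pixels


def build_action_volume(silhouettes: Sequence[Frame]) -> List[List[List[int]]]:
    """Stack equally sized binary silhouettes along a new time axis.

    Assumption: plain stacking, indexed as volume[t][y][x].
    """
    if not silhouettes:
        raise ValueError("at least one silhouette frame is required")
    height, width = len(silhouettes[0]), len(silhouettes[0][0])
    volume = []
    for frame in silhouettes:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same height and width")
        # Copy and binarize so the volume holds only 0 (background) or 1 (body).
        volume.append([[1 if pixel else 0 for pixel in row] for row in frame])
    return volume


def occupied_voxels(volume: List[List[List[int]]]) -> List[Tuple[int, int, int]]:
    """Return the (t, y, x) coordinates of every foreground voxel."""
    return [
        (t, y, x)
        for t, frame in enumerate(volume)
        for y, row in enumerate(frame)
        for x, value in enumerate(row)
        if value
    ]


# Two tiny 2x2 "silhouettes" standing in for consecutive video frames.
frames = [
    [[0, 1], [0, 1]],
    [[1, 1], [0, 0]],
]
volume = build_action_volume(frames)
voxels = occupied_voxels(volume)
```

The foreground voxel coordinates are the kind of 3D point set from which a surface could later be extracted and described; how that surface is parameterized and expanded in spherical harmonics is specific to the paper and is not sketched here.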


Author information

Correspondence to Parvin Razzaghi.

Additional information

M. Palhang and N. Gheissari contributed equally to this paper.

About this article

Cite this article

Razzaghi, P., Palhang, M. & Gheissari, N. A new invariant descriptor for action recognition based on spherical harmonics. Pattern Anal Applic 16, 507–518 (2013). https://doi.org/10.1007/s10044-012-0274-x

  • DOI: https://doi.org/10.1007/s10044-012-0274-x