Abstract
Human action recognition is an important problem in computer vision. Although most existing solutions achieve good accuracy, they are often overly complex and computationally expensive, which hinders practical application. To address this, we combine a time-series representation of the silhouette with Symbolic Aggregate approXimation (SAX), a combination we refer to as SAX-Shapes. Given an action sequence, the silhouette of the actor is extracted from every frame and transformed into a time series. Each time series is then efficiently converted into a symbolic vector, its SAX representation. The set of all these SAX vectors (the SAX-Shape) represents the action. We propose a rotation-invariant distance function, to be used by a random forest algorithm, to perform the recognition. The proposed method requires only the silhouettes of actors and is validated on two public datasets. Its accuracy is comparable to that of related work, and it performs well even under varying rotation.
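The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the silhouette is turned into a time series by a centroid-distance signature (the abstract does not say which signature is used), the SAX alphabet has four symbols, and rotation invariance is approximated by taking the minimum SAX lookup-table distance over all circular shifts of one word. The function names are hypothetical.

```python
import numpy as np

# Breakpoints splitting N(0, 1) into 4 equiprobable regions (assumed alphabet size 4).
BREAKPOINTS = np.array([-0.6745, 0.0, 0.6745])


def contour_to_series(contour):
    """Assumed signature: distance of each contour point to the shape centroid."""
    contour = np.asarray(contour, dtype=float)
    centroid = contour.mean(axis=0)
    return np.linalg.norm(contour - centroid, axis=1)


def sax_word(series, n_segments):
    """Z-normalise, reduce with PAA, then map segment means to integer symbols."""
    series = np.asarray(series, dtype=float)
    std = series.std()
    z = (series - series.mean()) / std if std > 0 else np.zeros_like(series)
    # PAA: mean of each of n_segments nearly equal-length chunks.
    paa = np.array([chunk.mean() for chunk in np.array_split(z, n_segments)])
    # Symbol k means the segment mean falls in the k-th equiprobable region.
    return np.searchsorted(BREAKPOINTS, paa)


def symbol_distance(a, b):
    """SAX lookup-table distance: zero for equal or adjacent symbols."""
    lo, hi = np.minimum(a, b), np.maximum(a, b)
    gap = BREAKPOINTS[np.clip(hi - 1, 0, 2)] - BREAKPOINTS[np.clip(lo, 0, 2)]
    return np.where(hi - lo <= 1, 0.0, gap)


def word_distance(word_a, word_b):
    """Euclidean combination of per-symbol distances (length scaling omitted)."""
    return float(np.sqrt(np.sum(symbol_distance(word_a, word_b) ** 2)))


def rotation_invariant_distance(word_a, word_b):
    """Assumed rotation handling: minimum distance over circular shifts of word_b."""
    return min(
        word_distance(word_a, np.roll(word_b, shift))
        for shift in range(len(word_b))
    )
```

In this sketch, rotating a silhouette changes the starting point of its contour trace, which shows up as a circular shift of the series; shifting the SAX word compensates for this only at the granularity of one segment. A per-frame word computed this way could be collected over the sequence to form the set of SAX vectors that the abstract calls the SAX-Shape.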
Acknowledgements
This research is funded by the University of Sharjah (Project 120227).
Cite this article
Junejo, I.N., Junejo, K.N. & Aghbari, Z.A. Silhouette-based human action recognition using SAX-Shapes. Vis Comput 30, 259–269 (2014). https://doi.org/10.1007/s00371-013-0842-0
DOI: https://doi.org/10.1007/s00371-013-0842-0