A compact discriminant hierarchical clustering approach for action recognition

Published in: Multimedia Tools and Applications

Abstract

To improve the accuracy of action recognition, a compact discriminant hierarchical clustering approach and a new action recognition framework are proposed. First, on the basis of the low-level features 3D Self-Correlation Histogram of Oriented Gradient in Trajectory (3D_SCHOGT) and 3D Self-Correlation Histogram of Oriented Optical Flow in Trajectory (3D_SCHOOFT), mid-level semantics that are simultaneously pure, representative and discriminative are obtained with the proposed compact discriminant hierarchical clustering approach, which removes singularities, quantitatively evaluates the purity, representativeness and discriminativeness of each cluster, and imposes an additional information-entropy constraint on the clusters to ensure all three properties. Second, by introducing a category constraint, a discriminant classification model, the Category Constraint Latent Support Vector Machine (CC-LSVM), is proposed, which enhances the discriminative ability of the classifier. Finally, to further improve recognition accuracy, a new framework is proposed that feeds the low-level features, mid-level semantics and mid-level semantic self-correlation features into the CC-LSVM classifier through weighted association, makes full use of the category information of actions, and mines the correlations between multi-semantic features and action categories. The resulting accuracies on the Weizmann, KTH, UCF-Sports and YouTube datasets are 100%, 98.83%, 98.67% and 90.73% respectively, outperforming all compared methods. The experiments demonstrate the effectiveness of the proposed compact discriminant hierarchical clustering approach and framework.
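The cluster-selection step summarised in the abstract can be pictured with a short sketch. The Python code below (not from the paper; a minimal sketch under assumed definitions) hierarchically clusters low-level trajectory descriptors and keeps only the clusters that score well on purity, representativeness and discriminativeness while satisfying an entropy cap. The thresholds `purity_min` and `entropy_max`, the concrete scoring formulas, and the helper `encode_video` are illustrative assumptions rather than the authors' formulation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import cdist


def label_entropy(counts):
    """Shannon entropy (bits) of a cluster's label distribution."""
    p = counts / counts.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def select_semantic_clusters(X, y, n_clusters=50,
                             purity_min=0.6, entropy_max=1.0):
    """X: (N, D) low-level descriptors (e.g. rows of 3D_SCHOGT / 3D_SCHOOFT),
    y: (N,) action label of the video each descriptor comes from."""
    Z = linkage(X, method="ward")                     # agglomerative tree
    assign = fcluster(Z, t=n_clusters, criterion="maxclust")

    classes = np.unique(y)
    centres, scores = [], []
    for c in np.unique(assign):
        idx = np.where(assign == c)[0]
        counts = np.array([(y[idx] == k).sum() for k in classes], float)

        purity = counts.max() / counts.sum()          # dominant-class share
        representativeness = len(idx) / len(X)        # relative cluster size
        ranked = np.sort(counts)[::-1]                # class frequencies, descending
        second = ranked[1] if len(ranked) > 1 else 0.0
        discriminativeness = (ranked[0] - second) / counts.sum()

        # Keep clusters that are pure, discriminative and low-entropy.
        if (purity >= purity_min
                and discriminativeness > 0
                and label_entropy(counts) <= entropy_max):
            centres.append(X[idx].mean(axis=0))
            scores.append((purity, representativeness, discriminativeness))
    return np.array(centres), scores


def encode_video(descriptors, centres):
    """Mid-level representation of one video: a soft-assignment histogram of
    its descriptors over the selected cluster centres."""
    d = cdist(descriptors, centres)                   # (n_desc, n_centres)
    hist = np.exp(-d).sum(axis=0)
    return hist / (hist.sum() + 1e-12)
```

In the full pipeline described above, the histograms from `encode_video` would be combined, with learned weights, with the low-level features and passed to the CC-LSVM classifier; a plain linear SVM would be a simpler stand-in for that final stage in this sketch.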


Acknowledgements

This work was supported by the National Natural Science Foundation of China [No. 61072110], the Science and Technology Overall Innovation Project of Shaanxi Province [2013KTZB03-03-03], the Industrial Research Project of Shaanxi Province [2015GY011], and the International Cooperation Projects of Shaanxi Province [2015KW-004, 2016KW-042].

Author information

Correspondence to Ming Tong.

About this article

Cite this article

Tong, M., Tian, W., Wang, H. et al. A compact discriminant hierarchical clustering approach for action recognition. Multimed Tools Appl 77, 7539–7564 (2018). https://doi.org/10.1007/s11042-017-4660-7

