Robust relative attributes for human action recognition

Zhang, Zhong; Wang, Chunheng; Xiao, Baihua; Zhou, Wen; Liu, Shuang

doi:10.1007/s10044-013-0349-3

Industrial and Commercial Application
Published: 11 September 2013

Pattern Analysis and Applications, volume 18, pages 157–171 (2015)

Abstract

High-level semantic features are important for recognizing human actions. Relative attributes, which describe relative relationships, have recently been proposed as one such high-level semantic feature and have shown promising performance. However, their training process is very sensitive to noise, and it is not robust in zero-shot learning. To overcome these drawbacks, we propose a robust learning framework that uses relative attributes for human action recognition. We add Sigmoid and Gaussian envelopes to the loss objective simultaneously, which greatly reduces the influence of outliers during optimization and thus improves accuracy. In addition, we adopt Gaussian mixture models to better fit the distribution of actions in rank score space. Correspondingly, we propose a novel transfer strategy to estimate the parameters of the Gaussian mixture models for unseen classes. We verify our method on three challenging datasets (KTH, UIUC and HOLLYWOOD2), and the experimental results show that it achieves better results than previous methods in both zero-shot classification and the traditional recognition task.
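The remark about outliers can be illustrated with a small sketch. It is not the formulation used in the paper, which the abstract does not give. It only assumes a pairwise ranking setting in which each ordered pair yields a margin, and compares an unbounded hinge loss with a bounded sigmoid-shaped loss. The function names, the `scale` parameter and the margin values are all hypothetical.

```python
import numpy as np

def hinge_loss(margin):
    """Unbounded ranking loss: grows linearly as the margin becomes more negative."""
    return np.maximum(0.0, 1.0 - margin)

def sigmoid_bounded_loss(margin, scale=1.0):
    """Bounded surrogate (assumed form): never exceeds 1, so one bad pair contributes at most 1."""
    return 1.0 / (1.0 + np.exp(scale * margin))

# Hypothetical margins for four ordered pairs; the last one mimics a mislabeled pair.
margins = np.array([2.0, 1.5, 0.5, -20.0])

hinge = hinge_loss(margins)
bounded = sigmoid_bounded_loss(margins)

# Fraction of the total loss that comes from the single outlier pair.
hinge_outlier_share = hinge[-1] / hinge.sum()
bounded_outlier_share = bounded[-1] / bounded.sum()

print("hinge losses:  ", hinge)
print("bounded losses:", bounded)
print("outlier share (hinge):  ", hinge_outlier_share)
print("outlier share (bounded):", bounded_outlier_share)
```

Under the hinge loss the outlier pair accounts for nearly all of the total, whereas the bounded loss caps its contribution at 1, which is the general effect the abstract attributes to the added envelopes.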

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC) under Grants No. 60933010, No. 61172103 and No. 61271429, and by the National High-tech R&D Program of China (863 Program) under Grant No. 2012AA041312.

Author information

Authors and Affiliations

The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, ZhongGuanCun East Rd. 95, Beijing, China
Zhong Zhang, Chunheng Wang, Baihua Xiao, Wen Zhou & Shuang Liu

Corresponding author

Correspondence to Chunheng Wang.

About this article

Cite this article

Zhang, Z., Wang, C., Xiao, B. et al. Robust relative attributes for human action recognition. Pattern Anal Applic 18, 157–171 (2015). https://doi.org/10.1007/s10044-013-0349-3

Received: 16 August 2012
Accepted: 19 August 2013
Published: 11 September 2013
Issue Date: February 2015
DOI: https://doi.org/10.1007/s10044-013-0349-3
