Abstract
This paper proposes the power difference template, a new spatial-temporal representation for action recognition. Spatial power features are first extracted by applying Gaussian convolution to image gradients, transforming between the logarithmic and exponential domains. Using a forward–backward frame power difference method, we then present the normalized projection histogram (NPH), which characterizes the spatial features of a segmented action by normalizing the histogram of its 2D horizontal–vertical projections. Furthermore, from the perspective of energy conservation, motion kinetic velocity (MKV) is introduced to complement the NPH by representing the temporal relationships of the power features, under the assumption that variations in power are produced by motion in the form of kinetic energy. The power difference template, fusing NPH and MKV, is then integrated into a bag-of-words model for training and testing under a support vector machine framework. Experiments on the KTH, UCF Sports, UCF101, and HMDB datasets demonstrate the effectiveness of the proposed algorithm.
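To make the pipeline in the abstract concrete, the following is a minimal numpy-only sketch of two of its stages: a spatial power map (gradient magnitude computed in the logarithmic domain, Gaussian-smoothed, then mapped back through the exponential) and a normalized projection histogram built from the absolute frame-to-frame power difference. All function names, the smoothing parameter `sigma`, and the exact form of the log/exp transform are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def gaussian_smooth(img, sigma):
    """Separable 1D Gaussian smoothing along both axes (numpy only)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    tmp = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, tmp)

def power_map(frame, sigma=2.0, eps=1e-6):
    """Hypothetical spatial power feature: gradient magnitude in the
    log domain, Gaussian-smoothed, mapped back via the exponential."""
    log_img = np.log(frame.astype(np.float64) + eps)
    gy, gx = np.gradient(log_img)
    grad_mag = np.hypot(gx, gy)
    return np.exp(gaussian_smooth(grad_mag, sigma)) - 1.0

def normalized_projection_histogram(prev_frame, next_frame, sigma=2.0):
    """Frame power difference reduced to L1-normalized horizontal
    and vertical projection histograms (one sketch of an NPH)."""
    diff = np.abs(power_map(next_frame, sigma) - power_map(prev_frame, sigma))
    h_proj = diff.sum(axis=0)  # projection onto the horizontal axis (per column)
    v_proj = diff.sum(axis=1)  # projection onto the vertical axis (per row)
    hist = np.concatenate([h_proj, v_proj])
    total = hist.sum()
    return hist / total if total > 0 else hist

rng = np.random.default_rng(0)
f0 = rng.random((48, 64))
f1 = rng.random((48, 64))
nph = normalized_projection_histogram(f0, f1)
print(nph.shape)  # one bin per column plus one per row
```

In a full system, such per-frame-pair histograms would be quantized into visual words for the bag-of-words/SVM stage the abstract describes; the temporal MKV component is omitted here.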
Acknowledgements
This research is supported by the National Natural Science Foundation of China (Grant No. 661273339). The authors would also like to thank Berthold K. P. Horn for his insightful ideas during the first author's visiting study at MIT CSAIL.
Cite this article
Wang, L., Li, R. & Fang, Y. Power difference template for action recognition. Machine Vision and Applications 28, 463–473 (2017). https://doi.org/10.1007/s00138-017-0848-0