Abstract
This paper presents a method for action recognition based on edge trajectories. First, to exploit long-term motion information for action representation more effectively, we propose to track edge points across video frames to extract spatiotemporal edge trajectories, and we use those derived from edge points on the boundaries of action-related areas to describe actions. Second, in addition to the existing trajectory shape, histogram of oriented gradients, histogram of optical flow and motion boundary histogram descriptors, a new trajectory descriptor named histogram of motion acceleration is introduced; it is computed from the temporal derivative of the optical flow in the spatiotemporal neighborhood centered along a trajectory and describes the temporal relative motion of actions. Finally, encoding the trajectory descriptors with Fisher vectors and predicting action labels with an MKL-based multi-class SVM, we evaluate the proposed approach on seven benchmark datasets, namely KTH, ADL, UT-Interaction, UCF sports, YouTube, HMDB51 and UCF101. The experimental results demonstrate the effectiveness of our method.
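The histogram-of-motion-acceleration idea described above can be illustrated with a short sketch: the acceleration field is the temporal derivative of the optical flow between consecutive frames, quantized by orientation and weighted by magnitude, similar to HOF-style descriptors. This is a hypothetical minimal illustration, not the paper's exact implementation; the bin count, normalization scheme and spatiotemporal cell layout are assumptions.

```python
import numpy as np

def motion_acceleration_histogram(flow_t, flow_t1, num_bins=8):
    """Sketch of a histogram-of-motion-acceleration descriptor.

    flow_t, flow_t1: optical flow fields of shape (H, W, 2) for two
    consecutive frame pairs. Returns an L2-normalized orientation
    histogram of the acceleration (temporal flow derivative) field.
    """
    # Temporal derivative of the optical flow = motion acceleration.
    accel = flow_t1 - flow_t                        # (H, W, 2)
    mag = np.linalg.norm(accel, axis=2)             # per-pixel magnitude
    ang = np.arctan2(accel[..., 1], accel[..., 0])  # orientation in [-pi, pi]

    # Quantize orientation into num_bins bins, weight votes by magnitude.
    bins = ((ang + np.pi) / (2 * np.pi) * num_bins).astype(int) % num_bins
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=num_bins)

    # L2-normalize, as is common for HOG/HOF-style descriptors.
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

In the full pipeline such histograms would be computed per cell of the spatiotemporal volume around each edge trajectory and concatenated before Fisher vector encoding.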
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant No. 61572395) and the Specialized Research Fund for the Doctoral Program of Higher Education (Grant No. 20110201110012).
Wang, X., Qi, C. Action recognition using edge trajectories and motion acceleration descriptor. Machine Vision and Applications 27, 861–875 (2016). https://doi.org/10.1007/s00138-016-0746-x