Abstract
Human activity recognition in RGB-D videos has been an active research topic during the last decade. However, only a few efforts have been made to recognize human activity in RGB-D videos in which several performers act simultaneously. In this paper we introduce such a challenging dataset, with several performers carrying out activities at the same time, and present a novel method for recognizing human activities performed simultaneously in the same video. The proposed method aims to capture the motion information of the whole video by producing a dynamic image corresponding to the input video. We use two parallel ResNet-101 architectures to produce the dynamic images for the RGB video and the depth video separately. A dynamic image contains only the motion information of the whole frame, which is the main cue for analyzing the motion of the performer during an action; dynamic images therefore aid recognition by concentrating only on the motion information appearing in the frames. The two dynamic images are passed through a fully connected layer for activity classification. The proposed dynamic image reduces the complexity of the recognition process by extracting a sparse matrix from a video while preserving the motion information required for activity recognition, and produces results comparable to the state-of-the-art.
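To illustrate the core idea, the sketch below computes a dynamic image with the simplified approximate rank pooling of Bilen et al.: a fixed, order-sensitive weighted sum of the frames, with weights alpha_t = 2t - T - 1. This is an assumption-laden simplification for illustration only, not the paper's full pipeline (which trains ResNet-101 streams end-to-end on the RGB and depth dynamic images).

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a list of frames (H x W x C arrays) into one 'dynamic image'
    via simplified approximate rank pooling: a weighted sum whose weights
    alpha_t = 2t - T - 1 (t = 1..T) encode the temporal order of the frames."""
    T = len(frames)
    # Weights are negative for early frames, positive for late ones, and sum to 0,
    # so static background largely cancels and motion dominates the result.
    weights = np.array([2 * t - T - 1 for t in range(1, T + 1)], dtype=np.float64)
    stacked = np.stack(frames).astype(np.float64)        # shape (T, H, W, C)
    di = np.tensordot(weights, stacked, axes=1)          # shape (H, W, C)
    # Rescale to [0, 255] so the result can be fed to an image CNN.
    di -= di.min()
    if di.max() > 0:
        di *= 255.0 / di.max()
    return di.astype(np.uint8)
```

In the method described above, one such image would be produced per modality (RGB and depth) and each sent through its own ResNet-101 stream before the fully connected classification layer.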




Acknowledgements
The authors wish to acknowledge the financial support provided by the Science and Engineering Research Board (SERB), Government of India, through project grant ECR/2016/00652, and to thank the NVIDIA GPU grant team for providing the graphics card used to perform the experiments for this work.
Cite this article
Mukherjee, S., Anvitha, L. & Lahari, T.M. Human activity recognition in RGB-D videos by dynamic images. Multimed Tools Appl 79, 19787–19801 (2020). https://doi.org/10.1007/s11042-020-08747-3