Online action recognition from RGB-D cameras based on reduced basis decomposition

Arunraj, Muniandi; Srinivasan, Andy; Vimala Juliet, A.

doi:10.1007/s11554-018-0778-8

Original Research Paper
Published: 05 May 2018

Journal of Real-Time Image Processing
Volume 17, pages 341–356 (2020)

Muniandi Arunraj¹,
Andy Srinivasan² &
A. Vimala Juliet¹

Abstract

Human action recognition from RGB-D cameras has recently become a major field of research. Previous action and gesture recognition methods have prioritized accuracy, which leaves room to improve computational efficiency. This paper introduces an efficient dimensionality reduction technique and classification mechanism for recognizing actions from depth motion map features. The proposed work employs a recently introduced technique, reduced basis decomposition (RBD), which achieves faster dimensionality reduction through its mechanism for generating compressed basis vectors. RBD combines an offline error-determination stage with an online approximation mechanism, and it is faster than PCA/SVD. For classification, this paper employs a Probabilistic Collaborative Representation Classifier (Pro-CRC), which works on a probabilistic basis in connection with \(l_2\)-regularization. Together, these methods achieve state-of-the-art efficiency. In the standard protocol tests on the MSR-Action3D dataset, the proposed method achieved an accuracy of 91.7%, which is better than that of the currently efficient method. The proposed method also proved effective in the challenging subject-generic test, with a reported accuracy of 89.64%, and reached an average accuracy of 85.70% in the cross fixed tests, which included 252 combinations of all the subjects without repetition.
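To make the pipeline described in the abstract more concrete, the following is a minimal sketch of a "reduce, then classify by collaborative representation" workflow. It rests on several assumptions and is not the authors' implementation: a truncated SVD stands in for RBD (the abstract names PCA/SVD as the slower baseline), the classifier is a plain \(l_2\)-regularized collaborative representation classifier rather than Pro-CRC, and the function names `fit_basis`, `crc_fit` and `crc_predict` as well as the parameter `lam` are hypothetical.

```python
import numpy as np

def fit_basis(X, k):
    """Return a (d, k) orthonormal basis from a truncated SVD of X.

    X holds one training feature vector per column. Assumption: this is a
    plain SVD stand-in for illustration, not reduced basis decomposition.
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

def crc_fit(Z, lam=1e-3):
    """Precompute P = (Z^T Z + lam * I)^-1 Z^T for l2-regularized coding.

    Z is the (k, n) matrix of reduced training features; lam is a
    hypothetical regularization weight.
    """
    n = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(n), Z.T)

def crc_predict(Z, labels, P, z):
    """Assign z to the class whose training columns reconstruct it best."""
    labels = np.asarray(labels)
    alpha = P @ z  # coding coefficients over all training samples
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(z - Z[:, labels == c] @ alpha[labels == c])
        for c in classes
    ]
    return classes[int(np.argmin(residuals))]
```

In this sketch the basis and the projection matrix `P` are computed once from the training data, so classifying a new sample costs only one projection (`B.T @ x`), one matrix-vector product and a handful of per-class residuals. That split between an expensive offline stage and a cheap online stage is the kind of efficiency the abstract emphasizes.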

Author information

Authors and Affiliations

Department of EIE, SRM Institute of Science and Technology, Kattankulathur Campus, Kancheepuram District, Tamilnadu, 603203, India
Muniandi Arunraj & A. Vimala Juliet
Department of EIE, Valliammai Engineering College, Kattankulathur Campus, Kancheepuram District, Tamilnadu, 603203, India
Andy Srinivasan

Corresponding author

Correspondence to Muniandi Arunraj.

About this article

Cite this article

Arunraj, M., Srinivasan, A. & Vimala Juliet, A. Online action recognition from RGB-D cameras based on reduced basis decomposition. J Real-Time Image Proc 17, 341–356 (2020). https://doi.org/10.1007/s11554-018-0778-8

Received: 10 May 2017
Accepted: 23 April 2018
Published: 05 May 2018
Issue Date: April 2020
DOI: https://doi.org/10.1007/s11554-018-0778-8
