Abstract
Human action recognition from RGB-D cameras has recently become a major field of research. Previous action and gesture recognition methods have focused mainly on improving accuracy, leaving room to improve computational efficiency. This paper introduces an efficient dimensionality reduction technique and classification mechanism for recognizing actions from depth motion map features. The proposed method employs a recently introduced technique called reduced basis decomposition (RBD), which achieves faster dimensionality reduction through its mechanism of generating compressed basis vectors. RBD combines an offline error-determination step with an online approximation step, and it is faster than PCA/SVD. For classification, this paper employs a Probabilistic Collaborative Representation Classifier (Pro-CRC), which is based on a probabilistic formulation combined with \({l_2}\)-regularization. Together, these methods help achieve state-of-the-art efficiency. In the standard protocol tests on the MSR-Action3D dataset, the proposed method achieved an accuracy of 91.7%, which is higher than that of the existing efficient method. The proposed method also proved effective in the challenging subject-generic test, with a reported accuracy of 89.64%, and achieved an average accuracy of 85.70% in the cross fixed tests, which covered all 252 combinations of the subjects without repetition.
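To illustrate the kind of \({l_2}\)-regularized collaborative representation that the classifier builds on, the following is a minimal sketch in Python with NumPy. It implements plain collaborative representation with a ridge (Tikhonov) solution and class-wise residuals, not the probabilistic Pro-CRC variant used in the paper. The function name `crc_classify`, the parameter `lam`, and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def crc_classify(X_train, y_train, x, lam=1e-3):
    """Simplified l2-regularized collaborative representation classifier.

    X_train: (d, n) array whose columns are training feature vectors.
    y_train: (n,) array of class labels, one per column.
    x:       (d,) query feature vector.
    lam:     ridge regularization weight (hypothetical default).
    """
    n = X_train.shape[1]
    # Closed-form ridge solution: alpha = (X^T X + lam * I)^{-1} X^T x
    gram = X_train.T @ X_train + lam * np.eye(n)
    alpha = np.linalg.solve(gram, X_train.T @ x)

    # Assign the class whose training columns best reconstruct the query.
    classes = np.unique(y_train)
    residuals = []
    for c in classes:
        mask = y_train == c
        reconstruction = X_train[:, mask] @ alpha[mask]
        residuals.append(np.linalg.norm(x - reconstruction))
    return classes[int(np.argmin(residuals))]

# Toy data (illustrative only): two classes living in different subspaces.
X_train = np.array([
    [1.0, 0.9, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.9],
    [0.0, 0.0, 0.0, 0.1],
])
y_train = np.array([0, 0, 1, 1])

query_a = np.array([1.0, 0.05, 0.0, 0.0])
query_b = np.array([0.0, 0.0, 1.0, 0.05])
label_a = crc_classify(X_train, y_train, query_a)
label_b = crc_classify(X_train, y_train, query_b)
```

In a pipeline like the one described in the abstract, the columns of `X_train` would presumably be dimensionality-reduced depth motion map features. The ridge system is solved once per query here for clarity; the matrix `(X^T X + lam * I)^{-1} X^T` depends only on the training data and could be precomputed.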
About this article
Cite this article
Arunraj, M., Srinivasan, A. & Vimala Juliet, A. Online action recognition from RGB-D cameras based on reduced basis decomposition. J Real-Time Image Proc 17, 341–356 (2020). https://doi.org/10.1007/s11554-018-0778-8
DOI: https://doi.org/10.1007/s11554-018-0778-8