Online action recognition from RGB-D cameras based on reduced basis decomposition

  • Original Research Paper
Journal of Real-Time Image Processing

Abstract

Human action recognition from RGB-D cameras has recently become a major field of research. Earlier action and gesture recognition methods prioritized accuracy, which leaves room to improve computational efficiency. This paper introduces an efficient dimensionality reduction technique and classification mechanism for recognizing actions from depth motion map features. The proposed method employs reduced basis decomposition (RBD), a recently introduced technique that achieves faster dimensionality reduction through its mechanism of generating compressed basis vectors. RBD combines an offline error-determination step with an online approximation step, and it is faster than PCA/SVD. For classification, the paper employs a Probabilistic Collaborative Representation Classifier (Pro-CRC), which works on a probabilistic basis in conjunction with \({l_2}\)-regularization. Together, these methods achieve state-of-the-art efficiency. In the standard protocol tests on the MSR-Action3D dataset, the proposed method achieved an accuracy of 91.7%, which is higher than that of the currently efficient method. The proposed method also proved effective in the challenging subject-generic test, with a reported accuracy of 89.64%, and reached an average accuracy of 85.70% in the cross fixed tests, which covered 252 combinations of all the subjects without repetition.
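To make the idea of generating compressed basis vectors concrete, the following is a minimal sketch of a greedy basis construction, assuming the data are given as a list of equal-length feature columns. It is a simplified illustration in the spirit of RBD, not the published algorithm: it measures the approximation error of every column exactly at each step, whereas RBD relies on its own efficient error-determination mechanism. The names `greedy_basis`, `reconstruct`, `max_dim`, and `tol` are hypothetical and chosen only for this example.

```python
from math import sqrt


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def greedy_basis(columns, max_dim, tol=1e-10):
    """Greedily build an orthonormal basis from the data columns.

    At each step the column that is currently approximated worst is
    normalized and added to the basis (Gram-Schmidt style). This is an
    illustrative simplification, not the published RBD procedure.
    """
    residuals = [list(c) for c in columns]  # what the basis cannot explain yet
    basis = []
    for _ in range(max_dim):
        norms = [sqrt(dot(r, r)) for r in residuals]
        worst = max(range(len(residuals)), key=norms.__getitem__)
        if norms[worst] <= tol:
            break  # every column is already represented well enough
        q = [x / norms[worst] for x in residuals[worst]]
        basis.append(q)
        # Remove the component along q from every residual.
        for r in residuals:
            c = dot(q, r)
            for i in range(len(r)):
                r[i] -= c * q[i]
    # Low-dimensional coordinates of each original column in the basis.
    coeffs = [[dot(q, col) for col in columns] for q in basis]
    return basis, coeffs


def reconstruct(basis, coeffs, j):
    """Approximate column j from its reduced coordinates (basis must be non-empty)."""
    n = len(basis[0])
    return [sum(basis[k][i] * coeffs[k][j] for k in range(len(basis)))
            for i in range(n)]


if __name__ == "__main__":
    # Four toy feature columns that span only a two-dimensional subspace.
    cols = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 3.0, 0.0]]
    basis, coeffs = greedy_basis(cols, max_dim=3)
    print(len(basis), reconstruct(basis, coeffs, 2))
```

In this toy case the four columns lie in a two-dimensional subspace, so the loop stops after two basis vectors and each column is then described by two coefficients instead of three values. A classifier would operate on those reduced coordinates.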

Author information

Corresponding author

Correspondence to Muniandi Arunraj.

Cite this article

Arunraj, M., Srinivasan, A. & Vimala Juliet, A. Online action recognition from RGB-D cameras based on reduced basis decomposition. J Real-Time Image Proc 17, 341–356 (2020). https://doi.org/10.1007/s11554-018-0778-8

  • DOI: https://doi.org/10.1007/s11554-018-0778-8
