Abstract
Human action recognition is an active research topic; however, changes in shape, high variability in appearance, dynamic backgrounds, potential occlusions across different actions, and the imaging limitations of 2D sensors make it difficult. To address these problems, we focus on the depth channel and on the fusion of different features. We first extract different features from depth image sequences, and then propose a multi-feature mapping and dictionary learning model (MMDLM) to discover the relationship between these features, in which two dictionaries and a feature mapping function are learned simultaneously. The dictionaries characterize the structural information of the different features, while the feature mapping function acts as a regularization term that reveals the intrinsic relationship between the two features. Large-scale experiments on two public depth datasets, MSRAction3D and DHA, show that the performance of the different depth features varies considerably, but that the features are complementary. Further, feature fusion by MMDLM is efficient and effective on both datasets, with performance comparable to state-of-the-art methods.
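
The abstract does not state the MMDLM objective, so the following is only a minimal sketch of the general idea of learning two dictionaries and a mapping between their codes at the same time. It assumes a simplified objective with squared penalties in place of any sparsity term, so that every block update has a closed form: ||X1 - D1 A1||² + ||X2 - D2 A2||² + gamma ||A2 - W A1||² + lam (||A1||² + ||A2||²) + mu ||W||² + nu (||D1||² + ||D2||²). The function names, the parameters (`gamma`, `lam`, `mu`, `nu`, `k`) and the alternating update order are illustrative assumptions, not the authors' formulation.

```python
import numpy as np


def objective(X1, X2, D1, D2, A1, A2, W, gamma, lam, mu, nu):
    """Value of the assumed (simplified) coupled objective."""
    fit = np.sum((X1 - D1 @ A1) ** 2) + np.sum((X2 - D2 @ A2) ** 2)
    mapping = gamma * np.sum((A2 - W @ A1) ** 2)
    reg = (lam * (np.sum(A1 ** 2) + np.sum(A2 ** 2))
           + mu * np.sum(W ** 2)
           + nu * (np.sum(D1 ** 2) + np.sum(D2 ** 2)))
    return fit + mapping + reg


def fit_coupled_dictionaries(X1, X2, k=8, gamma=1.0, lam=0.1, mu=0.1,
                             nu=0.1, n_iter=20, seed=0):
    """Alternating minimization over codes, dictionaries and mapping.

    X1: (d1, n) and X2: (d2, n) hold two feature types for the same
    n samples, one sample per column. Hypothetical helper, not the
    paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n = X1.shape[1]
    D1 = rng.standard_normal((X1.shape[0], k))
    D2 = rng.standard_normal((X2.shape[0], k))
    A1 = rng.standard_normal((k, n))
    A2 = rng.standard_normal((k, n))
    W = np.eye(k)
    I = np.eye(k)
    history = []
    for _ in range(n_iter):
        # Codes of feature 1: data fit plus mapping term, ridge penalty.
        A1 = np.linalg.solve(D1.T @ D1 + gamma * W.T @ W + lam * I,
                             D1.T @ X1 + gamma * W.T @ A2)
        # Codes of feature 2: pulled toward the mapped codes W @ A1.
        A2 = np.linalg.solve(D2.T @ D2 + (gamma + lam) * I,
                             D2.T @ X2 + gamma * W @ A1)
        # Dictionaries: ridge-regularized least squares per feature.
        D1 = np.linalg.solve(A1 @ A1.T + nu * I, A1 @ X1.T).T
        D2 = np.linalg.solve(A2 @ A2.T + nu * I, A2 @ X2.T).T
        # Mapping between the two code spaces.
        W = np.linalg.solve(gamma * A1 @ A1.T + mu * I,
                            gamma * A1 @ A2.T).T
        history.append(objective(X1, X2, D1, D2, A1, A2, W,
                                 gamma, lam, mu, nu))
    return D1, D2, W, history
```

Because each step exactly minimizes the assumed objective over one block of variables, the recorded objective values should not increase from one sweep to the next. A sparsity-inducing penalty on the codes would need an iterative solver in place of the two closed-form code updates.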

Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (61202168, 61201234, 61100124, 21106095), the grant of Elite Scholar Program of Tianjin University, the grant of Introducing Talents to Tianjin Normal University (5RL123), the grant of Introduction of One Thousand High-level Talents in 3 years in Tianjin, Tianjin Research Program of Application Foundation and Advanced Technology (14JCZDJC31700, 13JCQNJC0040 and 20120802).
Cite this article
Gao, Z., Zhang, H., Liu, A.A. et al. Human action recognition on depth dataset. Neural Comput & Applic 27, 2047–2054 (2016). https://doi.org/10.1007/s00521-015-2002-0
DOI: https://doi.org/10.1007/s00521-015-2002-0