Abstract
A novel class-dependent joint weighting method is proposed to mine the key skeletal joints for human action recognition. Existing methods, whether based on deep learning or on hand-crafted features, may not adequately capture the joints that are most relevant to recognizing each action. In the proposed method, for each class of human actions, each joint is weighted according to its temporal variations and its inherent ability to extend or flex. These weights can be used as prior knowledge in skeletal joint-based methods. A novel human action recognition algorithm is also proposed that uses these weights in two different ways. First, for each frame of a skeletal sequence, the histogram of 3D joints is weighted according to the contribution of each joint to the corresponding class of human action. Second, a weighted motion energy function is defined to dynamically divide the temporal pyramid of actions. Experimental results on three benchmark datasets show the efficiency of the proposed weighting method, especially when occlusion occurs.
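To make the two uses of the weights more concrete, the following is a minimal sketch and not the authors' formulation, which the abstract does not give. It assumes that temporal variation is measured as mean frame-to-frame joint displacement, that the weighted motion energy is a weighted sum of per-joint displacements, and that the temporal division places boundaries at equal shares of cumulative energy. The extension/flexion term is omitted, and all function names are hypothetical.

```python
import math
from bisect import bisect_left
from itertools import accumulate


def displacements(seq):
    """Frame-to-frame displacement magnitude of every joint.

    seq: list of frames (at least two); each frame is a list of (x, y, z)
    joint positions. Returns one row per frame transition.
    """
    return [[math.dist(a, b) for a, b in zip(prev, cur)]
            for prev, cur in zip(seq, seq[1:])]


def joint_weights(sequences):
    """Per-class joint weights from temporal variation (assumed measure).

    sequences: all training sequences of ONE action class.
    Returns weights that sum to 1; uniform if no joint moves at all.
    """
    n_joints = len(sequences[0][0])
    totals = [0.0] * n_joints
    for seq in sequences:
        disp = displacements(seq)
        for j in range(n_joints):
            totals[j] += sum(row[j] for row in disp) / len(disp)
    s = sum(totals)
    if s == 0:
        return [1.0 / n_joints] * n_joints
    return [t / s for t in totals]


def weighted_motion_energy(seq, weights):
    """Weighted sum of joint displacements per frame transition (assumed form)."""
    return [sum(w * d for w, d in zip(weights, row))
            for row in displacements(seq)]


def split_by_energy(energy, n_segments):
    """Transition indices at which cumulative energy reaches k/n of its total.

    Assumed splitting rule; if the total energy is zero, every boundary
    falls at index 0.
    """
    cum = list(accumulate(energy))
    targets = [cum[-1] * k / n_segments for k in range(1, n_segments)]
    return [bisect_left(cum, t) for t in targets]
```

Under these assumptions, a joint that stays still throughout a class receives zero weight and so contributes nothing to the energy used for temporal segmentation, which is one way the weights could reduce the influence of irrelevant joints.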
Cite this article
Shabaninia, E., Naghsh-Nilchi, A.R. & Kasaei, S. A weighting scheme for mining key skeletal joints for human action recognition. Multimed Tools Appl 78, 31319–31345 (2019). https://doi.org/10.1007/s11042-019-7740-z
DOI: https://doi.org/10.1007/s11042-019-7740-z