Dynamic hand gesture recognition using motion pattern and shape descriptors

Xing, Meng; Hu, Jing; Feng, Zhiyong; Su, Yong; Peng, Weilong; Zheng, Jinqing

doi:10.1007/s11042-018-6553-9

Published: 10 September 2018

Volume 78, pages 10649–10672, (2019)
Multimedia Tools and Applications

Meng Xing¹, Jing Hu¹, Zhiyong Feng² (ORCID: orcid.org/0000-0001-8158-7453), Yong Su¹, Weilong Peng¹ & Jinqing Zheng³

Abstract

The key problems in dynamic hand gesture recognition are large intra-class (gesture types, without considering hand configuration) spatial-temporal variability and similar inter-class (gesture types, only considering hand configuration) motion patterns. First, for the intra-class problem, the key is to reduce this spatial-temporal variability. Because averaging improves robustness, we propose a motion pattern descriptor, Time-Wise Histograms of Oriented Gradients (TWHOG), which extracts average spatial-temporal information in the space-time domain from three orthogonal projection views (XY, YT, XT). Second, for similar inter-class motion patterns, an accurate representation of hand configuration is especially important. The differences in detail therefore need to be fully captured, and a shape descriptor can amplify subtle differences. Specifically, we introduce Depth Motion Maps-based Histograms of Oriented Gradients (DMM-HOG) to capture subtle differences in hand configuration between different types of gestures with similar motion patterns. Finally, we concatenate TWHOG and DMM-HOG to form the final feature vector, Time-Shape Histograms of Oriented Gradients (TSHOG), and verify the effectiveness of the concatenation from quantitative and qualitative perspectives. Comparisons with state-of-the-art approaches are conducted on two challenging depth gesture datasets (MSRGesture3D, SKIG). The experimental results show that TSHOG achieves satisfactory performance while keeping a relatively simple model with lower complexity and higher generality.
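The pipeline summarized in the abstract (averaged projection views, a depth-motion-map shape cue, and concatenation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes that each view is obtained by averaging the depth volume along one axis, that the depth motion map is the accumulated absolute frame difference of the front view only, and that a simplified per-cell orientation histogram stands in for full HOG. The function names `orientation_histogram`, `twhog_like`, `dmm_hog_like`, and `tshog_like`, the cell grid, and the bin count are all hypothetical choices made for illustration.

```python
import numpy as np


def orientation_histogram(image, n_bins=9, cells=(4, 4)):
    """Simplified HOG stand-in: per-cell histograms of unsigned gradient
    orientation, weighted by gradient magnitude, L2-normalised as one vector."""
    gy, gx = np.gradient(image.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, pi); clamp guards against rounding to pi.
    angle = np.mod(np.arctan2(gy, gx), np.pi)
    bin_index = np.minimum((angle / np.pi * n_bins).astype(int), n_bins - 1)

    rows = np.array_split(np.arange(image.shape[0]), cells[0])
    cols = np.array_split(np.arange(image.shape[1]), cells[1])
    features = []
    for r in rows:
        for c in cols:
            b = bin_index[np.ix_(r, c)].ravel()
            m = magnitude[np.ix_(r, c)].ravel()
            features.append(np.bincount(b, weights=m, minlength=n_bins))
    vec = np.concatenate(features)
    return vec / (np.linalg.norm(vec) + 1e-12)


def twhog_like(volume):
    """Assumed motion-pattern cue: average the (T, H, W) depth volume along
    each axis to get XY, XT and YT views, then describe each view."""
    xy = volume.mean(axis=0)  # (H, W): average over time
    xt = volume.mean(axis=1)  # (T, W): average over image rows (Y)
    yt = volume.mean(axis=2)  # (T, H): average over image columns (X)
    return np.concatenate([orientation_histogram(v) for v in (xy, xt, yt)])


def dmm_hog_like(volume):
    """Assumed shape cue: front-view depth motion map (accumulated absolute
    frame differences) described with the same orientation histogram."""
    dmm = np.abs(np.diff(volume.astype(np.float64), axis=0)).sum(axis=0)
    return orientation_histogram(dmm)


def tshog_like(volume):
    """Concatenate the motion-pattern and shape descriptors."""
    return np.concatenate([twhog_like(volume), dmm_hog_like(volume)])


rng = np.random.default_rng(0)
clip = rng.random((16, 32, 32))  # synthetic stand-in for a depth clip
feature = tshog_like(clip)
print(feature.shape)  # (576,)
```

With a 4 x 4 cell grid and 9 bins, each view contributes 144 values, so the three averaged views give 432 values and the depth motion map adds 144 more. In practice the concatenated vector would be passed to a classifier; the choice of classifier and any dimensionality reduction are outside what the abstract specifies.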

Author information

Authors and Affiliations

School of Computer Science and Technology, Tianjin University, Tianjin, 300072, China
Meng Xing, Jing Hu, Yong Su & Weilong Peng
School of Computer Software, Tianjin University, Tianjin, 300072, China
Zhiyong Feng
Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Jinqing Zheng

Corresponding author

Correspondence to Zhiyong Feng.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xing, M., Hu, J., Feng, Z. et al. Dynamic hand gesture recognition using motion pattern and shape descriptors. Multimed Tools Appl 78, 10649–10672 (2019). https://doi.org/10.1007/s11042-018-6553-9

Received: 20 October 2017
Revised: 13 August 2018
Accepted: 15 August 2018
Published: 10 September 2018
Issue Date: April 2019
DOI: https://doi.org/10.1007/s11042-018-6553-9
