Fusing shape and spatio-temporal features for depth-based dynamic hand gesture recognition

Zheng, Jinqing; Feng, Zhiyong; Xu, Chao; Hu, Jing; Ge, Weimin

doi:10.1007/s11042-016-3988-8

Published: 28 September 2016

Multimedia Tools and Applications, volume 76, pages 20525–20544 (2017)

Jinqing Zheng¹, Zhiyong Feng², Chao Xu², Jing Hu¹ & Weimin Ge¹

Abstract

Hand gesture recognition has many practical applications, including human-computer interfaces. Many depth-based features have been proposed for dynamic hand gesture recognition. However, performance remains unsatisfactory because these features cannot efficiently capture both effective shape information and the detailed variation of hands in the spatial and temporal domains. In this paper, we propose a new descriptor, DLEH², for depth-based dynamic hand gesture recognition. It is designed around the characteristics of dynamic hand gestures and fuses simple shape features with spatio-temporal features of depth sequences. For shape information, depth motion maps (DMMs) are first employed to obtain the 3D structure and shape of the hands. To enhance critical shape cues, the local texture and edge information of the three DMMs of a hand gesture sequence are captured using the DLE descriptor. However, DMMs compress the temporal information of a depth sequence into the spatial domain, which to some degree loses cues that are critical for discriminating temporal sequences. Simple but effective spatio-temporal features, HOG², are therefore concatenated with DLE to compensate for the temporal information lost during DMM generation and to capture the detailed spatial and temporal variation of hands. Experimental results on two public benchmark datasets, 99.10% on the MSRGesture3D dataset and 98.43% on the SKIG dataset, show that the proposed fusion scheme outperforms state-of-the-art methods.
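As a rough illustration of two ideas in the abstract, the sketch below accumulates absolute inter-frame differences into a front-view depth motion map and fuses two feature vectors by concatenation. It is not the authors' implementation: the function names, the omission of the side and top projections, and the L2 normalisation before concatenation are assumptions made for this example, and the DLE and HOG² descriptors themselves are not reproduced.

```python
import numpy as np


def front_depth_motion_map(depth_seq):
    """Sum absolute differences between consecutive frames of a (T, H, W) depth sequence.

    Only the front view is handled here; the side and top projections that
    make up the other two DMMs are left out of this sketch.
    """
    seq = np.asarray(depth_seq, dtype=np.float64)
    if seq.ndim != 3 or seq.shape[0] < 2:
        raise ValueError("expected a (T, H, W) sequence with at least two frames")
    # np.diff along axis 0 yields frame[t + 1] - frame[t] for every t.
    return np.abs(np.diff(seq, axis=0)).sum(axis=0)


def fuse_features(shape_feat, temporal_feat):
    """Concatenate a shape vector and a spatio-temporal vector.

    Each vector is L2-normalised first (an assumption of this sketch) so
    that neither block dominates the fused descriptor purely by scale.
    """
    def l2_normalise(vec):
        vec = np.asarray(vec, dtype=np.float64).ravel()
        norm = np.linalg.norm(vec)
        return vec / norm if norm > 0 else vec

    return np.concatenate([l2_normalise(shape_feat), l2_normalise(temporal_feat)])
```

In a full pipeline, the first argument of the hypothetical `fuse_features` would be a texture and edge descriptor computed on the DMMs, and the second a spatio-temporal descriptor computed on the raw sequence; the fused vector would then be passed to a classifier.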

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (no. 61304262).

Author information

Authors and Affiliations

School of Computer Science and Technology, Tianjin University, Tianjin, 300072, China
Jinqing Zheng, Jing Hu & Weimin Ge
School of Computer Software, Tianjin University, Tianjin, 300072, China
Zhiyong Feng & Chao Xu

Corresponding author

Correspondence to Jinqing Zheng.

About this article

Cite this article

Zheng, J., Feng, Z., Xu, C. et al. Fusing shape and spatio-temporal features for depth-based dynamic hand gesture recognition. Multimed Tools Appl 76, 20525–20544 (2017). https://doi.org/10.1007/s11042-016-3988-8

Received: 27 March 2016
Revised: 03 August 2016
Accepted: 16 September 2016
Published: 28 September 2016
Issue Date: October 2017
DOI: https://doi.org/10.1007/s11042-016-3988-8
