Abstract
Over the past two decades, vision-based dynamic hand gesture recognition (HGR) has made significant progress and has been widely adopted in many practical applications. Although the advent of RGB-D cameras and deep learning-based methods provides more feasible solutions for HGR, it remains very challenging to meet the requirements of both high efficiency and high accuracy in real-world HGR systems. In this paper, we propose a novel method that uses the feature covariance matrix for effective and efficient dynamic HGR. We extract a set of local feature vectors that represent local motion patterns and use them to construct the feature covariance matrix efficiently; this matrix also provides a compact representation of a dynamic hand gesture. By tracking hand keypoints in three successive frames and calculating their motion features, our method can be extended to both 2D and 3D dynamic HGR. To evaluate the proposed framework, we perform extensive experiments on three publicly available datasets (one 2D dataset and two 3D datasets). The experimental results demonstrate the effectiveness of the proposed method.
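The idea of summarizing a gesture by the covariance of its local motion features can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: the feature set (position, velocity and acceleration estimated by finite differences over three successive frames) and the function names `motion_features` and `covariance_descriptor` are hypothetical choices made for this example only.

```python
import numpy as np


def motion_features(tracks):
    """Build local feature vectors from tracked keypoints.

    tracks: array of shape (T, K, D) holding K keypoint positions over
    T frames in D dimensions (D = 2 or 3).
    Returns an array of shape ((T - 2) * K, 3 * D): for the middle frame of
    every three-frame window, the position, velocity and acceleration of
    each keypoint. This feature choice is an assumption for illustration.
    """
    tracks = np.asarray(tracks, dtype=float)
    prev, cur, nxt = tracks[:-2], tracks[1:-1], tracks[2:]
    velocity = (nxt - prev) / 2.0          # central difference
    acceleration = nxt - 2.0 * cur + prev  # second-order difference
    feats = np.concatenate([cur, velocity, acceleration], axis=-1)
    return feats.reshape(-1, feats.shape[-1])


def covariance_descriptor(feats):
    """Sample covariance (d x d) of N feature vectors of dimension d."""
    return np.cov(feats, rowvar=False)
```

Under these assumptions, the descriptor size depends only on the feature dimension (6 x 6 for 2D keypoints, 9 x 9 for 3D keypoints), not on the number of frames or keypoints, which is what makes the representation compact.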
Acknowledgements
This work was supported by the National Natural Science Foundation of China (No. 61573151), the Guangdong Natural Science Foundation (No. 2016A030313468), and the Science and Technology Planning Project of Guangdong Province (No. 2017A010101026).
Cite this article
Fang, L., Wu, G., Kang, W. et al. Feature covariance matrix-based dynamic hand gesture recognition. Neural Comput & Applic 31, 8533–8546 (2019). https://doi.org/10.1007/s00521-018-3719-3
DOI: https://doi.org/10.1007/s00521-018-3719-3