Improving articulated hand pose detection for static finger sign recognition in RGB-D images

Multimedia Tools and Applications

Abstract

With the emergence of consumer RGB-D sensors, discriminative modeling has been shown to perform well in estimating human body pose. However, articulated hand pose estimation remains a challenging problem, mostly because of the hand's high flexibility, occlusions, noisy data, and the small area of the fingertips. In this paper, we present an efficient discriminative scheme to improve the performance of hand pose estimation from a single depth image. The proposed scheme is inspired by the decision forest-based framework, but with several well-motivated modifications. Specifically, we propose a method to estimate the 2D in-plane orientation of the hand, which is then used to make the depth comparison features invariant to in-plane rotation. Subsequently, we investigate the use of random decision forests (RDF) and the mean shift algorithm to produce a preliminary prediction of the hand parts and joint locations. Based on this preliminary prediction, an adaptive spatial clustering method is applied to correct the misclassified regions and to deliver the final estimate of the hand pose. Along with the proposed scheme, we further develop a new set of highly distinctive features for static finger sign recognition by using the estimated hand pose configurations and RGB information. The proposed features are straightforward and effectively capture different aspects of hand pose, such as the links from each joint to its closest joints and the orientation of each hand part. Extensive experiments on several challenging datasets demonstrate that our approach, compared with decision forest-based methods, provides more precise estimates of hand pose (with up to 21% improvement in joint localization accuracy) and efficiently recognizes more complex static finger signs (93.85% mean recognition accuracy on a challenging dataset of 34 finger signs). Our approach is also robust to variations in illumination, inter-hand occlusion, scale, and rotation.
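To make the idea of rotation-invariant depth comparison features concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the in-plane orientation is estimated, so the sketch assumes a simple PCA of the hand mask. It also assumes the depth comparison form commonly used with decision forests: the difference between two depth probes whose offsets are scaled by the depth at the pixel. The function names, the `background` value, and the offset convention are all hypothetical.

```python
import numpy as np

def estimate_in_plane_angle(mask):
    """Angle (radians) of the principal axis of the foreground pixels.

    Assumption: PCA on pixel coordinates. The result is ambiguous by
    180 degrees, because a principal axis has no preferred direction.
    """
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    pts -= pts.mean(axis=0)                      # center the point cloud
    cov = pts.T @ pts / len(pts)                 # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]       # direction of largest variance
    return float(np.arctan2(major[1], major[0]))

def depth_feature(depth, pixel, u, v, angle, background=10.0):
    """Depth comparison feature with offsets rotated by `angle`.

    `pixel` is (x, y); `u` and `v` are 2D offsets in pixels at unit depth.
    Offsets are rotated by the estimated hand orientation and divided by
    the depth at `pixel`, so the response does not depend on in-plane
    rotation or on distance to the sensor. Probes that fall outside the
    image return the (hypothetical) `background` depth.
    """
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    x, y = pixel
    d = depth[y, x]

    def probe(offset):
        ox, oy = rot @ np.asarray(offset, dtype=float) / d
        px, py = int(round(x + ox)), int(round(y + oy))
        if 0 <= py < depth.shape[0] and 0 <= px < depth.shape[1]:
            return depth[py, px]
        return background

    return probe(u) - probe(v)
```

In a pipeline of this kind, `estimate_in_plane_angle` would be called once per segmented hand, and the resulting angle passed to every `depth_feature` evaluation inside the forest. Resolving the 180 degree ambiguity (for example from the wrist position) is left out of the sketch.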



Acknowledgements

The authors would like to thank the associate editors and the anonymous reviewers for their valuable and insightful comments and suggestions, which have contributed greatly to improving the content and presentation of this article. This work was supported by the “Centre National pour la Recherche Scientifique et Technique (CNRST)”, funded by the Moroccan government under Grant no. 14UIZ2015.

Author information

Corresponding author

Correspondence to Abdessamad Elboushaki.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Elboushaki, A., Hannane, R., Afdel, K. et al. Improving articulated hand pose detection for static finger sign recognition in RGB-D images. Multimed Tools Appl 79, 28925–28969 (2020). https://doi.org/10.1007/s11042-020-09370-y

  • DOI: https://doi.org/10.1007/s11042-020-09370-y
