A selective region-based detection and tracking approach towards the recognition of dynamic bare hand gesture using deep neural network

  • Regular Paper
  • Multimedia Systems

Abstract

Variations in pose, scale, speed, illumination, and occlusion make it difficult to detect the gesture object in the first frame and to track it in a dynamic environment, which poses a challenge for gesture-based human–computer interaction. Kernel-based operation on the first frame requires additional computational resources for detecting the bare hand. Therefore, a region-based detection approach is proposed to detect bare hand-like regions using skin color and movement information. Using transfer learning, AlexNet is used to detect the bare hand within this region of interest. To track the hand in successive frames, a Detection and Tracking (DaT) algorithm is proposed. The smoothed trajectory is used to train a separate deep neural network to recognize 60 isolated dynamic gestures. Average accuracies of 97.80% and 96.20% are achieved for bare hand detection on the Oxford hand database and the NITS hand gesture VIII database, respectively. Experimental observations indicate that the proposed method is computationally efficient compared with existing models. For the recognition of 60 gestures, an average accuracy of 96.48% is achieved. The proposed system provides a relative improvement of 17% over the base models.
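The abstract does not give the details of the region-based step, so the following is only a minimal sketch of one common way to combine skin color and movement information into a candidate region. It is not the authors' implementation. The YCbCr skin thresholds, the frame-differencing threshold, and all function names (`skin_mask`, `motion_mask`, `candidate_region`) are assumptions chosen for illustration.

```python
import numpy as np


def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 uint8 RGB image to float Y, Cb, Cr planes (BT.601, full range)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def skin_mask(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """Boolean mask of skin-like pixels. The Cb/Cr ranges are assumed, not from the paper."""
    _, cb, cr = rgb_to_ycbcr(rgb)
    return (
        (cb >= cb_range[0]) & (cb <= cb_range[1])
        & (cr >= cr_range[0]) & (cr <= cr_range[1])
    )


def motion_mask(prev_rgb, curr_rgb, diff_threshold=25.0):
    """Boolean mask of pixels whose luma changed between frames (assumed threshold)."""
    y_prev, _, _ = rgb_to_ycbcr(prev_rgb)
    y_curr, _, _ = rgb_to_ycbcr(curr_rgb)
    return np.abs(y_curr - y_prev) > diff_threshold


def candidate_region(prev_rgb, curr_rgb):
    """Bounding box (x0, y0, x1, y1) of pixels that are both skin-like and moving.

    Returns None when no pixel satisfies both cues. x1 and y1 are exclusive.
    """
    mask = skin_mask(curr_rgb) & motion_mask(prev_rgb, curr_rgb)
    if not mask.any():
        return None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1


if __name__ == "__main__":
    skin_rgb = (200, 150, 120)  # an assumed skin-like color for the demo
    prev = np.zeros((60, 80, 3), dtype=np.uint8)
    prev[40:50, 5:15] = skin_rgb   # static skin-colored patch (e.g. a face)
    curr = prev.copy()
    curr[10:20, 30:50] = skin_rgb  # skin-colored patch that appears (moving hand)
    print(candidate_region(prev, curr))
```

In a pipeline of the kind the abstract describes, only the crop inside the returned box would be passed to the fine-tuned classifier, which is where the computational saving over scanning the whole first frame would come from. A static skin-colored area, such as a face that does not move, is rejected by the motion cue in this sketch.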


Acknowledgements

This research work is funded by the SERB IMPRINT project (IMP/2018/000098). We thank the Department of Electronics and Communication Engineering of NITS for providing the necessary facilities to carry out this work.

Author information

Corresponding author

Correspondence to Kuldeep Singh Yadav.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by B-K Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yadav, K.S., Anish Monsley, K., Laskar, R.H. et al. A selective region-based detection and tracking approach towards the recognition of dynamic bare hand gesture using deep neural network. Multimedia Systems 28, 861–879 (2022). https://doi.org/10.1007/s00530-022-00890-1

  • DOI: https://doi.org/10.1007/s00530-022-00890-1
