Abstract
Detecting the gesture object in the first frame and tracking it in a dynamic environment are complicated by variations in pose, scale, speed, illumination, and occlusion, which makes gesture-based human–computer interaction challenging. Kernel-based operations on the first frame require additional computational resources to detect the bare hand. Therefore, a region-based detection approach is proposed that detects bare hand-like regions using skin color and movement information. AlexNet, adapted through transfer learning, is then used to detect the bare hand within this region of interest. To track the hand in successive frames, a Detection and Tracking (DaT) algorithm is proposed. The smoothed trajectory is used to train a separate deep neural network that recognizes 60 isolated dynamic gestures. For bare hand detection, average accuracies of 97.80% and 96.20% are achieved on the Oxford hand database and the NITS hand gesture VIII database, respectively. Experimental observations indicate that the proposed method is computationally efficient compared to existing models. For the recognition of the 60 gestures, an average accuracy of 96.48% is achieved. The proposed system provides a relative improvement of 17% over the base models.
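The region-based idea in the abstract, combining skin color with movement to propose hand-like regions before any classifier runs, can be illustrated with a minimal sketch. This is not the authors' implementation: the RGB skin thresholds, the frame-differencing threshold, and the names `is_skin`, `is_moving`, and `hand_like_region` are all assumptions chosen for illustration, and frames are plain nested lists so the example needs no external libraries.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B), each 0-255
Frame = List[List[Pixel]]         # frame[row][col]
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive


def is_skin(p: Pixel) -> bool:
    """Illustrative RGB skin heuristic; the thresholds are assumptions."""
    r, g, b = p
    return (r > 95 and g > 40 and b > 20
            and max(p) - min(p) > 15
            and abs(r - g) > 15 and r > g and r > b)


def is_moving(prev: Pixel, curr: Pixel, thresh: int = 30) -> bool:
    """Frame differencing: summed absolute channel change above a threshold."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) > thresh


def hand_like_region(prev: Frame, curr: Frame) -> Optional[Box]:
    """Bounding box of pixels that are both skin-colored and moving."""
    hits = [(y, x)
            for y, row in enumerate(curr)
            for x, p in enumerate(row)
            if is_skin(p) and is_moving(prev[y][x], p)]
    if not hits:
        return None  # no candidate region in this frame pair
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), min(xs), max(ys), max(xs))
```

In a pipeline like the one described, the returned box would be the region of interest handed to the classifier, so that static skin-colored areas (for example, a face that is not moving) and moving non-skin objects are excluded before the more expensive detection step.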
Acknowledgements
This research work is funded by the SERB IMPRINT project (IMP/2018/000098). We thank the Department of Electronics and Communication Engineering of NITS for providing the necessary facilities to carry out this work.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by B-K Bao.
Cite this article
Yadav, K.S., Anish Monsley, K., Laskar, R.H. et al. A selective region-based detection and tracking approach towards the recognition of dynamic bare hand gesture using deep neural network. Multimedia Systems 28, 861–879 (2022). https://doi.org/10.1007/s00530-022-00890-1
DOI: https://doi.org/10.1007/s00530-022-00890-1