Abstract
Hand gestures play an important role in communication for people with hearing and speech disorders, and hand gesture recognition (HGR) is the backbone of human–computer interaction (HCI). Most reported HGR techniques perform poorly against complex backgrounds. In addition, according to the literature, most existing HGR methods have been evaluated on only a few selected gestures with inter-class similarity rather than on all gesture classes. This paper proposes a two-phase deep learning-based HGR system that mitigates the complex background issue and considers all gesture classes. In the first phase, the Inception V3 architecture is modified to reduce its computational resource requirement; the resulting network is named mIV3Net (modified Inception V3 network). In the second phase, mIV3Net is fine-tuned to pay more attention to prominent features, so that better abstract knowledge is used for gesture recognition and the proposed algorithm becomes more discriminative. The efficacy and generalization of the proposed two-phase HGR system are validated experimentally on five publicly available standard datasets: MUGD, ISL, ArSL, NUS-I, and NUS-II. The accuracies of the proposed system on these datasets, in the above order, are 97.14%, 99.3%, 97.4%, 99%, and 99.8%, which are improvements of 12.58%, 2.54%, 2.73%, 0.56%, and 2.02%, respectively, over state-of-the-art HGR systems.
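The abstract does not say whether the quoted improvements are differences in percentage points or relative gains over the best earlier result. As a rough reading aid, the sketch below computes the baseline accuracy implied under each interpretation. The function names and the two interpretations are assumptions made for illustration; only the accuracy and improvement figures come from the abstract.

```python
# Figures quoted in the abstract: dataset -> (reported accuracy %, reported improvement %).
REPORTED = {
    "MUGD": (97.14, 12.58),
    "ISL": (99.3, 2.54),
    "ArSL": (97.4, 2.73),
    "NUS-I": (99.0, 0.56),
    "NUS-II": (99.8, 2.02),
}


def baseline_if_absolute(accuracy, gain):
    """ASSUMPTION: the gain is a difference in percentage points."""
    return round(accuracy - gain, 2)


def baseline_if_relative(accuracy, gain):
    """ASSUMPTION: the gain is a relative increase over the baseline."""
    return round(accuracy / (1 + gain / 100), 2)


# Print the implied prior state-of-the-art accuracy under both readings.
for name, (acc, gain) in REPORTED.items():
    print(name, baseline_if_absolute(acc, gain), baseline_if_relative(acc, gain))
```

Neither column is a result reported by the authors; the comparison tables in the full paper are the authoritative source for the baseline values.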
















Data Availability
The datasets that support the findings of this study are available at: (1) MUGD: http://hdl.handle.net/10179/4514. (2) ISL: https://doi.org/10.1007/s13369-021-06456-z. (3) ArSL: https://doi.org/10.1016/j.dib.2019.103777. (4) NUS-I and NUS-II: https://doi.org/10.1007/s11263-012-0560-5.
Author information
Contributions
BK carried out the experimentation, investigation, analysis, and paper writing. RHL contributed to conceptualization, reviewing and editing, and supervision. RKK contributed to analysis and reviewing.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Karsh, B., Laskar, R.H. & Karsh, R.K. mIV3Net: modified inception V3 network for hand gesture recognition. Multimed Tools Appl 83, 10587–10613 (2024). https://doi.org/10.1007/s11042-023-15865-1
DOI: https://doi.org/10.1007/s11042-023-15865-1