Abstract
In this work, we propose a new technique for visual recognition of sign language fingerspelling that fuses multiple spatial and spectral representations of manual gesture images using a convolutional neural network. This problem is gaining prominence for communication among hearing-impaired people and for human-machine interaction. The proposed technique computes Gabor spectral representations of spatial images of hand sign gestures and uses an optimized convolutional neural network to classify the gestures in the joint space into their corresponding classes. Various ways of combining the two modalities are explored to identify the model that most improves robustness and recognition accuracy. The proposed system is evaluated on three databases (MNIST-ASL, ArSL, and MUASL) under different conditions, and the attained results outperform state-of-the-art techniques.
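As a rough illustration of the joint spatial-spectral idea, the sketch below builds a small bank of real-valued Gabor kernels, filters a toy image with each, and stacks the responses with the original image as extra channels. This is only one plausible way to combine the two modalities and is not taken from the paper: the function names (`gabor_kernel`, `filter2d`, `joint_representation`), the kernel size, `sigma`, `wavelength`, `gamma`, the four orientations, and the channel-stacking fusion are all assumptions chosen for illustration, and the convolutional network that would consume the stack is omitted. Plain Python is used to keep the example dependency-free; an array library would be the usual choice in practice.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter as a size x size list of lists.

    size must be odd; all parameter values here are illustrative assumptions.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope multiplied by a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def filter2d(image, kernel):
    """Same-size 2-D correlation with zero padding at the borders."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    ii, jj = i + a - half, j + b - half
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += image[ii][jj] * kernel[a][b]
            out[i][j] = acc
    return out


def joint_representation(image, thetas, size=5, sigma=2.0, wavelength=4.0):
    """Stack the spatial image with one Gabor response per orientation.

    Channel stacking is an assumed fusion strategy, used only as an example.
    """
    channels = [[[float(v) for v in row] for row in image]]
    for theta in thetas:
        kernel = gabor_kernel(size, sigma, theta, wavelength)
        channels.append(filter2d(image, kernel))
    return channels  # nested lists shaped (1 + len(thetas), H, W)


# Toy 8x8 "gesture" image: a vertical bar on a dark background.
image = [[1.0 if 2 <= c <= 4 else 0.0 for c in range(8)] for _ in range(8)]
thetas = [k * math.pi / 4 for k in range(4)]  # 0, 45, 90, 135 degrees
stack = joint_representation(image, thetas)
print(len(stack), len(stack[0]), len(stack[0][0]))
```

The resulting stack has one spatial channel plus one channel per orientation, each the same size as the input, so it could be passed to a multi-channel classifier. Other fusion points, such as combining features after separate network branches, would change only `joint_representation`.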
Acknowledgment
The authors would like to thank King Fahd University of Petroleum and Minerals for its support during this work. Special thanks go to the journal's editorial board and the anonymous reviewers for their constructive comments, which significantly helped improve the content and presentation of the work.
About this article
Cite this article
Luqman, H., El-Alfy, ES.M. & BinMakhashen, G.M. Joint space representation and recognition of sign language fingerspelling using Gabor filter and convolutional neural network. Multimed Tools Appl 80, 10213–10234 (2021). https://doi.org/10.1007/s11042-020-09994-0
DOI: https://doi.org/10.1007/s11042-020-09994-0