Abstract
Building an accurate and robust deep learning model requires a large amount of labeled data. Unfortunately, in many areas properly labeled data is very difficult to obtain. Sign language recognition is one of the challenging areas of computer vision: a deep learning model that recognizes sign gestures in real time needs a huge amount of labeled data. Authors have proposed a self-supervised learning approach to address this problem. The proposed architecture uses a SimSiam encoder network with a ResNet50 v1 backbone to learn the similarity between two different images of the same class. The computed cosine similarity is passed to an MLP head for further classification. The study uses Indian and American Sign Language datasets for simulation. The proposed methodology achieves an accuracy of 74.59%. Authors have also demonstrated the impact of other self-supervised deep learning models on sign language recognition.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kothadiya, D.R., Bhatt, C.M., Rida, I. (2024). Simsiam Network Based Self-supervised Model for Sign Language Recognition. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1941. Springer, Cham. https://doi.org/10.1007/978-3-031-46338-9_1
DOI: https://doi.org/10.1007/978-3-031-46338-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46337-2
Online ISBN: 978-3-031-46338-9
eBook Packages: Computer Science, Computer Science (R0)