
Simsiam Network Based Self-supervised Model for Sign Language Recognition

  • Conference paper
Intelligent Systems and Pattern Recognition (ISPR 2023)

Abstract

Building a more accurate and robust deep learning model requires more labeled data. Unfortunately, in many areas properly labeled data is very difficult to obtain. Sign language recognition is one of the challenging areas of computer vision: a deep learning model that recognizes sign gestures in real time needs a large amount of labeled data. The authors propose a self-supervised learning approach to address this problem. The proposed architecture uses a SimSiam encoder network with a ResNet50 v1 backbone to learn the similarity between two different images of the same class. The computed cosine similarity is passed to an MLP head for classification. The study uses Indian and American Sign Language datasets for its experiments. The proposed methodology achieves 74.59% accuracy. The authors also demonstrate the impact of other self-supervised deep learning models on sign language recognition.
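
The abstract does not spell out the training objective, so the following is only a minimal sketch of the similarity term commonly used with SimSiam-style encoders: the symmetric negative cosine similarity between the predictor output of one view and the encoder output of the other. The function names, the argument names (`p1`, `z1`, `p2`, `z2`) and the plain-list vectors are illustrative assumptions, not the authors' implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def simsiam_loss(p1, z2, p2, z1):
    """Symmetric negative cosine similarity (assumed SimSiam-style objective).

    p1, p2: predictor outputs for view 1 and view 2 (hypothetical names).
    z1, z2: encoder outputs for view 1 and view 2 (hypothetical names).
    In an autodiff framework z1 and z2 would be treated as constants
    (stop-gradient); plain Python lists carry no gradient, so nothing
    extra is needed here.
    """
    return -0.5 * (cosine_similarity(p1, z2) + cosine_similarity(p2, z1))
```

With this definition the loss reaches its minimum of -1 when each predictor output points in the same direction as the other view's encoder output, and it is 0 when the vectors are orthogonal.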




Corresponding author

Correspondence to Chintan M. Bhatt.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Kothadiya, D.R., Bhatt, C.M., Rida, I. (2024). Simsiam Network Based Self-supervised Model for Sign Language Recognition. In: Bennour, A., Bouridane, A., Chaari, L. (eds) Intelligent Systems and Pattern Recognition. ISPR 2023. Communications in Computer and Information Science, vol 1941. Springer, Cham. https://doi.org/10.1007/978-3-031-46338-9_1


  • DOI: https://doi.org/10.1007/978-3-031-46338-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46337-2

  • Online ISBN: 978-3-031-46338-9

  • eBook Packages: Computer Science (R0)
