A Comparison of Neural Networks for Sign Language Recognition with LSA64

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 1444))

Abstract

Sign Language Recognition models have steadily improved in performance in recent years, fueled by Neural Network models. Furthermore, generic Neural Network models have taken precedence over specialized models designed specifically for Sign Language. Despite this, the completeness and complexity of datasets have not scaled accordingly. This deficiency presents a significant challenge for deploying Sign Language Recognition models, especially given that Sign Languages are specific to countries or even regions. Following this trend toward generic models, we experiment with three models built on standard recurrent and convolutional neural network layers. We evaluate the models on LSA64, the only Argentinian Sign Language dataset available. Coupled with simple but carefully chosen hyperparameters and preprocessing techniques, these models all achieve near-perfect accuracy on LSA64, surpassing all previous models, many of which were specifically designed for this task. Furthermore, we perform ablation studies that indicate that temporal data augmentation can provide a significant boost to accuracy, unlike traditional spatial data augmentation techniques. Finally, we analyze the activation values of the three models to understand the types of features learned, and find that they develop hand-specific filters to classify signs.
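The abstract does not say which temporal transforms were used, so the following is only an illustrative sketch of one common form of temporal data augmentation: cropping a random contiguous time window from a clip and resampling it to a fixed number of frames. The function name `temporal_augment` and the parameters `out_len` and `min_keep` are hypothetical and are not taken from the paper or its repository.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def temporal_augment(
    frames: Sequence[Frame],
    out_len: int,
    min_keep: float = 0.8,
    rng: Optional[random.Random] = None,
) -> List[Frame]:
    """Crop a random contiguous time window and resample it to out_len frames.

    Hypothetical illustration only; not the augmentation used by the authors.
    """
    if not frames:
        raise ValueError("frames must be non-empty")
    if out_len <= 0:
        raise ValueError("out_len must be positive")
    if not 0.0 < min_keep <= 1.0:
        raise ValueError("min_keep must be in (0, 1]")

    rng = rng or random.Random()
    n = len(frames)

    # Pick a window covering between min_keep * n and n frames (at least 1).
    win = rng.randint(max(1, int(min_keep * n)), n)
    start = rng.randint(0, n - win)

    # Evenly spaced indices across the window. Frames are skipped when the
    # window is longer than out_len and repeated when it is shorter.
    if out_len == 1:
        indices = [start + win // 2]
    else:
        indices = [
            start + round(i * (win - 1) / (out_len - 1)) for i in range(out_len)
        ]
    return [frames[i] for i in indices]


# Usage: integers stand in for image frames of a 40-frame sign video.
video = list(range(40))
clip = temporal_augment(video, out_len=16, rng=random.Random(0))
```

Each call yields a clip with a slightly different start point and playback speed while preserving frame order, which is the property that distinguishes temporal augmentation from spatial transforms such as flips or crops applied within a frame.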

Notes

  1. Experiment code, including model details and other supplementary materials, can be found at https://github.com/midusi/lsa64_nn.

  2. We executed all experiments using an Intel(R) Xeon(R) CPU @ 2.30 GHz, a single 12 GB NVIDIA Tesla K80 GPU, and 13 GB of RAM.

  3. See the supplementary material in the code repository for more examples.

Author information

Corresponding author

Correspondence to Facundo Quiroga.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mindlin, I. et al. (2021). A Comparison of Neural Networks for Sign Language Recognition with LSA64. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds) Cloud Computing, Big Data & Emerging Topics. JCC-BD&ET 2021. Communications in Computer and Information Science, vol 1444. Springer, Cham. https://doi.org/10.1007/978-3-030-84825-5_8

  • DOI: https://doi.org/10.1007/978-3-030-84825-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-84824-8

  • Online ISBN: 978-3-030-84825-5

  • eBook Packages: Computer Science, Computer Science (R0)
