A Comparison of Neural Networks for Sign Language Recognition with LSA64

Mindlin, Iván; Quiroga, Facundo; Ronchetti, Franco; Dal Bianco, Pedro; Ríos, Gastón; Lanzarini, Laura; Hasperué, Waldo

doi:10.1007/978-3-030-84825-5_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1444)

Included in the following conference series:

Conference on Cloud Computing, Big Data & Emerging Topics


Abstract

Sign Language Recognition models have steadily improved in performance in recent years, fueled by Neural Network models. Furthermore, generic Neural Network models have taken precedence over specialized models designed specifically for Sign Language. Despite this, the completeness and complexity of datasets have not scaled accordingly. This deficiency presents a significant challenge for deploying Sign Language Recognition models, especially given that Sign Languages are specific to countries or even regions. Following this trend, we experiment with three models built on standard recurrent and convolutional neural network layers. We evaluate the models on LSA64, the only Argentinian Sign Language dataset available. Coupled with simple but carefully chosen hyperparameters and preprocessing techniques, these models all achieve near-perfect accuracy on LSA64, surpassing all previous models, many of which were specifically designed for this task. Furthermore, we perform ablation studies that indicate that temporal data augmentation can provide a significant boost to accuracy, unlike traditional spatial data augmentation techniques. Finally, we analyze the activation values of the three models to understand the types of features learned, and find that they develop hand-specific filters to classify signs.
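
The abstract does not say which temporal augmentations were used, so the following is only a minimal sketch of one common option: take a random contiguous window of the clip and resample it to a fixed number of frames, leaving every frame's pixels untouched. The function name `temporal_augment`, the `min_keep` parameter, and the clip dimensions are assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np

def temporal_augment(video, out_len, rng, min_keep=0.8):
    """Randomly crop a window in time, then resample it to out_len frames.

    video: array of shape (T, H, W, C). Only the time axis is modified;
    spatial content is left as-is (hypothetical helper, not the authors' code).
    """
    t = video.shape[0]
    # Keep a random fraction of the clip, but never fewer than one frame.
    keep = max(1, int(round(t * rng.uniform(min_keep, 1.0))))
    # Random start so that the window [start, start + keep) fits in the clip.
    start = rng.integers(0, t - keep + 1)
    # Evenly spaced frame indices inside the window, in temporal order.
    idx = np.linspace(start, start + keep - 1, num=out_len).round().astype(int)
    return video[idx]

# Toy clip: 40 frames of 4x4 RGB where every pixel of frame i has value i.
rng = np.random.default_rng(0)
clip = np.arange(40, dtype=np.float32).reshape(40, 1, 1, 1) * np.ones(
    (40, 4, 4, 3), dtype=np.float32
)
augmented = temporal_augment(clip, out_len=16, rng=rng)
print(augmented.shape)  # (16, 4, 4, 3)
```

Because each call draws a different window, the same sign is seen at slightly different speeds and with slightly different start and end points, while the frame order is preserved.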


Notes

1. Experiment code, including model details and other supplementary materials, can be found at https://github.com/midusi/lsa64_nn.
2. We executed all experiments using an Intel(R) Xeon(R) CPU @ 2.30 GHz, a single 12 GB NVIDIA Tesla K80 GPU, and 13 GB of RAM.
3. See the supplementary material in the code repository for more examples.


Author information

Authors and Affiliations

Instituto de Investigación en Informática LIDI (Centro CICPBA), Facultad de Informática, Universidad Nacional de La Plata, La Plata, Argentina
Iván Mindlin, Facundo Quiroga, Franco Ronchetti, Pedro Dal Bianco, Gastón Ríos, Laura Lanzarini & Waldo Hasperué
Comisión de Investigaciones Científicas de la Pcia. De Bs. As. (CIC-PBA), La Plata, Argentina
Franco Ronchetti & Waldo Hasperué


Corresponding author

Correspondence to Facundo Quiroga.

Editor information

Editors and Affiliations

III-LIDI, Facultad de Informática, Universidad Nacional de La Plata, La Plata, Argentina
Marcelo Naiouf
III-LIDI, Facultad de Informática, Universidad Nacional de La Plata and CIC, La Plata, Argentina
Enzo Rucci
III-LIDI, Facultad de Informática, Universidad Nacional de La Plata, La Plata, Argentina
Franco Chichizola
III-LIDI, Facultad de Informática, Universidad Nacional de La Plata and CIC, La Plata, Argentina
Laura De Giusti


About this paper

Cite this paper

Mindlin, I. et al. (2021). A Comparison of Neural Networks for Sign Language Recognition with LSA64. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds) Cloud Computing, Big Data & Emerging Topics. JCC-BD&ET 2021. Communications in Computer and Information Science, vol 1444. Springer, Cham. https://doi.org/10.1007/978-3-030-84825-5_8


DOI: https://doi.org/10.1007/978-3-030-84825-5_8
Published: 16 August 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84824-8
Online ISBN: 978-3-030-84825-5
eBook Packages: Computer Science, Computer Science (R0)
