Abstract
Sign Language Recognition models have steadily improved in performance in recent years, fueled by Neural Network models. Furthermore, generic Neural Network models have taken precedence over specialized models designed specifically for Sign Language. Despite this, the completeness and complexity of datasets have not scaled accordingly. This deficiency presents a significant challenge for deploying Sign Language Recognition models, especially given that Sign Languages are specific to countries or even regions. Following the trend toward generic models, we experiment with three models built on standard recurrent and convolutional neural network layers. We evaluate the models on LSA64, the only Argentinian Sign Language dataset available. Coupled with simple but carefully chosen hyperparameters and preprocessing techniques, all three models achieve near-perfect accuracy on LSA64, surpassing all previous models, many of which were designed specifically for this task. Furthermore, we perform ablation studies indicating that temporal data augmentation can provide a significant boost to accuracy, unlike traditional spatial data augmentation techniques. Finally, we analyze the activation values of the three models to understand the types of features learned, and find that they develop hand-specific filters to classify signs.
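The abstract does not say which temporal augmentation the authors use, so the following is only an illustrative sketch of one common form of it: random segment-based frame sampling. The function name `sample_frames` and its parameters are hypothetical and are not taken from the paper or its repository. The idea is to split a clip into equal temporal segments and draw one frame at random from each, so every call yields a slightly different subsequence that is still in temporal order.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def sample_frames(
    frames: Sequence[Frame],
    n_out: int,
    rng: Optional[random.Random] = None,
) -> List[Frame]:
    """Return n_out frames in temporal order.

    The clip is divided into n_out contiguous segments of near-equal
    length and one frame is drawn uniformly at random from each segment.
    This is an assumed augmentation scheme, not the paper's method.
    """
    if n_out <= 0:
        raise ValueError("n_out must be positive")
    n = len(frames)
    if n < n_out:
        raise ValueError("clip has fewer frames than requested")
    rng = rng or random.Random()

    picked: List[Frame] = []
    for i in range(n_out):
        start = i * n // n_out        # first index of segment i (inclusive)
        end = (i + 1) * n // n_out    # last index of segment i (exclusive)
        picked.append(frames[rng.randrange(start, end)])
    return picked


if __name__ == "__main__":
    clip = list(range(60))  # stand-in for 60 video frames
    print(sample_frames(clip, 16, random.Random(0)))
```

Calling such a function once per training epoch would expose a model to different frame subsets of the same sign, which is the general intent of temporal augmentation; spatial augmentation would instead alter the pixels within each frame.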
Notes
1. Experiment code, including model details and other supplementary materials, can be found at https://github.com/midusi/lsa64_nn.
2. We executed all experiments using an Intel(R) Xeon(R) CPU @ 2.30 GHz, a single 12 GB NVIDIA Tesla K80 GPU, and 13 GB of RAM.
3. See the supplementary material in the code repository for more examples.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Mindlin, I. et al. (2021). A Comparison of Neural Networks for Sign Language Recognition with LSA64. In: Naiouf, M., Rucci, E., Chichizola, F., De Giusti, L. (eds) Cloud Computing, Big Data & Emerging Topics. JCC-BD&ET 2021. Communications in Computer and Information Science, vol 1444. Springer, Cham. https://doi.org/10.1007/978-3-030-84825-5_8
DOI: https://doi.org/10.1007/978-3-030-84825-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84824-8
Online ISBN: 978-3-030-84825-5
eBook Packages: Computer Science (R0)