Sign Language Recognition Based on 3D Convolutional Neural Networks

Neto, Geovane M. Ramos; Junior, Geraldo Braz; de Almeida, João Dallyson Sousa; de Paiva, Anselmo Cardoso

doi:10.1007/978-3-319-93000-8_45


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10882)

Included in the following conference series:

International Conference Image Analysis and Recognition


Abstract

The inclusion of people with disabilities is still a recurring problem throughout the world. For the hearing impaired, sign language is understood by only a small part of the population, and the resulting communication barrier limits their quality of life. Popularizing sign language, or even automating its recognition, can improve their lives considerably. Given the importance of sign language recognition for the hearing impaired, we propose a 3D CNN architecture for the recognition of 64 classes of gestures from the Argentinian Sign Language dataset (LSA64). We demonstrate the efficiency of the method compared with traditional methods based on hand-crafted features, and show that its results outperform most deep learning-based work, reaching an accuracy of 93.9%.
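As a rough illustration of the operation behind a 3D CNN, the sketch below applies a single 3D filter to a toy video clip in plain Python. It is not the architecture proposed in the paper, whose layers are not described in this abstract; the names `conv3d_valid` and `shape`, the toy clip, and the temporal-difference kernel are all assumptions made for the example. The point it shows is that the filter slides along time as well as along height and width, so it can respond to motion between frames.

```python
def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation over (time, height, width).

    clip and kernel are nested lists indexed as [t][y][x].
    Stride is 1 and no padding is applied (illustrative assumption).
    """
    t, h, w = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(t - kt + 1):            # slide along time
        plane = []
        for j in range(h - kh + 1):        # slide along height
            row = []
            for k in range(w - kw + 1):    # slide along width
                acc = 0.0
                for a in range(kt):
                    for b in range(kh):
                        for c in range(kw):
                            acc += clip[i + a][j + b][k + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def shape(volume):
    """Return (time, height, width) of a nested-list volume."""
    return (len(volume), len(volume[0]), len(volume[0][0]))


# Hypothetical toy clip: 4 frames of 5x5 pixels, each pixel equal to its frame index.
clip = [[[float(f)] * 5 for _ in range(5)] for f in range(4)]

# Hypothetical 2x1x1 kernel: next frame minus current frame (temporal difference).
kernel = [[[-1.0]], [[1.0]]]

response = conv3d_valid(clip, kernel)
print(shape(response))
```

With a kernel of temporal extent 2 and no padding, the output has one frame fewer than the input. A 2D convolution applied frame by frame could not produce this response, because it never combines pixels from different frames.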



Author information

Authors and Affiliations

Computing Applied Group, Federal University of Maranhão, São Luís, Brazil
Geovane M. Ramos Neto, Geraldo Braz Junior, João Dallyson Sousa de Almeida & Anselmo Cardoso de Paiva

Corresponding authors

Correspondence to Geovane M. Ramos Neto, Geraldo Braz Junior, João Dallyson Sousa de Almeida or Anselmo Cardoso de Paiva.

Editor information

Editors and Affiliations

University of Porto, Porto, Portugal
Aurélio Campilho
University of Waterloo, Waterloo, Ontario, Canada
Fakhri Karray
Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Bart ter Haar Romeny


About this paper

Cite this paper

Neto, G.M.R., Junior, G.B., de Almeida, J.D.S., de Paiva, A.C. (2018). Sign Language Recognition Based on 3D Convolutional Neural Networks. In: Campilho, A., Karray, F., ter Haar Romeny, B. (eds) Image Analysis and Recognition. ICIAR 2018. Lecture Notes in Computer Science, vol 10882. Springer, Cham. https://doi.org/10.1007/978-3-319-93000-8_45

DOI: https://doi.org/10.1007/978-3-319-93000-8_45
Published: 06 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92999-6
Online ISBN: 978-3-319-93000-8
eBook Packages: Computer Science, Computer Science (R0)
