Sign Language Recognition Based on 3D Convolutional Neural Networks

  • Conference paper
Image Analysis and Recognition (ICIAR 2018)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10882)

Abstract

The inclusion of people with disabilities remains a recurring problem throughout the world. For the hearing impaired, the fact that sign language is used by only a small part of the population creates a communication barrier that lowers their quality of life. Popularizing or even automating sign language recognition can improve their lives. Given the importance of sign language recognition for the hearing impaired, we propose a 3D CNN architecture for the recognition of 64 classes of gestures from the Argentinian Sign Language dataset (LSA64). We demonstrate the efficiency of the method compared to traditional methods based on hand-crafted features, and show that its results outperform most deep learning-based work, reaching 93.9% accuracy.

Author information

Corresponding authors

Correspondence to Geovane M. Ramos Neto, Geraldo Braz Junior, João Dallyson Sousa de Almeida or Anselmo Cardoso de Paiva.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Neto, G.M.R., Junior, G.B., de Almeida, J.D.S., de Paiva, A.C. (2018). Sign Language Recognition Based on 3D Convolutional Neural Networks. In: Campilho, A., Karray, F., ter Haar Romeny, B. (eds) Image Analysis and Recognition. ICIAR 2018. Lecture Notes in Computer Science, vol 10882. Springer, Cham. https://doi.org/10.1007/978-3-319-93000-8_45

  • DOI: https://doi.org/10.1007/978-3-319-93000-8_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92999-6

  • Online ISBN: 978-3-319-93000-8

  • eBook Packages: Computer Science, Computer Science (R0)
