DOI: 10.1145/3389189.3398006
research-article

SL-ReDu: Greek sign language recognition for educational applications. Project description and early results

Published: 30 June 2020

ABSTRACT

We present SL-ReDu, a recently commenced innovative project that aims to exploit progress in deep learning to advance the state of the art in video-based automatic recognition of Greek Sign Language (GSL), focusing on the use case of teaching GSL as a second language. We first briefly overview the project goals, focal areas, and timeline. We then present our initial deep-learning-based approach to GSL recognition, which employs efficient visual tracking of the signer's hands, convolutional neural networks for feature extraction, and attention-based encoder-decoder sequence modeling for sign prediction. Finally, we report experimental results for small-vocabulary, isolated GSL recognition on the single-signer "Polytropon" corpus. To our knowledge, this work constitutes the first application of deep-learning techniques to GSL.

        • Published in

          PETRA '20: Proceedings of the 13th ACM International Conference on PErvasive Technologies Related to Assistive Environments
          June 2020
          574 pages
          ISBN: 9781450377737
          DOI: 10.1145/3389189

          Copyright © 2020 ACM

          © 2020 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

          Publisher

          Association for Computing Machinery

          New York, NY, United States
