Towards Accessible Sign Language Assessment and Learning

ABSTRACT
Recently, a phonology-based sign language assessment approach was proposed that uses sign language productions captured in 3D space with a Kinect sensor. To scale such an assessment system to realistic applications, there is a need to reduce the dependency on the Kinect, which is not accessible to the wider community, and to develop solutions that can potentially work with web cameras. This paper takes a step in that direction by investigating sign language recognition and sign language assessment in 2D space, either by dropping the depth coordinate from the Kinect data or by estimating skeletons from video. Experimental studies on the Swiss German Sign Language corpus SMILE show that, while the loss of depth information leads to a considerable drop in sign language recognition performance, a high level of sign language assessment performance can still be obtained.
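The 2D condition described above amounts to an orthographic projection of the tracked skeleton: the depth (z) coordinate is simply discarded, keeping only the image-plane (x, y) coordinates per joint. A minimal sketch in Python/NumPy, assuming a hypothetical `(frames, joints, 3)` keypoint array; the array layout and the `drop_depth` helper are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def drop_depth(skeleton_3d: np.ndarray) -> np.ndarray:
    """Project 3D skeleton keypoints to 2D by discarding the depth axis.

    skeleton_3d: array of shape (frames, joints, 3) holding (x, y, z)
    per joint, e.g. as produced by a Kinect-style body tracker.
    Returns an array of shape (frames, joints, 2) keeping only (x, y).
    """
    return skeleton_3d[..., :2]

# Toy example: 10 frames of a 25-joint Kinect-style skeleton.
poses_3d = np.random.rand(10, 25, 3)
poses_2d = drop_depth(poses_3d)
print(poses_2d.shape)  # (10, 25, 2)
```

In this setting, 2D poses estimated directly from video (e.g. with an off-the-shelf 2D pose estimator) take the same `(frames, joints, 2)` form, so the downstream recognition and assessment pipeline can be shared across both input sources.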