DOI: 10.1145/3103010.3103020
Research Article · Best Paper

Towards a Transcription System of Sign Language Video Resources via Motion Trajectory Factorisation

Published: 31 August 2017

ABSTRACT

Sign languages are visual languages used by the Deaf community for communication purposes. Whilst recent years have seen rapid growth in the number of sign language video collections available online, much of this material is hard to access and process, both because it lacks associated text-based tagging information and because 'extracting' content directly from video is still a very challenging problem. Support for representing and documenting sign language video resources in terms of sign writing systems is also limited. In this paper, we start with a brief survey of existing sign language technologies and assess their state of the art from the perspective of a sign language digital information processing system. We then introduce our work, which focuses on vision-based sign language recognition. We apply the factorisation method to sign language videos in order to factor out the signer's motion from the structure of the hands. We then model the motion of the hands as a weighted combination of linear trajectory bases and apply a set of classifiers to the basis weights in order to recognise meaningful phonological elements of sign language. We demonstrate how these classification results can be used to transcribe sign videos into a written representation for annotation and documentation purposes. Results from our evaluation indicate the validity of the proposed framework.
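
As a concrete illustration of the trajectory-modelling and classification steps described above, the following minimal sketch (not the authors' implementation) fits a 2D hand track with a truncated DCT trajectory basis, a common choice of linear trajectory basis, and trains a classifier on the resulting basis weights. The helper names, the basis size k, the use of scikit-learn's SVC, and the tracking input format are all illustrative assumptions.

    import numpy as np
    from sklearn.svm import SVC

    def dct_basis(n_frames, k):
        """Truncated DCT-II trajectory basis: an n_frames x k matrix whose
        columns are smooth basis trajectories."""
        t = np.arange(n_frames)
        freqs = np.arange(k)
        basis = np.cos(np.pi / n_frames * (t[:, None] + 0.5) * freqs[None, :])
        basis[:, 0] /= np.sqrt(2.0)              # rescale the constant column
        return basis * np.sqrt(2.0 / n_frames)   # orthonormal columns

    def trajectory_weights(track, k=8):
        """Least-squares fit of an (n_frames x 2) hand track to k basis
        trajectories; returns a flat vector of 2*k weights (x and y)."""
        theta = dct_basis(track.shape[0], k)
        weights, *_ = np.linalg.lstsq(theta, track, rcond=None)  # k x 2
        return weights.T.ravel()

    def train_movement_classifier(tracks, labels, k=8):
        """tracks: list of (n_frames x 2) hand trajectories (hypothetical input);
        labels: phonological movement class of each sign token."""
        X = np.vstack([trajectory_weights(t, k) for t in tracks])
        clf = SVC(kernel="rbf", C=10.0, gamma="scale")
        clf.fit(X, labels)
        return clf

Because the basis is fixed and low-dimensional, tracks of different lengths all map to weight vectors of the same size, which is what makes a standard classifier applicable to the basis weights.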

Published in

DocEng '17: Proceedings of the 2017 ACM Symposium on Document Engineering
August 2017, 242 pages
ISBN: 9781450346894
DOI: 10.1145/3103010

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Acceptance Rates

DocEng '17 Paper Acceptance Rate: 13 of 71 submissions, 18%
Overall Acceptance Rate: 178 of 537 submissions, 33%
