DOI: 10.1145/3103010.3103020
Research Article · Best Paper

Towards a Transcription System of Sign Language Video Resources via Motion Trajectory Factorisation

Published: 31 August 2017

ABSTRACT

Sign languages are visual languages used by the Deaf community for communication purposes. Whilst recent years have seen rapid growth in the number of sign language video collections available online, much of this material is hard to access and process, both because it lacks associated text-based tagging information and because 'extracting' content directly from video is still a very challenging problem. Support for representing and documenting sign language video resources in terms of sign writing systems is also limited. In this paper, we start with a brief survey of existing sign language technologies and assess their state of the art from the perspective of a sign language digital information processing system. We then introduce our work, which focuses on vision-based sign language recognition. We apply the factorisation method to sign language videos in order to factor out the signer's motion from the structure of the hands. We then model the motion of the hands as a weighted combination of linear trajectory bases and apply a set of classifiers to the basis weights in order to recognise meaningful phonological elements of sign language. We demonstrate how these classification results can be used to transcribe sign videos into a written representation for annotation and documentation purposes. Results from our evaluation indicate the validity of the proposed framework.
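
As a concrete illustration of the trajectory-modelling and classification steps described above, the following minimal sketch (not the authors' implementation) fits a 2D hand track with a truncated DCT trajectory basis, a common choice of linear trajectory basis, and trains a classifier on the resulting basis weights. The helper names, the basis size k, the use of scikit-learn's SVC, and the tracking input format are all illustrative assumptions.

    import numpy as np
    from sklearn.svm import SVC

    def dct_basis(n_frames, k):
        """Truncated DCT-II trajectory basis: an n_frames x k matrix whose
        columns are smooth basis trajectories."""
        t = np.arange(n_frames)
        freqs = np.arange(k)
        basis = np.cos(np.pi / n_frames * (t[:, None] + 0.5) * freqs[None, :])
        basis[:, 0] /= np.sqrt(2.0)              # rescale the constant column
        return basis * np.sqrt(2.0 / n_frames)   # orthonormal columns

    def trajectory_weights(track, k=8):
        """Least-squares fit of an (n_frames x 2) hand track to k basis
        trajectories; returns a flat vector of 2*k weights (x and y)."""
        theta = dct_basis(track.shape[0], k)
        weights, *_ = np.linalg.lstsq(theta, track, rcond=None)  # k x 2
        return weights.T.ravel()

    def train_movement_classifier(tracks, labels, k=8):
        """tracks: list of (n_frames x 2) hand trajectories (hypothetical input);
        labels: phonological movement class of each sign token."""
        X = np.vstack([trajectory_weights(t, k) for t in tracks])
        clf = SVC(kernel="rbf", C=10.0, gamma="scale")
        clf.fit(X, labels)
        return clf

Because the basis is fixed and low-dimensional, tracks of different lengths all map to weight vectors of the same size, which is what makes a standard classifier applicable to the basis weights.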

Published in

DocEng '17: Proceedings of the 2017 ACM Symposium on Document Engineering
August 2017, 242 pages
ISBN: 9781450346894
DOI: 10.1145/3103010

          Copyright © 2017 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States



          Acceptance Rates

DocEng '17 Paper Acceptance Rate: 13 of 71 submissions, 18%
Overall Acceptance Rate: 178 of 537 submissions, 33%
