Understanding Motion in Sign Language: A New Structured Translation Dataset

  • Conference paper
Computer Vision – ACCV 2020 (ACCV 2020)

Abstract

Sign languages are the primary means of communication and interaction in the Deaf community. They are highly variable, with divergences in gloss representation, sign configuration, and multiple regional and cultural variants, among other factors. Current methods for automatic, continuous sign language translation rely on robust deep-learning models that encode the visual representation of signs. Despite significant progress, such models require huge amounts of data to converge and to exploit sign representations, resulting in very complex architectures. This is due not only to the high variability of sign languages but also to the limited exploration of many language components that support communication. For instance, gesture motion and grammatical structure are fundamental components of communication that can help resolve visual and geometric sign misinterpretations during video analysis. This work introduces a new Colombian sign language translation dataset (CoL-SLTD) that focuses on motion and structural information and could be a significant resource for determining the contribution of several language components. Additionally, an encoder-decoder deep strategy is introduced to support automatic translation, including attention modules that capture short-, long-, and structural kinematic dependencies and their respective relationships with sign recognition. The evaluation on CoL-SLTD proves the relevance of the motion representation, allowing compact deep architectures to perform the translation. The proposed strategy also shows promising results, achieving BLEU-4 scores of 35.81 and 4.65 in the signer-independent and unseen-sentence tasks, respectively.
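The exact architecture is detailed in the body of the chapter; as a rough, non-authoritative illustration of the kind of attention-based encoder-decoder the abstract describes, the sketch below wires per-frame motion features through a recurrent encoder and an attentional decoder. All module names, dimensions, and the choice of GRUs are assumptions for illustration, not the authors' design.

```python
# Minimal sketch of an attention-based encoder-decoder for sign language
# translation, in the spirit of the strategy described in the abstract.
# Dimensions, GRU cells, and additive attention are illustrative assumptions.
import torch
import torch.nn as nn


class SignTranslator(nn.Module):
    def __init__(self, feat_dim=1024, hidden=256, vocab_size=1000):
        super().__init__()
        # Encoder: consumes per-frame visual/motion features (e.g. optical flow).
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        # Additive (Bahdanau-style) attention over encoder states.
        self.attn_W = nn.Linear(hidden * 2, hidden)
        self.attn_v = nn.Linear(hidden, 1)
        # Decoder: autoregressive over target-word embeddings plus context.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.GRUCell(hidden * 2, hidden)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, frames, targets):
        enc_states, h = self.encoder(frames)        # (B, T, H), (1, B, H)
        h = h.squeeze(0)                            # decoder initial state
        logits = []
        for t in range(targets.size(1)):
            # Score every encoder step against the current decoder state.
            query = h.unsqueeze(1).expand_as(enc_states)
            scores = self.attn_v(torch.tanh(
                self.attn_W(torch.cat([enc_states, query], dim=-1))))
            weights = torch.softmax(scores, dim=1)  # attention over time
            context = (weights * enc_states).sum(dim=1)
            # Teacher forcing: feed the ground-truth previous word.
            step_in = torch.cat([self.embed(targets[:, t]), context], dim=-1)
            h = self.decoder(step_in, h)
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)           # (B, L, vocab)


# Example: a batch of 2 clips, 50 frames of 1024-d motion features each.
model = SignTranslator()
frames = torch.randn(2, 50, 1024)
targets = torch.randint(0, 1000, (2, 8))
print(model(frames, targets).shape)  # torch.Size([2, 8, 1000])
```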

Notes

  1. For training, the Adam optimizer was used with a learning rate of 0.0001, decayed by a factor of 0.1 every 10 epochs. Batches of a single sample and a dropout of 0.2 in the dense and recurrent layers were configured. The convolutional weight decay was set to 0.0005, and gradient clipping with a threshold of 5 was applied.
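These hyperparameters map naturally onto a standard PyTorch training setup. The sketch below is one plausible reading, reusing the hypothetical SignTranslator above; in particular, restricting the 0.0005 weight decay to convolutional weights via parameter groups is an assumption about how the note's "convolutional weight decay" was applied.

```python
# Training-loop sketch matching the hyperparameters in the note, assuming the
# SignTranslator sketch above (which would carry the 0.2 dropout in its dense
# and recurrent layers). The conv/other parameter split is an assumed way to
# apply weight decay (5e-4) to convolutional weights only.
import torch
import torch.nn as nn

model = SignTranslator()
criterion = nn.CrossEntropyLoss()

conv_params = [p for n, p in model.named_parameters() if 'conv' in n]
other_params = [p for n, p in model.named_parameters() if 'conv' not in n]
optimizer = torch.optim.Adam(
    [{'params': conv_params, 'weight_decay': 5e-4},
     {'params': other_params}],
    lr=1e-4)
# Decay the learning rate by a factor of 0.1 every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

# Stand-in loader: batches of a single sample, as in the note.
loader = [(torch.randn(1, 50, 1024), torch.randint(0, 1000, (1, 8)))]

for epoch in range(30):
    for frames, targets in loader:
        optimizer.zero_grad()
        logits = model(frames, targets)                # (1, L, vocab)
        loss = criterion(logits.transpose(1, 2), targets)
        loss.backward()
        # Gradient clipping with a threshold of 5.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
        optimizer.step()
    scheduler.step()
```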

Acknowledgments

This work was partially funded by the Universidad Industrial de Santander. The authors acknowledge the Vicerrectoría de Investigación y Extensión (VIE) of the Universidad Industrial de Santander for supporting this research, registered under the project "Reconocimiento continuo de expresiones cortas del lenguaje de señas" (SIVIE code 1293). We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research.

Author information

Corresponding author

Correspondence to Fabio Martínez.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Rodríguez, J. et al. (2021). Understanding Motion in Sign Language: A New Structured Translation Dataset. In: Ishikawa, H., Liu, C.L., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol. 12627. Springer, Cham. https://doi.org/10.1007/978-3-030-69544-6_40

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-69544-6_40

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69543-9

  • Online ISBN: 978-3-030-69544-6

  • eBook Packages: Computer Science (R0)
