Abstract
Neural Machine Translation (NMT) systems have attained a prominent position in language translation tasks. However, they face major challenges in translating new and out-of-vocabulary words. This problem is recognized as a major drawback of conventional NMT systems and leads to translations that contain more copied output. It also makes it harder to capture multilingual language structures and word relationships. In this paper, we propose a novel NMT system based on a deep stacked GRU algorithm that addresses these challenges and handles multilingual sentence translation efficiently. We developed the proposed model to translate spoken sentences into sign words. The generated sign words (glosses) are mapped to sign gesture images to automate sign gesture video generation using deep generative models. The proposed hybrid NMT model has been evaluated qualitatively and quantitatively on different benchmark sign language datasets. The improved BLEU score shows that our model outperforms earlier approaches. We also evaluated the proposed model on our self-created Indian sign language corpus (ISL-CSLTR). The final results show better translation quality at minimal processing cost.

source sentence ‘can you repeat that please’, target sign gloss ‘YOU REPEAT PLEASE’
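The abstract reports BLEU as the quantitative metric. As a rough illustration of what that score measures, the following is a minimal sketch of unsmoothed sentence-level BLEU against a single reference, applied to the gloss pair above. The function names `ngrams` and `sentence_bleu` and the `max_n` parameter are assumptions made for this example; this is not the authors' evaluation script, and published scores are normally computed at corpus level with an established toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference.

    Clipped n-gram precisions for n = 1..max_n are combined by a
    geometric mean and multiplied by the brevity penalty.
    """
    if not hypothesis:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        total = sum(hyp_counts.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: any zero precision gives BLEU = 0
        log_precision_sum += math.log(matched / total)
    # The brevity penalty punishes hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1.0 - len(reference) / len(hypothesis))
    return bp * math.exp(log_precision_sum / max_n)


# Illustrative inputs: the reference gloss comes from the example above;
# the second hypothesis is a made-up system output that copies "THAT".
reference = "YOU REPEAT PLEASE".split()
exact = sentence_bleu(reference, "YOU REPEAT PLEASE".split(), max_n=2)
partial = sentence_bleu(reference, "YOU REPEAT THAT PLEASE".split(), max_n=2)
print(exact, partial)
```

The example uses `max_n=2` because the reference gloss has only three tokens, so it contains no 4-grams and the unsmoothed score with the default `max_n=4` would be zero. The second hypothesis, which copies a source word that the gloss omits, scores lower than the exact match.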

Acknowledgements
The research was funded by the Science and Engineering Research Board (SERB), India under Start-up Research Grant (SRG)/2019–2021 (Grant no. SRG/2019/001338). We would like to thank Navajeevan, Residential School for the Deaf, College of Spl. D.Ed & B.Ed, Vocational Centre, and Child Care & Learning Centre, Ayyalurimetta, Nandyal, Andhra Pradesh, India for their support. We also thank all the students for their contributions to collecting the sign videos and to the successful completion of the ISL-CSLTR corpus.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Natarajan, B., Elakkiya, R. & Prasad, M.L. Sentence2SignGesture: a hybrid neural machine translation network for sign language video generation. J Ambient Intell Human Comput 14, 9807–9821 (2023). https://doi.org/10.1007/s12652-021-03640-9
DOI: https://doi.org/10.1007/s12652-021-03640-9