Abstract
In this paper we present a recommendation system for the (semi-)automatic annotation of sign language videos, which exploits deep learning techniques to recognise handshapes in continuous signing data. The main tools in our approach are the keypoint output of OpenPose and the use of HamNoSys in the sign annotation of the training data. Before applying our method to signed phrases, we tested it on the recognition of handshape, hand location and palm orientation in isolated signs, using two lexical datasets. The system was trained on the Danish Sign Language lexicon and was also applied to POLYTROPON, a lexicon of Greek Sign Language (GSL), for which it achieved satisfactory recognition results. Experiments on the POLYTROPON corpus of GSL phrases confirm that our approach achieves satisfactory accuracy. It can therefore be exploited in a recommendation system for the semi-automatic annotation of isolated signs and signed phrases in large SL video collections, in turn contributing to the development of further datasets for machine learning training.
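The approach above takes OpenPose hand keypoints as its input features. As an illustrative sketch only (not the authors' code), assuming OpenPose's standard hand output of 21 keypoints stored as flat (x, y, confidence) triples, a minimal preprocessing step might normalise each hand to be translation- and scale-invariant before feeding it to a handshape classifier:

```python
import numpy as np

def normalise_hand(flat_keypoints):
    """Turn a flat OpenPose hand keypoint list (21 x (x, y, conf))
    into a translation- and scale-invariant feature vector."""
    pts = np.asarray(flat_keypoints, dtype=float).reshape(21, 3)
    xy, conf = pts[:, :2], pts[:, 2]
    xy = xy - xy[0]                       # wrist-relative coordinates
    scale = np.linalg.norm(xy, axis=1).max()
    if scale > 0:
        xy = xy / scale                   # farthest point from wrist = 1
    return np.concatenate([xy.ravel(), conf])

# Synthetic frame: wrist at (100, 200), remaining points spread out.
frame = []
for i in range(21):
    frame += [100.0 + 5.0 * i, 200.0 - 3.0 * i, 0.9]
features = normalise_hand(frame)
print(features.shape)   # (63,)
```

The wrist-relative, scale-normalised encoding is a common hypothetical choice here; it removes the signer's position and distance from the camera, which would otherwise dominate the raw pixel coordinates.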
Acknowledgement
This work was supported by the project “Computational Science and Technologies: Data, Content and Interaction” (MIS 5002437), implemented under the Action “Reinforcement of the Research and Innovation Infrastructure”, funded by the Operational Programme “Competitiveness, Entrepreneurship and Innovation” (NSRF 2014–2020) and co-financed by Greece and the European Union (European Regional Development Fund). We are deeply grateful to the Center for Tegnsprog for providing the Danish Sign Language dataset used in this work.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary file1 (MP4 2421 kb)
Cite this article
Koulierakis, I., Siolas, G., Efthimiou, E. et al. Sign boundary and hand articulation feature recognition in Sign Language videos. Machine Translation 35, 323–343 (2021). https://doi.org/10.1007/s10590-021-09271-3