Abstract
Sign language recognition is regarded as one of the most important and challenging applications of gesture recognition, involving the fields of pattern recognition, machine learning and computer vision. The challenge stems mainly from the complex visual–gestural nature of sign languages and from the small number of databases and studies devoted to automatic recognition. This work presents the development and validation of a public Brazilian sign language (Libras) database. The recording protocol specifies (1) the chosen signs, (2) the signaller characteristics, (3) the sensors and software used for video acquisition, (4) the recording scenario and (5) the data structure. With these steps well defined, a database of more than 1000 videos of 20 Libras signs, performed by twelve different signallers, was created using an RGB-D sensor and an RGB camera. Each sign was recorded five times by each signaller, yielding 1200 samples that comprise (1) RGB video frames, (2) depth, (3) body points and (4) face information. Deep learning models based on 3D and 2D convolutional neural networks were applied to classify the signs, and the best result reached an average accuracy of 93.3%. This paper therefore contributes to the research community a publicly available sign language dataset together with baseline results for comparison.
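As a rough illustration of the classification setup summarised in the abstract, the sketch below shows a minimal 3D convolutional network for isolated-sign clips over 20 classes. It is not the authors' architecture: it assumes PyTorch, and the frame count, input resolution and layer widths are illustrative placeholders rather than values taken from the paper.

```python
# Minimal sketch (not the paper's code): a toy 3D CNN for isolated sign
# classification over 20 classes. Shapes and layer sizes are assumptions.
import torch
import torch.nn as nn


class Small3DCNN(nn.Module):
    def __init__(self, num_classes: int = 20):
        super().__init__()
        self.features = nn.Sequential(
            # Expected input: (batch, 3, 16, 112, 112) = RGB channels, frames, H, W
            nn.Conv3d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),   # pool spatially, keep all frames
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=2),           # pool over time and space
            nn.Conv3d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),               # global spatio-temporal average
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))


if __name__ == "__main__":
    model = Small3DCNN(num_classes=20)             # 20 Libras signs
    clip = torch.randn(2, 3, 16, 112, 112)         # two dummy 16-frame clips
    print(model(clip).shape)                       # torch.Size([2, 20])
```

A 2D-CNN baseline of the kind also mentioned in the abstract would instead apply convolutions per frame and aggregate the frame-level features (for example by averaging) before classification.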

[Figure omitted; caption: Adapted from [50]]
References
Al-Hammadi M, Muhammad G, Abdul W, Alsulaiman M, Hossain MS (2019) Hand gesture recognition using 3D-CNN model. IEEE Consum Electron Mag 9(1):95–101
Al-Rousan M, Assaleh K, Tala’a A (2009) Video-based signer-independent Arabic sign language recognition using hidden Markov models. Appl Soft Comput 9(3):990–999
Almeida SGM (2014) Extração de características em reconhecimento de parâmetros fonológicos da Língua Brasileira de Sinais utilizando sensores RGB-D, Ph.D. thesis. Universidade Federal de Minas Gerais. https://www.ppgee.ufmg.br/defesas/303D.PDF. Accessed 03 Mar 2020. (in Portuguese)
Almeida SGM (2014) Libras-34 Dataset (Kinect v1). Zenodo. https://doi.org/10.5281/zenodo.4451526. Accessed 03 Sept 2020
Almeida SGM, Guimarães FG, Ramírez JA (2014) Feature extraction in Brazilian sign language recognition based on phonological structure and using RGB-D sensors. Expert Syst Appl 41(16):7259–7271
Almeida SGM, Guimarães FG, Ramírez JA (2015) Um método para sumarização de vídeos baseado no problema da diversidade máxima e em algoritmos evolucionários. In: XII Simpósio Brasileiro de Automação Inteligente (SBAI), Natal, Rio Grande do Norte, Brasil, pp 1298–1303 (in Portuguese)
Almeida SGM, Rezende TM, Toffolo ACR, Castro CL (2016) Libras-10 Dataset. https://doi.org/10.5281/zenodo.3229958
Aran O, Akarun L (2010) A multi-class classification strategy for Fisher scores: application to signer independent sign language recognition. Pattern Recogn 43(5):1776–1788
Aran O, Ari I, Akarun L, Sankur B, Benoit A, Caplier A, Campr P, Carrillo AH et al (2009) Signtutor: An interactive system for sign language tutoring. IEEE Multimed 81–93
Assaleh K, Al-Rousan M (2005) Recognition of Arabic sign language alphabet using polynomial classifiers. EURASIP J Adv Signal Process 13:507614
Athira P, Sruthi C, Lijiya A (2019) A signer independent sign language recognition with co-articulation elimination from live videos: an Indian scenario. J King Saud Univ-Comput Inf Sci
Athitsos V, Neidle C, Sclaroff S (2008) American sign language Lexicon video dataset (ASLLVD). http://vlm1.uta.edu/~athitsos/asl_lexicon/. Accessed 24 July 2020
Athitsos V, Neidle C, Sclaroff S, Nash J, Stefan A, Yuan Q, Thangali A (2008) The American sign language Lexicon video dataset. In: 2008 IEEE Computer Society conference on computer vision and pattern recognition workshops. IEEE, pp 1–8
Azar SG, Seyedarabi H (2020) Trajectory-based recognition of dynamic Persian sign language using hidden Markov model. Comput Speech Lang 61:101053
Beena M, Namboodiri A, Thottungal R (2020) Hybrid approaches of convolutional network and support vector machine for American sign language prediction. Multimed Tools Appl 79(5):4027–4040
Ben Tamou A, Ballihi L, Aboutajdine D (2017) Automatic learning of articulated skeletons based on mean of 3D joints for efficient action recognition. Int J Pattern Recogn Artif Intell 31(04):1750008
Bloom V, Argyriou V, Makris D (2016) Hierarchical transfer learning for online recognition of compound actions. Comput Vis Image Underst 144:62–72
Capovilla FC, Raphael WD, Temoteo JG, Martins AC (2017) Dicionário da Língua de Sinais do Brasil: A Libras em suas mãos, Volume I: Sinais de A a D, vol 1, 1st edn. Edusp, São Paulo, Brasil (in Portuguese)
Capovilla FC, Raphael WD, Temoteo JG, Martins AC (2017) Dicionário da Língua de Sinais do Brasil: A Libras em suas mãos, Volume II: Sinais de E a O, vol 2, 1st edn. Edusp, São Paulo, Brasil (in Portuguese)
Capovilla FC, Raphael WD, Temoteo JG, Martins AC (2017) Dicionário da Língua de Sinais do Brasil: A Libras em suas mãos, Volume III: Sinais de P a Z, vol 3, 1st edn. Edusp, São Paulo, Brasil (in Portuguese)
Cardenas EE, Chavez GC (2020) Multimodal hand gesture recognition combining temporal and pose information based on CNN descriptors and histogram of cumulative magnitudes. J Vis Commun Image Represent 102772
Caselli NK, Sehyr ZS, Cohen-Goldberg AM, Emmorey K (2017) ASL-LEX: a lexical database of American sign language. http://asl-lex.org/. Accessed 13 May 2020
Castro GZ, Guerra RR, Assis MM, Rezende TM, Almeida GTB, Almeida SGM, Castro CL, Guimarães FG (2019) Desenvolvimento de uma base de dados de sinais de LIBRAS para aprendizado de máquina: estudo de caso com CNN 3D. In: 14º Simpósio Brasileiro de Automação Inteligente, SBA. https://doi.org/10.17648/sbai-2019-111451 (in Portuguese)
Chadha A, Andreopoulos Y (2019) Improved techniques for adversarial discriminative domain adaptation. IEEE Trans Image Process 29:2622–2637
Chen FS, Fu CM, Huang CL (2003) Hand gesture recognition using a real-time tracking method and hidden Markov models. Image Vis Comput 21(8):745–758
Conly C, Doliotis P, Jangyodsuk P, Alonzo R, Athitsos V (2013) Toward a 3D body part detection video dataset and hand tracking benchmark. In: Proceedings of the 6th international conference on PErvasive technologies related to assistive environments. ACM, p 2
Cui R, Liu H, Zhang C (2019) A deep neural framework for continuous sign language recognition by iterative training. IEEE Trans Multimed 21(7):1880–1891
Cui Y, Weng J (2000) Appearance-based hand sign recognition from intensity image sequences. Comput Vis Image Underst 78(2):157–176
Dorner B, Hagen E (1994) Towards an American sign language interface. In: Integration of natural language and vision processing. Springer, Berlin, pp 143–161
Dreuw P, Rybach D, Deselaers T, Zahedi M, Ney H (2007) RWTH-BOSTON-104 Database. http://www-i6.informatik.rwth-aachen.de/~dreuw/database-rwth-boston-104.php. Accessed 01 April 2020
Elakkiya R, Selvamani K (2017) Extricating manual and non-manual features for subunit level medical sign modelling in automatic sign language classification and recognition. J Med Syst 41(11):175
Elons AS, Abull-ela M, Tolba MF (2013) Neutralizing lighting non-homogeneity and background size in PCNN image signature for Arabic sign language recognition. Neural Comput Appl 22(1):47–53
Escalera S, Gonzàlez J, Baró X, Reyes M, Lopes O, Guyon I, Athitsos V, Escalante H (2013) Multi-modal gesture recognition challenge 2013: dataset and results. In: Proceedings of the 15th ACM on international conference on multimodal interaction. ACM, pp 445–452
Escalera S, Athitsos V, Guyon I (2017) Challenges in multi-modal gesture recognition. In: Gesture recognition. Springer, Berlin, pp 1–60
Escobedo-Cardenas E, Camara-Chavez G (2015) A robust gesture recognition using hand local data and skeleton trajectory. In: 2015 IEEE international conference on image processing (ICIP). IEEE, pp 1240–1244
Fagiani M, Principi E, Squartini S, Piazza F (2012) A new Italian sign language database. In: International conference on brain inspired cognitive systems. Springer, Berlin, pp 164–173
Fagiani M, Principi E, Squartini S, Piazza F (2015) Signer independent isolated Italian sign recognition based on hidden Markov models. Pattern Anal Appl 18(2):385–402
Felipe TA (2009) Libras em Contexto: Curso Básico: Livro do Estudante, 9th edn. WalPrint Gráfica Editora, Rio de Janeiro (in Portuguese)
Filho CFFC, de Souza RS, dos Santos JR, dos Santos BL, Costa MGF (2017) A fully automatic method for recognizing hand configurations of Brazilian sign language. Res Biomed Eng 33(1):78–89. https://doi.org/10.1590/2446-4740.03816
Forster J, Schmidt C, Hoyoux T, Koller O, Zelle U, Piater JH, Ney H (2012) RWTH-PHOENIX-Weather. http://www-i6.informatik.rwth-aachen.de/~forster/database-rwth-phoenix.php. Accessed 13 May 2020
Freitas F, Barbosa F, Peres S (2014) Grammatical facial expressions data set. https://archive.ics.uci.edu/ml/datasets/Grammatical+Facial+Expressions. Accessed 16 Aug 2020
Freitas FA, Peres SM, Lima CAM, Barbosa FV (2014) Grammatical facial expressions recognition with machine learning. In: The Twenty-seventh international flairs conference
Guerra RR, Rezende TM, Guimaraes FG, Almeida SGM (2018) Facial expression analysis in Brazilian sign language for sign recognition. In: Anais do XV Encontro Nacional de Inteligência Artificial e Computacional, SBC, pp 216–227 (in Portuguese)
Guo D, Zhou W, Li A, Li H, Wang M (2019) Hierarchical recurrent deep fusion using adaptive clip summarization for sign language translation. IEEE Trans Image Process 29:1575–1590
Hadfield S, Bowden R (2013) Scene particles: unregularized particle-based scene flow estimation. IEEE Trans Pattern Anal Mach Intell 36(3):564–576
Hasan MM, Misra PK (2011) Brightness factor matching for gesture recognition system using scaled normalization. Int J Comput Sci Inf Technol 3(2):35–46
Hassan M, Assaleh K, Shanableh T (2019) Multiple proposals for continuous Arabic sign language recognition. Sens Imaging 20(1):4
Hisham B, Hamouda A (2019) Supervised learning classifiers for Arabic gestures recognition using Kinect V2. SN Appl Sci 1(7):768
Holden EJ, Lee G, Owens R (2005) Australian sign language recognition. Mach Vis Appl 16(5):312
Honora M, Frizanco MLE (2010) Livro Ilustrado de Língua Brasileira de Sinais: Desvendando a Comunicação Usada Pelas Pessoas com Surdez. Ciranda Cultural, São Paulo (in Portuguese)
Ibrahim NB, Selim MM, Zayed HH (2018) An automatic Arabic sign language recognition system (ArSLRS). J King Saud Univ-Comput Inf Sci 30(4):470–477
Imran J, Raman B (2020) Deep motion templates and extreme learning machine for sign language recognition. Vis Comput 36(6):1233–1246
Infantino I, Rizzo R, Gaglio S (2007) A framework for sign language sentence recognition by commonsense context. IEEE Trans Syst Man Cybern Part C (Appl Rev) 37(5):1034–1039
Jadooki S, Mohamad D, Saba T, Almazyad AS, Rehman A (2017) Fused features mining for depth-based hand gesture recognition to classify blind human communication. Neural Comput Appl 28(11):3285–3294
Júnior PRM, de Souza RM, Werneck RdO, Stein BV, Pazinato DV, de Almeida WR, Penatti OA, Torres RdS, Rocha A (2017) Nearest neighbors distance ratio open-set classifier. Mach Learn 106(3):359–386
Kadous MW (2002) Australian sign language signs (high quality) Data Set. http://archive.ics.uci.edu/ml/datasets/Australian+Sign+Language+signs+(High+Quality). Accessed 19 July 2020
Kakoty NM, Sharma MD (2018) Recognition of sign language alphabets and numbers based on hand kinematics using A Data Glove. Proc Comput Sci 133:55–62
Kapuscinski T, Oszust M, Wysocki M, Warchol D (2015) Recognition of hand gestures observed by depth cameras. Int J Adv Robot Syst 12(4):36
Kawamoto A, Bertolini D, Barreto M (2018) A dataset for electromyography-based dactylology recognition. In: 2018 IEEE International conference on systems, man, and cybernetics (SMC). IEEE, pp 2376–2381
Kelly D, Mc Donald J, Markham C (2010) Weakly supervised training of a sign language recognition system using multiple instance learning density matrices. IEEE Trans Syst Man Cybern Part B (Cybern) 41(2):526–541
Koller O, Zargaran O, Ney H, Bowden R (2016) Deep sign: hybrid CNN-HMM for continuous sign language recognition. In: Proceedings of the British machine vision conference 2016
Kong W, Ranganath S (2014) Towards subject independent continuous sign language recognition: a segment and merge approach. Pattern Recogn 47(3):1294–1308
Kumar DA, Sastry A, Kishore P, Kumar EK (2018) 3D sign language recognition using spatio temporal graph kernels. J King Saud Univ-Comput Inf Sci
Li W (2017) Webpage of Dr Wanqing Li. http://www.uow.edu.au/~wanqing/#MSRAction3DDatasets. Accessed 10 July 2020
Li W, Zhang Z, Liu Z (2010) Action recognition based on a bag of 3D points. In: 2010 IEEE Computer Society conference on computer vision and pattern recognition-workshops. IEEE, pp 9–14
Liao Y, Xiong P, Min W, Min W, Lu J (2019) Dynamic sign language recognition based on video sequence with BLSTM-3D residual networks. IEEE Access 7:38044–38054
Lim KM, Tan AW, Tan SC (2016) A feature covariance matrix with serial particle filter for isolated sign language recognition. Expert Syst Appl 54:208–218
Liu L, Shao L (2013) Learning discriminative representations from RGB-D video data. In: Twenty-third international joint conference on artificial intelligence
Lucas BD, Kanade T (1981) An iterative image registration technique with an application to stereo vision. In: Proceedings DARPA image understanding workshop
Machine Intelligence and Data Science Laboratory (Minds Lab) (2019) Brazilian sign language recognition. http://minds.eng.ufmg.br/project/4. Accessed 04 Sept 2020
Machine Vision Lab (2018) IITR Sign Language Thermal Dataset 2018 (ISLTD2018). https://www.iitr.ac.in/mvlab/documents/ISLTD2018_Download_Form.pdf. Accessed 03 Sept 2020
Masood S, Srivastava A, Thuwal HC, Ahmad M (2018) Real-time sign language gesture (word) recognition from video sequences using CNN and RNN. In: Intelligent engineering informatics. Springer, pp 623–632
MCC Lab (2020) SLR Dataset. http://mccipc.ustc.edu.cn/mediawiki/index.php/SLR_Dataset. Accessed 03 Sept 2020
Mohandes M, Deriche M, Johar U, Ilyas S (2012) A signer-independent Arabic Sign Language recognition system using face detection, geometric features, and a hidden Markov model. Comput Electr Eng 38(2):422–433
Mohandes MA (2013) Recognition of two-handed Arabic signs using the CyberGlove. Arabian J Sci Eng 38(3):669–677
Molchanov P, Gupta S, Kim K, Kautz J (2015) Hand gesture recognition with 3D convolutional neural networks. Proceedings of the IEEE conference on computer vision and pattern recognition workshops, vol 1. IEEE, pp 1–7
Nguyen TD, Ranganath S (2012) Facial expressions in American sign language: tracking and recognition. Pattern Recogn 45(5):1877–1891
Oszust M, Wysocki M (2013) Polish sign language words recognition with Kinect. In: 2013 6th international conference on human system interactions (HSI). IEEE, pp 219–226
Oszust M, Wysocki M (2016) Point clouds corresponding to dynamic gestures registered by Kinect. http://vision.kia.prz.edu.pl/dynamickinect.php. Accessed 13 May 2020
Oszust M, Wysocki M (2016) Point clouds corresponding to dynamic gestures registered by time-of-flight (ToF) camera. http://vision.kia.prz.edu.pl/dynamictof.php. Accessed 15 May 2020
Oz C, Leu MC (2011) American Sign Language word recognition with a sensory glove using artificial neural networks. Eng Appl Artif Intell 24(7):1204–1213
Ozcan T, Basturk A (2019) Transfer learning-based convolutional neural networks with heuristic optimization for hand gesture recognition. Neural Comput Appl 31(12):8955–8970
Raghuveera T, Deepthi R, Mangalashri R, Akshaya R (2020) A depth-based Indian Sign Language recognition using Microsoft Kinect. Sādhanā 45(1):34
Rastgoo R, Kiani K, Escalera S (2020) Hand sign language recognition using multi-view hand skeleton. Expert Syst Appl 150:113336
Ravi S, Suman M, Kishore P, Kumar K, Kumar A et al (2019) Multi modal spatio temporal co-trained CNNs with single modal testing on RGB-D based sign language gesture recognition. J Comput Lang 52:88–102
Rezende TM (2016) Aplicação de Técnicas de Inteligência Computacional para Análise da Expressão Facial em Reconhecimento de Sinais de Libras. Master’s thesis, Universidade Federal de Minas Gerais. https://www.ppgee.ufmg.br/defesas/1393M.PDF. Accessed 03 Sept 2020. (in Portuguese)
Rezende TM, de Castro CL, Moreira SG, Preto CO (2017) Análise da expressão facial em reconhecimento de sinais de Libras. In: VI Simpósio Brasileiro de Automação Inteligente, pp 465–470 (in Portuguese)
Ronchetti F, Quiroga F, Estrebou C, Lanzarini L, Rosete A (2016) LSA64: a dataset for Argentinian sign language. http://facundoq.github.io/unlp/lsa64/index.html. Accessed 03 Aug 2020
Ruffieux S, Lalanne D, Mugellini E, Khaled OA (2014) A survey of datasets for human gesture recognition. In: International conference on human–computer interaction. Springer, Berlin, pp 337–348
Shi J et al (1994) Good features to track. In: 1994 Proceedings of IEEE conference on computer vision and pattern recognition. IEEE, pp 593–600
Simons GF, Fennig CD (2018) Ethnologue: Languages of the World. SIL International, Dallas, Texas. https://www.ethnologue.com/subgroups/sign-language. Accessed 18 Aug 2020
Stokoe WC (1960) Sign language structure: an outline of the visual communication systems of the American deaf. University of Buffalo Press, New York
Tamura S, Kawasaki S (1988) Recognition of sign language motion images. Pattern Recogn 21(4):343–353. https://doi.org/10.1016/0031-3203(88)90048-9
Terven JR, Córdova-Esparza DM (2016) Kin2. A Kinect 2 toolbox for MATLAB. Sci Comput Program 130:97–106
Tolba MF, Samir A, Aboul-Ela M (2013) Arabic sign language continuous sentences recognition using PCNN and graph matching. Neural Comput Appl 23(3–4):999–1010
Tran D, Bourdev LD, Fergus R, Torresani L, Paluri M (2014) C3D: generic features for video analysis. CoRR. arXiv:1412.0767
Tubaiz N, Shanableh T, Assaleh K (2015) Glove-based continuous Arabic sign language recognition in user-dependent mode. IEEE Trans Hum-Mach Syst 45(4):526–533
Vogler C, Goldenstein S (2008) Facial movement analysis in ASL. Univers Access Inf Soc 6(4):363–374
Von Agris U (2008) Database for signer-independent continuous sign language recognition. https://www.phonetik.uni-muenchen.de/forschung/Bas/SIGNUM/. Accessed 13 May 2020
Wadhawan A, Kumar P (2020) Deep learning-based sign language recognition system for static signs. Neural Comput Appl 32(12):7957–7968
Wang H, Chai X, Hong X, Zhao G, Chen X (2016) Isolated sign language recognition with Grassmann covariance matrices. ACM Trans Access Comput (TACCESS) 8(4):1–21
Wang H, Chai X, Chen X (2019) A novel sign language recognition framework using hierarchical Grassmann covariance matrix. IEEE Trans Multimed 21(11):2806–2814
Xia L, Chen CC, Aggarwal JK (2011) Human detection using depth information by Kinect. In: CVPR 2011 workshops. IEEE, pp 15–22
Zahedi M, Keysers D, Deselaers T, Ney H (2005) RWTH-BOSTON-50 Database. https://www-i6.informatik.rwth-aachen.de/web/Software/Databases/Signlanguage/details/rwth-boston-50/index.php. Accessed 13 May 2020
Zhang L, Zhu G, Shen P, Song J, Afaq Shah S, Bennamoun M (2017) Learning spatiotemporal features using 3DCNN and convolutional LSTM for gesture recognition. In: Proceedings of the IEEE international conference on computer vision, pp 3120–3128
Zhao R, Martinez AM (2015) Labeled graph kernel for behavior analysis. IEEE Trans Pattern Anal Mach Intell 38(8):1640–1650
Acknowledgements
The authors would like to thank Marcos Antônio Alves and Aline Xavier Fidêncio for the textual revision, and everyone who participated in the construction of the MINDS-Libras dataset, especially all the people who volunteered their time to perform the signs. This work was partially financed by the Foundation for Research of the State of Minas Gerais [Fundação de Amparo à Pesquisa do Estado de Minas Gerais—FAPEMIG (Grant No. PPM-00587-16)], by the Coordination for the Improvement of Higher Education Personnel (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—CAPES), a Brazilian federal government agency under the Ministry of Education, by the Federal Institute of Minas Gerais (Instituto Federal de Minas Gerais—IFMG), and by the National Council for Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico—CNPq), Brazil, Grants Nos. 306850/2016-8, 167016/2017-2 and 312991/2020-7, and Notice No. 169/2015.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Rezende, T.M., Almeida, S.G.M. & Guimarães, F.G. Development and validation of a Brazilian sign language database for human gesture recognition. Neural Comput & Applic 33, 10449–10467 (2021). https://doi.org/10.1007/s00521-021-05802-4