Empowering deaf communication: a novel LSTM model for recognizing Indonesian sign language

Caraka, Rezzy Eko; Supardi, Khairunnisa; Kurniawan, Robert; Kim, Yunho; Gio, Prana Ugiana; Yuniarto, Budi; Mubarok, Faiq Zakki; Pardamean, Bens

doi:10.1007/s10209-024-01095-1

Long Paper
Published: 11 March 2024

Volume 24, pages 771–783, (2025)
Universal Access in the Information Society

Rezzy Eko Caraka¹,²,³,
Khairunnisa Supardi⁴,
Robert Kurniawan⁵,
Yunho Kim³,
Prana Ugiana Gio⁶,
Budi Yuniarto⁵,
Faiq Zakki Mubarok⁵ &
Bens Pardamean⁷,⁸

Abstract

Sign language plays a pivotal role in communication for the deaf community, bridging the gap with the broader society. Mastering it is nevertheless difficult because of the intricate nuances of body movements, hand gestures, and facial expressions. Sign language recognition technology aims to enable clear communication between deaf individuals and the wider community, thereby reducing the risk of miscommunication. This study addresses these challenges by recognizing Indonesian Sign Language (BISINDO) with a skeleton-based method that uses MediaPipe to extract hand and pose key points from sign language videos. The core of our approach is a long short-term memory (LSTM) model, which shows strong promise in accurately interpreting BISINDO. The proposed LSTM architecture reaches a validation accuracy of 92.857% and surpasses previously proposed LSTM models in both accuracy and computational efficiency. This advance brings us closer to bridging the communication gap between the deaf community and the broader population.
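To make the pipeline in the abstract concrete, the following is a minimal sketch of how a sequence of per-frame key points could be passed through a single LSTM layer and a softmax classifier. It is not the authors' architecture: the feature layout (two hands of 21 landmarks with x, y, z plus 33 pose landmarks with x, y, z, visibility), the clip length, the hidden size, the number of sign classes, and the name `TinyLSTMClassifier` are all assumptions chosen for illustration. The weights are random and untrained, and random numbers stand in for real MediaPipe output.

```python
import math
import random

# Assumed feature layout (not taken from the paper).
N_FEATURES = 2 * 21 * 3 + 33 * 4   # 258 values per frame
N_FRAMES = 30                      # hypothetical clip length
N_CLASSES = 10                     # hypothetical number of signs


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTMClassifier:
    """One LSTM layer followed by a softmax layer (random, untrained weights)."""

    def __init__(self, n_features, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One weight matrix per gate, applied to the concatenation [x_t, h_{t-1}].
        # Gates: i = input, f = forget, o = output, g = candidate cell state.
        self.W = {k: init_matrix(n_hidden, n_features + n_hidden, rng) for k in "ifog"}
        self.b = {k: [0.0] * n_hidden for k in "ifog"}
        self.W_out = init_matrix(n_classes, n_hidden, rng)
        self.b_out = [0.0] * n_classes

    def step(self, x, h, c):
        """Advance the LSTM by one frame and return the new (h, c)."""
        z = x + h  # list concatenation of input and previous hidden state
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])] for k in "ifog"}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, sequence):
        """Run all frames through the LSTM and classify the final hidden state."""
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        for frame in sequence:
            h, c = self.step(list(frame), h, c)
        logits = [a + b for a, b in zip(matvec(self.W_out, h), self.b_out)]
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(v - m) for v in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Stand-in for one clip of extracted key points: N_FRAMES frames of N_FEATURES values.
rng = random.Random(1)
clip = [[rng.random() for _ in range(N_FEATURES)] for _ in range(N_FRAMES)]

model = TinyLSTMClassifier(N_FEATURES, n_hidden=16, n_classes=N_CLASSES)
probs = model.predict_proba(clip)
predicted = max(range(N_CLASSES), key=probs.__getitem__)
```

In practice the key points would come from a pose and hand tracker run on each video frame, and the weights would be learned from labelled clips with a deep learning framework; the sketch only shows the shape of the data flow from key-point sequence to class probabilities.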

Data availability

The source code, materials, and data supporting the findings of this study are openly available on GitHub: https://github.com/Rezzy94.

Acknowledgements

Rezzy Eko Caraka is partially supported by the National Research Foundation of Korea (NRF-2023R1A2C1006845). Yunho Kim acknowledges support from the National Research Foundation of Korea (grants NRF-2022R1A5A1033624 and NRF-2023R1A2C1006845).

Funding

The work presented in this paper was funded by the National Research Foundation of Korea (grants NRF-2022R1A5A1033624 and NRF-2023R1A2C1006845).

Author information

Authors and Affiliations

1. Research Center for Data and Information Sciences, Research Organization for Electronics and Informatics, National Research and Innovation Agency (BRIN), Bandung, West Java, 40135, Indonesia
   Rezzy Eko Caraka
2. School of Economics and Business, Telkom University, Bandung, 40257, Indonesia
   Rezzy Eko Caraka
3. Department of Mathematical Sciences, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
   Rezzy Eko Caraka & Yunho Kim
4. Muhammad Sani Karimun Regional Public Hospital, Tanjung Balai Karimun, Riau Island, 29663, Indonesia
   Khairunnisa Supardi
5. Statistical Computing Department, Polytechnic of Statistics - STIS, Jakarta, 13330, Indonesia
   Robert Kurniawan, Budi Yuniarto & Faiq Zakki Mubarok
6. Department of Mathematics, Universitas Sumatera Utara, Medan Baru, Medan, North Sumatra, 20155, Indonesia
   Prana Ugiana Gio
7. Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, 11530, Indonesia
   Bens Pardamean
8. Computer Science Department, BINUS Graduate Program - Master of Computer Science Program, Bina Nusantara University, Jakarta, 11530, Indonesia
   Bens Pardamean

Contributions

REC conceived the research and constructed the experimental design. REC, YK, and BP managed the project. REC, FZM, and RK analyzed the data. REC participated in the verification and interpretation of data. REC, KS, FZM, and RK wrote the manuscript. REC, KS, RK, YK, PUG, BY, FZM, and BP read and approved the final manuscript.

Corresponding authors

Correspondence to Rezzy Eko Caraka or Yunho Kim.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethical approval

Not applicable.

Consent for publication

The participants have consented to the submission of the case report to the journal.

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Caraka, R.E., Supardi, K., Kurniawan, R. et al. Empowering deaf communication: a novel LSTM model for recognizing Indonesian sign language. Univ Access Inf Soc 24, 771–783 (2025). https://doi.org/10.1007/s10209-024-01095-1

Accepted: 02 February 2024
Published: 11 March 2024
Issue Date: March 2025
DOI: https://doi.org/10.1007/s10209-024-01095-1

Part of a collection: New trends in the design and evaluation of accessible human-computer interfaces
