3D gesture segmentation for word-level Arabic sign language using large-scale RGB video sequences and autoencoder convolutional networks

Original Paper
Signal, Image and Video Processing

Abstract

Sign languages use hand gestures, body movements, and facial expressions to convey a message. Developing a communication environment for the deaf community is a social and economic necessity. Research on gesture segmentation has sought methods capable of identifying a given sequence of signs and understanding their meaning; however, the variety of hand shapes and the complexity of gestures remain a challenge. In this paper, we propose a novel model, the 3D gesture segmentation network (3D GS-Net), for word-level Arabic sign language (ArSL) from video sequences using a small number of features. To process and analyze the frame sequences efficiently, the dataset is annotated and normalized. During training, the preprocessed data are fed into 3D GS-Net, an autoencoder convolutional architecture designed as a two-branch network whose branches are merged at the final layer to produce the predictive segmentation output. The proposed 3D GS-Net is evaluated on RGB videos from the Moroccan sign language (MoSL) dataset. Compared with existing approaches across several evaluation metrics, the results demonstrate the effectiveness and efficiency of 3D GS-Net for gesture segmentation.
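
The full paper is paywalled, so the sketch below is only a plausible reading of the architecture the abstract describes: two 3D convolutional autoencoder branches processed in parallel and merged at the final layer into a per-voxel segmentation prediction. Everything not stated in the abstract is an assumption, including the layer widths, the branches sharing the same RGB input, the concatenation merge, and the binary gesture/background class count; the names `Branch3D` and `GSNet3DSketch` are hypothetical.

```python
# Minimal sketch (PyTorch) of a two-branch 3D convolutional autoencoder
# merged at the final layer, loosely following the abstract. All sizes,
# the shared input, and the merge strategy are assumptions, not the
# authors' published 3D GS-Net.
import torch
import torch.nn as nn


class Branch3D(nn.Module):
    """One encoder-decoder (autoencoder) branch over an RGB frame sequence."""

    def __init__(self, in_channels: int = 3, base: int = 16):
        super().__init__()
        # Encoder halves the temporal and spatial dimensions twice.
        self.encoder = nn.Sequential(
            nn.Conv3d(in_channels, base, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(base, base * 2, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # Decoder restores the input resolution with transposed convolutions.
        self.decoder = nn.Sequential(
            nn.ConvTranspose3d(base * 2, base, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose3d(base, base, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))


class GSNet3DSketch(nn.Module):
    """Two parallel branches merged only at the final prediction layer."""

    def __init__(self, num_classes: int = 2, base: int = 16):
        super().__init__()
        self.branch_a = Branch3D(base=base)
        self.branch_b = Branch3D(base=base)
        # Merge by channel concatenation, then a 1x1x1 convolution maps the
        # fused features to per-voxel segmentation logits (assumed binary:
        # gesture vs. background).
        self.head = nn.Conv3d(base * 2, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        merged = torch.cat([self.branch_a(x), self.branch_b(x)], dim=1)
        return self.head(merged)


if __name__ == "__main__":
    # A batch of 2 clips: 3 RGB channels, 16 frames, 64x64 pixels each.
    clip = torch.randn(2, 3, 16, 64, 64)
    logits = GSNet3DSketch()(clip)
    print(logits.shape)  # torch.Size([2, 2, 16, 64, 64])
```

Merging only at the final layer, rather than fusing intermediate features, keeps the two branches' representations independent until prediction, which is consistent with the abstract's description of a two-branch network merged at the final layer.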

Author information

Corresponding author

Correspondence to Abdelbasset Boukdir.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Boukdir, A., Benaddy, M., Ellahyani, A. et al. 3D gesture segmentation for word-level Arabic sign language using large-scale RGB video sequences and autoencoder convolutional networks. SIViP 16, 2055–2062 (2022). https://doi.org/10.1007/s11760-022-02167-6
