Video-Based Vietnamese Sign Language Recognition Using Local Descriptors

Vo, Anh H.; Nguyen, Nhu T. Q.; Nguyen, Ngan T. B.; Pham, Van-Huy; Van Giap, Ta; Nguyen, Bao T.

doi:10.1007/978-3-030-14802-7_59

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11432)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Sign language is one of the methods of non-verbal communication. It is most commonly used by deaf people and people with hearing or speech impairments to communicate among themselves or with hearing people. Vietnamese Sign Language (VSL) is the sign language used in the community of Vietnamese hearing-impaired individuals. VSL recognition aims to develop algorithms and methods that correctly identify a sequence of produced signs and interpret their meaning in Vietnamese. However, automatic VSL recognition in video faces many challenges, such as camera orientation, hand position and movement, and the relation between the hands. In this paper, we present several feature extraction approaches for VSL recognition, including spatial features, scene-based features, and especially motion-based features. Instead of relying on a static image, we specifically capture motion information between frames in a video sequence. We evaluated the proposed framework on our own VSL dataset, acquired with a 2D camera, which covers 23 letters, 3 diacritic marks, and 5 tones of the Vietnamese language. Additionally, to gain more information about hand movement and hand position, we also applied data augmentation. All of this information contributes to an effective VSL recognition system. The experiments achieved satisfactory results, reaching 86.61%. This indicates that data augmentation provides more information about the orientation of the hand. Moreover, combining spatial, scene, and especially motion information lets the system capture information from both single frames and multiple frames, and thus can improve the performance of the VSL recognition system.
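
As a rough illustration of capturing motion information between frames with a local descriptor, the sketch below computes a basic 3x3 local binary pattern (LBP) histogram over absolute frame differences and concatenates it with the LBP histogram of a single frame. The abstract does not name the descriptors or the fusion scheme, so the choice of LBP, the frame differencing, the use of the middle frame, the concatenation, and all function names are assumptions made for this example rather than the authors' pipeline. Scene-based features and data augmentation are left out. The code uses plain Python lists so that it runs without dependencies; a real system would more likely use an array library.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of 0..255 intensities

# Clockwise 8-neighbourhood offsets (row, col), starting at the top-left pixel.
_NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]


def frame_difference(prev: Frame, curr: Frame) -> Frame:
    """Absolute per-pixel difference between two consecutive frames."""
    return [[abs(c - p) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def lbp_histogram(image: Frame) -> List[float]:
    """Normalised 256-bin histogram of basic 3x3 LBP codes (interior pixels)."""
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            # Set one bit per neighbour that is at least as bright as the centre.
            for bit, (dr, dc) in enumerate(_NEIGHBOURS):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


def motion_descriptor(frames: List[Frame]) -> List[float]:
    """Average the LBP histograms of all consecutive frame differences."""
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure motion")
    hists = [lbp_histogram(frame_difference(a, b))
             for a, b in zip(frames, frames[1:])]
    return [sum(bin_values) / len(hists) for bin_values in zip(*hists)]


def video_descriptor(frames: List[Frame]) -> List[float]:
    """Assumed fusion: spatial LBP of the middle frame + motion descriptor."""
    spatial = lbp_histogram(frames[len(frames) // 2])
    return spatial + motion_descriptor(frames)
```

Calling `video_descriptor(frames)` on a list of equally sized grayscale frames yields a 512-dimensional vector (256 spatial bins followed by 256 motion bins) that could be passed to any standard classifier. The classifier itself is outside the scope of this sketch.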

Acknowledgment

The authors would like to thank the teachers of deaf people in Binh Duong province, Vietnam. We also acknowledge the support of the students at Ton Duc Thang University.

Author information

Authors and Affiliations

Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Anh H. Vo, Nhu T. Q. Nguyen, Ngan T. B. Nguyen & Van-Huy Pham
Can Tho Medical College, Can Tho City, Vietnam
Ta Van Giap
University of Education and Technology, Ho Chi Minh, Vietnam
Bao T. Nguyen

Corresponding authors

Correspondence to Van-Huy Pham or Bao T. Nguyen.

Editor information

Editors and Affiliations

Ton Duc Thang University, Ho Chi Minh City, Vietnam
Ngoc Thanh Nguyen
Bina Nusantara University, Jakarta, Indonesia
Ford Lumban Gaol
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Wrocław University of Science and Technology, Wrocław, Poland
Bogdan Trawiński

About this paper

Cite this paper

Vo, A.H., Nguyen, N.T.Q., Nguyen, N.T.B., Pham, VH., Van Giap, T., Nguyen, B.T. (2019). Video-Based Vietnamese Sign Language Recognition Using Local Descriptors. In: Nguyen, N., Gaol, F., Hong, TP., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2019. Lecture Notes in Computer Science, vol 11432. Springer, Cham. https://doi.org/10.1007/978-3-030-14802-7_59

DOI: https://doi.org/10.1007/978-3-030-14802-7_59
Published: 07 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14801-0
Online ISBN: 978-3-030-14802-7
eBook Packages: Computer Science, Computer Science (R0)
