
An Efficient Sign Language Recognition (SLR) System Using Camshift Tracker and Hidden Markov Model (HMM)

  • Original Research
  • Published in SN Computer Science

Abstract

An efficient Sign Language Recognition (SLR) system can facilitate communication with hearing-impaired persons by identifying their sign gestures. As with regional spoken languages, different regions have developed their own sign gesture representations (for example, American Sign Language (ASL), German Sign Language (GSL), and Indian Sign Language (ISL)). Such variations in hand shapes and movements add many challenges to the recognition process. The overall SLR process can be divided into a number of modules, such as hand and face detection, hand tracking, feature extraction, and gesture recognition. In this paper, we propose a novel end-to-end SLR system that operates on RGB video sequences. After detecting skin-colored regions in the video frames, a Camshift tracker is employed to extract the trajectories of hand motion. Next, Hidden Markov Model (HMM) based sequence classification is used to recognize the gestures. A novel approach for distinguishing single-hand from double-hand gestures is also proposed. Furthermore, new features derived from the skin regions and hand trajectories are proposed to improve gesture classification performance. We tested our system on the dataset of the American Sign Language Linguistic Research Project (ASLLRP) [25], which consists of isolated signs. The experimental results are encouraging.
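The pipeline summarized above (skin segmentation, Camshift tracking of the hand, and HMM-based classification of the resulting trajectories) can be prototyped with off-the-shelf libraries. The sketch below uses OpenCV and hmmlearn and is only an illustration of the general technique, not the authors' implementation: the HSV skin thresholds, the five-state Gaussian HMMs, the caller-supplied initial hand window, and the use of raw (x, y) hand-centre coordinates as the only features are assumptions made for the example.

import cv2
import numpy as np
from hmmlearn import hmm  # pip install hmmlearn

# 1. Skin segmentation. The HSV thresholds below are illustrative
#    assumptions, not the values used in the paper.
def skin_mask(frame_bgr):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 40, 60], dtype=np.uint8)
    upper = np.array([25, 180, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

# 2. Camshift tracking: follow the hand window frame by frame and
#    record the centre of the tracked region as the motion trajectory.
def hand_trajectory(video_path, init_window):
    cap = cv2.VideoCapture(video_path)
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)
    window = init_window                     # (x, y, w, h) around the hand
    points = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        prob = skin_mask(frame)              # skin mask as probability image
        box, window = cv2.CamShift(prob, window, term)
        (cx, cy), _, _ = box                 # centre of the rotated rectangle
        points.append((cx, cy))
    cap.release()
    return np.asarray(points, dtype=np.float64)

# 3. One Gaussian HMM per gesture class, trained on that class's trajectories;
#    a test sequence is assigned to the class with the highest log-likelihood.
def train_models(train_data, n_states=5):
    models = {}
    for label, seqs in train_data.items():   # train_data: {label: [traj, ...]}
        X = np.concatenate(seqs)
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=50)
        m.fit(X, lengths=[len(s) for s in seqs])
        models[label] = m
    return models

def classify(models, trajectory):
    return max(models, key=lambda label: models[label].score(trajectory))

Scoring one generative HMM per class and picking the highest log-likelihood is the standard formulation for isolated-sign classification; richer features (hand shape, orientation, velocity) would slot into the same interface by widening the per-frame feature vector.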




References

  1. Ackovska N, Kostoska M. Sign language tutor—rebuilding and optimizing. In: Information and communication technology, electronics and microelectronics (MIPRO), 2014 37th international convention on. IEEE; 2014. p. 704–709.

  2. Ahmed SA, Dogra DP, Kar S, Kim B-G, Hill P, Bhaskar H. Localization of region of interest in surveillance scene. Multimedia Tools Appl. 2017;76(11):13651–80.


  3. Athitsos V, Neidle C, Sclaroff S, Nash J, Stefan A, Thangali A, Wang H, Yuan Q. Large lexicon project: American Sign Language video corpus and sign language indexing/retrieval algorithms. In: Workshop on the representation and processing of sign languages: corpora and sign language technologies. 2010. p. 11–14.

  4. Bradski GR. Real time face and object tracking as a component of a perceptual user interface. In: Applications of Computer Vision. 1998. p. 214–219.

  5. Cui R, Liu H, Zhang C. A deep neural framework for continuous sign language recognition by iterative training. IEEE Trans Multimedia. 2019;21(7):1880–91.


  6. Dilsizian M, Yanovich P, Wang S, Neidle C, Metaxas DN. A new framework for sign language recognition based on 3D handshape identification and linguistic modeling. In: Language Resource Evaluation Conference. 2014. p. 1924–1929.

  7. Fagiani M, Principi E, Squartini S, Piazza F. A new Italian sign language database. In: International conference on brain inspired cognitive systems. 2012. p. 164–173.

  8. Fagiani M, Principi E, Squartini S, Piazza F. Signer independent isolated Italian sign recognition based on hidden Markov models. Pattern Anal Appl. 2015;18(2):385–402.

  9. Hsu R-L, Abdel-Mottaleb M, Jain AK. Face detection in color images. IEEE Trans Pattern Anal Mach Intell. 2002;24(5):696–706.


  10. Jones MJ, Rehg JM. Statistical color models with application to skin detection. Int J Comput Vis. 2002;46(1):81–96.


  11. KaewTraKulPong P, Bowden R. An improved adaptive background mixture model for real-time tracking with shadow detection. In: Video-based surveillance systems. Springer; 2002. p. 135–144.

  12. Kong W, Ranganath S. Towards subject independent continuous sign language recognition: a segment and merge approach. Pattern Recognit. 2014;47(3):1294–308.


  13. Kovač J, Peer P, Solina F. Human skin color clustering for face detection, vol. 2. New York: IEEE; 2003.


  14. Kumar P, Gauba H, Roy PP, Dogra DP. Coupled HMM-based multi-sensor data fusion for sign language recognition. Pattern Recognit Lett. 2016.

  15. Kumar P, Gauba H, Roy PP, Dogra DP. A multimodal framework for sensor-based sign language recognition. Neurocomputing. 2017.

  16. Kumar P, Roy PP, Dogra DP. Independent Bayesian classifier combination based sign language recognition using facial expression. Inform Sci. 2018;428:30–48.

  17. Kumar P, Saini R, Behera SK, Dogra DP, Roy PP. Real-time recognition of sign language gestures and air-writing using Leap Motion. In: 2017 fifteenth IAPR international conference on machine vision applications (MVA). IEEE; 2017. p. 157–160.

  18. Kumar P, Saini R, Roy PP, Dogra DP. Study of text segmentation and recognition using Leap Motion sensor. IEEE Sens J. 2016;17(5):1293–301.

  19. Kumar P, Saini R, Roy PP, Dogra DP. 3D text segmentation and recognition using Leap Motion. Multimedia Tools Appl. 2017;76(15):16491–510.

  20. Kumar P, Saini R, Roy PP, Dogra DP. A position and rotation invariant framework for sign language recognition (SLR) using Kinect. Multimedia Tools Appl. 2018;77(7):8823–46.

  21. Le THN, Quach KG, Zhu C, Duong CN, Luu K, Savvides M. Robust hand detection and classification in vehicles and in the wild. In: 2017 IEEE conference on computer vision and pattern recognition workshops (CVPRW). IEEE; 2017. p. 1203–1210.

  22. Lee JY, Yoo SI. An elliptical boundary model for skin color detection. In: International conference on imaging science, systems, and technology. 2002.

  23. Mittal A, Kumar P, Roy PP, Balasubramanian R, Chaudhuri BB. A modified LSTM model for continuous sign language recognition using Leap Motion. IEEE Sens J. 2019;19(16):7056–63.

  24. Natarajan P, Nevatia R. Hierarchical multi-channel hidden semi-Markov graphical models for activity recognition. Comput Vis Image Understanding. 2013;117(10):1329–44.

  25. Neidle C, Vogler C. A new web interface to facilitate access to corpora: development of the ASLLRP Data Access Interface (DAI). In: 5th workshop on the representation and processing of sign languages: interactions between corpus and lexicon, Language Resource Evaluation Conference. 2012.

  26. Niu Z, Mak B. Stochastic fine-grained labeling of multi-state sign glosses for continuous sign language recognition. In: European conference on computer vision. Springer; 2020. p. 172–186.

  27. Ong E-J, Koller O, Pugeault N, Bowden R. Sign spotting using hierarchical sequential patterns with temporal intervals. In: Conference on computer vision and pattern recognition. 2014. p. 1923–1930.

  28. Pu J, Zhou W, Li H. Iterative alignment network for continuous sign language recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2019. p. 4165–4174.

  29. Qian C, Sun X, Wei Y, Tang X, Sun J. Realtime and robust hand tracking from depth. In: Conference on computer vision and pattern recognition. 2014. p. 1106–1113.

  30. Roy K, Mohanty A, Sahay RR. Deep learning based hand detection in cluttered environment using skin segmentation. In: Proceedings of the IEEE international conference on computer vision workshops. 2017. p. 640–649.

  31. Saini R, Kumar P, Roy PP, Dogra DP. A novel framework of continuous human-activity recognition using Kinect. Neurocomputing. 2018;311:99–111.

  32. Sigal L, Sclaroff S, Athitsos V. Skin color-based video segmentation under time-varying illumination. IEEE Trans Pattern Anal Mach Intell. 2004;26(7):862–77.


  33. Sridhar S, Oulasvirta A, Theobalt C. Interactive markerless articulated hand motion tracking using RGB and depth data. In: International conference on computer vision. 2013. p. 2456–2463.

  34. Sun C, Zhang T, Bao B-K, Xu C, Mei T. Discriminative exemplar coding for sign language recognition with Kinect. IEEE Trans Cybernet. 2013;43(5):1418–28.

  35. Trinh H, Fan Q, Gabbur P, Pankanti S. Hand tracking by binary quadratic programming and its application to retail activity recognition. In: Computer Vision and Pattern Recognition. IEEE; 2012. p. 1902–1909.

  36. Vezhnevets V, Sazonov V, Andreeva A. A survey on pixel-based skin color detection techniques. In: Proc. Graphicon, volume 3. 2003. p. 85–92.

  37. Yang M-H, Ahuja N. Gaussian mixture model for human skin color and its applications in image and video databases. In: Electronic Imaging’99. International Society for Optics and Photonics; 1998. p. 458–466.

  38. Yang R, Sarkar S. Coupled grouping and matching for sign and gesture recognition. Comput Vis Image Understanding. 2009;113(6):663–81.


  39. Yoon H-S, Soh J, Bae YJ, Yang HS. Hand gesture recognition using combined features of location, angle and velocity. Pattern Recognit. 2001;34(7):1491–501.


  40. Zarit BD, Super BJ, Quek FK. Comparison of five color models in skin pixel classification. In: International workshop on recognition, analysis, and tracking of faces and gestures in real-time systems. 1999. p. 58–63.

  41. Zivkovic Z. Improved adaptive Gaussian mixture model for background subtraction. In: 17th International conference on pattern recognition, volume 2. 2004. p. 28–31.

  42. Kumar P. Sign Language Recognition using Depth Sensors. Indian Institute of Technology Roorkee.


Acknowledgements

The authors would like to thank Mr. Utkarsh Agrawal, IIT Roorkee, for developing and testing a few modules of the system.

Author information


Corresponding author

Correspondence to Pradeep Kumar.

Ethics declarations

Conflict of Interest

All authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Funding

No funding was received from any source.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Roy, P.P., Kumar, P. & Kim, BG. An Efficient Sign Language Recognition (SLR) System Using Camshift Tracker and Hidden Markov Model (HMM). SN COMPUT. SCI. 2, 79 (2021). https://doi.org/10.1007/s42979-021-00485-z


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42979-021-00485-z

Keywords
