Multi-sensor data fusion for sign language recognition based on dynamic Bayesian network and convolutional neural network

Xiao, Qinkun; Zhao, Yidan; Huan, Wang

doi:10.1007/s11042-018-6939-8

Published: 30 November 2018

Volume 78, pages 15335–15352, (2019)

Multimedia Tools and Applications

Qinkun Xiao¹,
Yidan Zhao¹ &
Wang Huan¹

Abstract

A new multi-sensor fusion framework for Sign Language Recognition (SLR) is proposed, based on a Convolutional Neural Network (CNN) and a Dynamic Bayesian Network (DBN). In this framework, a Microsoft Kinect, a low-cost RGB-D sensor, serves as the Human-Computer Interaction (HCI) device. First, color and depth videos are collected with the Kinect. Next, features are extracted from all image sequences using the CNN. The color and depth feature sequences are then input into the DBN as observation data. Based on graph model fusion, the maximum recognition rate of dynamic isolated sign language is calculated. The proposed DBN + CNN SLR framework is tested on our dataset, where the highest recognition rate reaches 99.40%. The test results show that our approach is effective.
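The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify the DBN topology, so the code below assumes the simplest two-stream case: one hidden state chain per sign, with the color and depth feature vectors treated as conditionally independent observations given the state. The unit-variance Gaussian emissions, the toy feature vectors standing in for CNN outputs, and all names (`fused_log_likelihood`, `recognize`, `sign_A`, `sign_B`) are hypothetical and are not taken from the paper.

```python
import math


def log_gauss(x, mean):
    """Log density of a unit-variance diagonal Gaussian (assumed emission model)."""
    sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return -0.5 * (sq + len(x) * math.log(2 * math.pi))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v)))."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def fused_log_likelihood(color_seq, depth_seq, model):
    """Forward algorithm over one hidden chain with two observation streams.

    Assumes all prior and transition probabilities are strictly positive.
    """
    if len(color_seq) != len(depth_seq):
        raise ValueError("color and depth sequences must have the same length")
    prior, trans = model["prior"], model["trans"]
    n = len(prior)

    def emit(t, s):
        # Fusion: the two streams contribute independent log-likelihood terms.
        return (log_gauss(color_seq[t], model["color_means"][s])
                + log_gauss(depth_seq[t], model["depth_means"][s]))

    alpha = [math.log(prior[s]) + emit(0, s) for s in range(n)]
    for t in range(1, len(color_seq)):
        alpha = [
            logsumexp([alpha[r] + math.log(trans[r][s]) for r in range(n)])
            + emit(t, s)
            for s in range(n)
        ]
    return logsumexp(alpha)


def recognize(color_seq, depth_seq, models):
    """Return the sign whose model gives the highest fused log-likelihood."""
    scores = {label: fused_log_likelihood(color_seq, depth_seq, m)
              for label, m in models.items()}
    return max(scores, key=scores.get), scores


# Toy models for two hypothetical signs (2 hidden states each).
shared = {"prior": [0.9, 0.1], "trans": [[0.7, 0.3], [0.1, 0.9]]}
models = {
    "sign_A": dict(shared, color_means=[[0.0, 0.0], [1.0, 1.0]],
                   depth_means=[[0.0], [1.0]]),
    "sign_B": dict(shared, color_means=[[3.0, 3.0], [4.0, 4.0]],
                   depth_means=[[3.0], [4.0]]),
}

# Toy per-frame features standing in for CNN outputs.
color_seq = [[0.1, -0.1], [0.2, 0.1], [0.9, 1.1]]
depth_seq = [[0.0], [0.3], [1.0]]

label, scores = recognize(color_seq, depth_seq, models)
print(label)
```

In this sketch, recognizing an isolated sign amounts to scoring the paired feature sequences under each sign's model and taking the best-scoring label. A real system would learn the model parameters from training data and would likely use a richer graph structure than the single shared chain assumed here.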

Acknowledgements

This work was supported by the Natural Science Foundation of China (Nos. 60972095, 61271362, 61671362) and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2017JM6041).

Author information

Authors and Affiliations

Department of Electronics Information and Engineering, Xi’an Technological University, Xi’an City, People’s Republic of China, 710032
Qinkun Xiao, Yidan Zhao & Wang Huan

Corresponding author

Correspondence to Qinkun Xiao.

Ethics declarations

Conflicts of interest

Qinkun Xiao declares that he has no conflicts of interest.

Yidan Zhao declares that she has no conflicts of interest.

Wang Huan declares that she has no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiao, Q., Zhao, Y. & Huan, W. Multi-sensor data fusion for sign language recognition based on dynamic Bayesian network and convolutional neural network. Multimed Tools Appl 78, 15335–15352 (2019). https://doi.org/10.1007/s11042-018-6939-8

Received: 28 June 2018
Revised: 14 November 2018
Accepted: 23 November 2018
Published: 30 November 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s11042-018-6939-8
