Multi-sensor data fusion for sign language recognition based on dynamic Bayesian network and convolutional neural network

Xiao, Qinkun; Zhao, Yidan; Huan, Wang

doi:10.1007/s11042-018-6939-8

Published: 30 November 2018

Volume 78, pages 15335–15352 (2019)

Multimedia Tools and Applications

Abstract

A new multi-sensor fusion framework for Sign Language Recognition (SLR) is proposed, based on a Convolutional Neural Network (CNN) and a Dynamic Bayesian Network (DBN). In this framework, a Microsoft Kinect, a low-cost RGB-D sensor, serves as the Human-Computer Interaction (HCI) device. First, color and depth videos are collected with the Kinect; next, features are extracted from all image sequences using the CNN. The color and depth feature sequences are then input to the DBN as observation data. Based on graph-model fusion, the maximum recognition rate for dynamic isolated sign language is calculated. The proposed DBN + CNN SLR framework is tested on our own dataset, where the highest recognition rate reaches 99.40%. The test results show that our approach is effective.
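
The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify the CNN architecture or the DBN topology, so everything here is an assumption made for illustration: random vectors stand in for CNN features, each sign is modeled by a very simple DBN (one shared hidden state chain with two conditionally independent Gaussian observation streams, one for color and one for depth), and all names such as `make_toy_model` and `classify` are hypothetical. A test sequence is assigned to the sign whose model gives the highest log-likelihood.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Diagonal-Gaussian log-density. x: (T, D); mean, var: (S, D) -> (T, S)."""
    diff = x[:, None, :] - mean[None, :, :]
    return -0.5 * np.sum(diff ** 2 / var + np.log(2 * np.pi * var), axis=2)

def sequence_log_likelihood(color, depth, model):
    """Forward algorithm over one hidden chain observed through two streams."""
    # Fusion: the streams are assumed independent given the hidden state,
    # so their per-frame log-likelihoods add.
    log_b = (log_gauss(color, model["color_mean"], model["color_var"])
             + log_gauss(depth, model["depth_mean"], model["depth_var"]))
    log_a = np.log(model["trans"])  # assumes strictly positive transitions
    alpha = np.log(model["prior"]) + log_b[0]
    for t in range(1, len(log_b)):
        scores = alpha[:, None] + log_a          # (previous state, next state)
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0)) + log_b[t]
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())

def classify(color, depth, models):
    """Return the best-scoring sign label and all per-sign log-likelihoods."""
    scores = {name: sequence_log_likelihood(color, depth, mdl)
              for name, mdl in models.items()}
    return max(scores, key=scores.get), scores

def make_toy_model(offset, n_states=3, color_dim=4, depth_dim=4):
    """Hypothetical per-sign model: state s emits features centred at offset + s."""
    levels = offset + np.arange(n_states, dtype=float)[:, None]
    trans = np.full((n_states, n_states), 0.1)
    np.fill_diagonal(trans, 0.8)
    trans /= trans.sum(axis=1, keepdims=True)    # make each row a distribution
    return {
        "prior": np.full(n_states, 1.0 / n_states),
        "trans": trans,
        "color_mean": np.tile(levels, (1, color_dim)),
        "color_var": np.ones((n_states, color_dim)),
        "depth_mean": np.tile(levels, (1, depth_dim)),
        "depth_var": np.ones((n_states, depth_dim)),
    }

# Synthetic stand-ins for CNN feature sequences of one 15-frame sign video.
rng = np.random.default_rng(0)
models = {"sign_A": make_toy_model(0.0), "sign_B": make_toy_model(5.0)}
states = np.repeat([0, 1, 2], 5)
color = models["sign_B"]["color_mean"][states] + 0.3 * rng.standard_normal((15, 4))
depth = models["sign_B"]["depth_mean"][states] + 0.3 * rng.standard_normal((15, 4))

label, scores = classify(color, depth, models)
print(label, scores)
```

Adding the two streams' log-likelihoods is the simplest form of graph-model fusion; a coupled model with separate hidden chains per sensor would need a joint state space and is not shown here.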

Acknowledgements

This work is supported by the Natural Science Foundation of China (Nos. 60972095, 61271362, 61671362) and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2017JM6041).

Author information

Authors and Affiliations

Department of Electronics Information and Engineering, Xi’an Technological University, Xi’an City, People’s Republic of China, 710032
Qinkun Xiao, Yidan Zhao & Wang Huan

Corresponding author

Correspondence to Qinkun Xiao.

Ethics declarations

Conflicts of interest

Qinkun Xiao declares that he has no conflicts of interest.

Yidan Zhao declares that she has no conflicts of interest.

Wang Huan declares that she has no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiao, Q., Zhao, Y. & Huan, W. Multi-sensor data fusion for sign language recognition based on dynamic Bayesian network and convolutional neural network. Multimed Tools Appl 78, 15335–15352 (2019). https://doi.org/10.1007/s11042-018-6939-8

Received: 28 June 2018
Revised: 14 November 2018
Accepted: 23 November 2018
Published: 30 November 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s11042-018-6939-8
