An Optimized Implementation of Speech Recognition Combining GPU with Deep Belief Network for IoT

  • Conference paper
  • First Online:
IoT as a Service (IoTaaS 2017)

Abstract

With the advancement of the Internet of Things (IoT), speech recognition in mobile terminal applications has become a new trend. Consequently, how to accelerate training and improve accuracy in speech recognition has attracted attention from both academia and industry. Although the Deep Belief Network (DBN) accelerated by a Graphics Processing Unit (GPU) is commonly applied to the acoustic model of speech recognition, critical research challenges remain: the GPU cannot store all DBN parameters at once, the GPU's shared memory is not fully exploited, and parameter transmission becomes a bottleneck in multi-GPU systems. This paper presents a new method in which the weight matrix is divided into sub-weight matrices and a reasonable memory model is established. To eliminate inefficient idle time during data transfers, a stream processing model is proposed in which data transfer and kernel execution are performed simultaneously. Furthermore, the optimized single-GPU implementation is extended to multiple GPUs to address the parameter transmission bottleneck. Experimental results show that the optimized GPU implementation accelerates training without violating the size limitation of the GPU's memory.
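
The stream processing model mentioned in the abstract builds on the standard CUDA technique of overlapping asynchronous host-to-device copies with kernel execution across several streams. The sketch below illustrates only that general pattern under assumed values (four chunks standing in for sub-weight matrices, a toy scaleChunk kernel in place of a DBN training step); it is not the authors' implementation.

#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical element-wise kernel standing in for one per-chunk training step.
__global__ void scaleChunk(float *data, int n, float alpha) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= alpha;
}

int main() {
    const int kChunks = 4;            // assumed number of sub-weight matrices
    const int kChunkElems = 1 << 20;  // assumed elements per chunk
    const size_t chunkBytes = kChunkElems * sizeof(float);

    float *hostBuf;                   // pinned host memory enables async copies
    cudaMallocHost((void **)&hostBuf, kChunks * chunkBytes);
    for (int i = 0; i < kChunks * kChunkElems; ++i) hostBuf[i] = 1.0f;

    float *devBuf;
    cudaMalloc((void **)&devBuf, kChunks * chunkBytes);

    cudaStream_t streams[kChunks];
    for (int c = 0; c < kChunks; ++c) cudaStreamCreate(&streams[c]);

    // Each chunk's copy and kernel go into their own stream, so the copy of
    // chunk c+1 can overlap with the kernel still working on chunk c.
    for (int c = 0; c < kChunks; ++c) {
        size_t off = (size_t)c * kChunkElems;
        cudaMemcpyAsync(devBuf + off, hostBuf + off, chunkBytes,
                        cudaMemcpyHostToDevice, streams[c]);
        scaleChunk<<<(kChunkElems + 255) / 256, 256, 0, streams[c]>>>(
            devBuf + off, kChunkElems, 0.5f);
        cudaMemcpyAsync(hostBuf + off, devBuf + off, chunkBytes,
                        cudaMemcpyDeviceToHost, streams[c]);
    }
    cudaDeviceSynchronize();

    printf("first element after scaling: %f\n", hostBuf[0]);  // expected 0.5

    for (int c = 0; c < kChunks; ++c) cudaStreamDestroy(streams[c]);
    cudaFree(devBuf);
    cudaFreeHost(hostBuf);
    return 0;
}

Because each chunk uses its own stream and the host buffer is pinned, transfers for one chunk can proceed while the kernel for another chunk is running, which is the kind of idle-time elimination the abstract refers to.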

Notes

  1. In 2006, a parallel computing platform and programming model for NVIDIA GPUs named CUDA [11] was introduced, aiming to make full use of the computing power of GPUs for general-purpose computation. CUDA also enables programmers without any knowledge of graphics APIs to write C/C++ code for high-performance scientific computation on NVIDIA GPUs; it is therefore widely used in speech recognition based on the DBN model (see the sketch after this note).
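
For illustration only, the following minimal vector-addition program shows the kind of plain C/C++ CUDA code the note describes, with no graphics API involved; the kernel, sizes, and use of unified memory are assumptions for the sketch, not taken from the paper.

#include <cuda_runtime.h>
#include <cstdio>

// Illustrative kernel: element-wise vector addition written in plain CUDA C++.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);

    float *a, *b, *c;                 // unified memory keeps the example short
    cudaMallocManaged((void **)&a, bytes);
    cudaMallocManaged((void **)&b, bytes);
    cudaMallocManaged((void **)&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);      // expected 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}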

References

  1. Gubbi, J., Buyya, R., Marusic, S., Palaniswami, M.: Internet of things (IoT): a vision, architectural elements, and future directions. Future Gen. Comput. Syst. 29(7), 1645–1660 (2013)

  2. Su, D., Wu, X., Xu, L.: GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection. In: IEEE International Conference on Acoustics Speech and Signal Processing, pp. 4890–4893, Texas, USA, March 2010

  3. Hinton, G.E., Salakhutdinov, R.R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)

  4. Seide, F., Li, G., Yu, D.: Conversational speech transcription using context-dependent deep neural networks. In: Conference of the International Speech Communication Association (INTERSPEECH), Florence, Italy, pp. 437–440, August 2011

  5. Sainath, T.N., Kingsbury, B., Ramabhadran, B., Fousek, P.: Making deep belief networks effective for large vocabulary continuous speech recognition. In: Proceedings of Automatic Speech Recognition and Understanding, pp. 30–35, December 2011

  6. Raina, R., Madhavan, A., Ng, A.Y.: Large-scale deep unsupervised learning using graphics processors. In: Proceedings of International Conference on Machine Learning (ICML), Montreal, Quebec, Canada, pp. 873–880, June 2009

  7. Lopes, N., Ribeiro, B.: Towards adaptive learning with improved convergence of deep belief networks on graphics processing units. Pattern Recogn. 47(1), 114–127 (2014)

  8. Hinton, G.E.: Training products of experts by minimizing contrastive divergence. Neural Comput. 14(8), 1771–1800 (2002)

  9. Swersky, K., Chen, B., Marlin, B., De Freitas, N.: A tutorial on stochastic approximation algorithms for training restricted Boltzmann machines and deep belief nets. In: Information Theory and Applications Workshop, pp. 1–10, January 2010

  10. Deng, L., Togneri, R.: Deep dynamic models for learning hidden representations of speech features. In: Ogunfunmi, T., Togneri, R., Narasimha, M. (eds.) Speech & Audio Processing for Coding Enhancement & Recognition, pp. 153–195. Springer, Heidelberg (2015). https://doi.org/10.1007/978-1-4939-1456-2_6

  11. NVIDIA. What is CUDA (2006)

  12. Wang, Y., Tang, P., An, H., Liu, Z., Wang, K., Zhou, Y.: Optimization and analysis of parallel back propagation neural network on GPU Using CUDA. In: Arik, S., Huang, T., Lai, W.K., Liu, Q. (eds.) ICONIP 2015. LNCS, vol. 9491, pp. 156–163. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-26555-1_18

  13. Povey, D., et al.: The Kaldi speech recognition toolkit. IDIAP Publications (2012)

  14. Xue, S., Yan, S., Dai, L.: Fast training algorithm for deep neural network using multiple GPUs. J. Tsinghua Univ. (Sci. Technol.) 53(6), 745–748 (2013)


Acknowledgement

The work described in this paper is supported by Guangdong Provincial Key Laboratory of Petrochemical Equipment Fault Diagnosis, Guangdong University of Petrochemical Technology (GDUPTKLAB201502) and Special Fund for Forest Scientific Research in the Public Welfare (201504307).

Author information

Corresponding author

Correspondence to Weipeng Jing.


Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Jing, W., Jiang, T., Mukherjee, M., Shu, L., Kang, J. (2018). An Optimized Implementation of Speech Recognition Combining GPU with Deep Belief Network for IoT. In: Lin, YB., Deng, DJ., You, I., Lin, CC. (eds) IoT as a Service. IoTaaS 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 246. Springer, Cham. https://doi.org/10.1007/978-3-030-00410-1_30

  • DOI: https://doi.org/10.1007/978-3-030-00410-1_30

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00409-5

  • Online ISBN: 978-3-030-00410-1

  • eBook Packages: Computer Science, Computer Science (R0)
