Learning Speed Improvement Using Multi-GPUs on DNN-Based Acoustic Model Training in Korean Intelligent Personal Assistant

  • Chapter in: Natural Language Dialog Systems and Intelligent Assistants

Abstract

This paper proposes a method for improving the learning speed of DNN-based acoustic model training with multiple GPUs in a Korean intelligent personal assistant (IPA). DNN training consists of iterative, stochastic parameter updates, and each update depends on the ones before it, which makes naive parallelization difficult. The proposed method distributes this computation across multiple GPUs. DNN-based acoustic models are trained on a 320-hour Korean speech corpus. With this implementation, training becomes five times faster while the speech recognition rate is maintained.
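
The abstract gives no implementation details, so the following is only a minimal, hypothetical sketch of synchronous data-parallel stochastic gradient descent, one common way to spread DNN training across several GPUs while preserving the dependence of each update on the previous one. The toy model, function names, shard count, and learning rate are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the authors' implementation): synchronous
# data-parallel SGD. Each worker computes the gradient on its shard of
# the minibatch; the shard gradients are averaged and applied as one
# update, so successive updates still build on each other.
import numpy as np


def forward(W, x):
    # Logistic output of a single-layer model, standing in for a DNN.
    return 1.0 / (1.0 + np.exp(-x @ W))


def gradient(W, x, y):
    # Gradient of the mean cross-entropy loss for the stand-in model.
    return x.T @ (forward(W, x) - y) / len(x)


def data_parallel_step(W, x_batch, y_batch, num_workers, lr=0.1):
    # Shard the minibatch; in a real setup each shard's gradient is
    # computed on a separate GPU. The loop stands in for that parallel work.
    x_shards = np.array_split(x_batch, num_workers)
    y_shards = np.array_split(y_batch, num_workers)
    grads = [gradient(W, xs, ys) for xs, ys in zip(x_shards, y_shards)]
    return W - lr * np.mean(grads, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = np.zeros((20, 1))
    for _ in range(200):
        x = rng.normal(size=(256, 20))
        y = (x[:, :1] > 0).astype(float)  # synthetic labels
        W = data_parallel_step(W, x, y, num_workers=4)
    print("trained weight norm:", float(np.linalg.norm(W)))
```

Because the shard gradients are averaged before a single update is applied, each step is mathematically equivalent to plain minibatch SGD; only the gradient computation is spread across devices, which is where the speed-up comes from.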

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning (No. NRF-2014R1A1A1002197).

Author information

Corresponding author

Correspondence to J.-H. Kim.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Lee, D., Kim, K.H., Kang, H.E., Wang, S.H., Park, S.Y., Kim, J.H. (2015). Learning Speed Improvement Using Multi-GPUs on DNN-Based Acoustic Model Training in Korean Intelligent Personal Assistant. In: Lee, G., Kim, H., Jeong, M., Kim, J.H. (eds) Natural Language Dialog Systems and Intelligent Assistants. Springer, Cham. https://doi.org/10.1007/978-3-319-19291-8_27

  • DOI: https://doi.org/10.1007/978-3-319-19291-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19290-1

  • Online ISBN: 978-3-319-19291-8

  • eBook Packages: Computer Science (R0)
