Learning Speed Improvement Using Multi-GPUs on DNN-Based Acoustic Model Training in Korean Intelligent Personal Assistant

  • Chapter in: Natural Language Dialog Systems and Intelligent Assistants

Abstract

This paper proposes a method for improving the learning speed of DNN-based acoustic model training with multiple GPUs in a Korean intelligent personal assistant (IPA). DNN training consists of iterative, stochastic parameter updates, and each update depends on the ones before it, which makes naive parallelization difficult. The proposed method distributes this computation across multiple GPUs. DNN-based acoustic models are trained on a 320-hour Korean speech corpus. With this implementation, training becomes five times faster while the speech recognition rate is maintained.
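
The abstract gives no implementation details, so the following is only a minimal, hypothetical sketch of synchronous data-parallel stochastic gradient descent, one common way to spread DNN training across several GPUs while preserving the dependence of each update on the previous one. The toy model, function names, shard count, and learning rate are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch (not the authors' implementation): synchronous
# data-parallel SGD. Each worker computes the gradient on its shard of
# the minibatch; the shard gradients are averaged and applied as one
# update, so successive updates still build on each other.
import numpy as np


def forward(W, x):
    # Logistic output of a single-layer model, standing in for a DNN.
    return 1.0 / (1.0 + np.exp(-x @ W))


def gradient(W, x, y):
    # Gradient of the mean cross-entropy loss for the stand-in model.
    return x.T @ (forward(W, x) - y) / len(x)


def data_parallel_step(W, x_batch, y_batch, num_workers, lr=0.1):
    # Shard the minibatch; in a real setup each shard's gradient is
    # computed on a separate GPU. The loop stands in for that parallel work.
    x_shards = np.array_split(x_batch, num_workers)
    y_shards = np.array_split(y_batch, num_workers)
    grads = [gradient(W, xs, ys) for xs, ys in zip(x_shards, y_shards)]
    return W - lr * np.mean(grads, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = np.zeros((20, 1))
    for _ in range(200):
        x = rng.normal(size=(256, 20))
        y = (x[:, :1] > 0).astype(float)  # synthetic labels
        W = data_parallel_step(W, x, y, num_workers=4)
    print("trained weight norm:", float(np.linalg.norm(W)))
```

Because the shard gradients are averaged before a single update is applied, each step is mathematically equivalent to plain minibatch SGD; only the gradient computation is spread across devices, which is where the speed-up comes from.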

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT and Future Planning (No. NRF-2014R1A1A1002197).

Author information

Corresponding author

Correspondence to J.-H. Kim.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Lee, D., Kim, K.H., Kang, H.E., Wang, S.H., Park, S.Y., Kim, J.H. (2015). Learning Speed Improvement Using Multi-GPUs on DNN-Based Acoustic Model Training in Korean Intelligent Personal Assistant. In: Lee, G., Kim, H., Jeong, M., Kim, J.H. (eds) Natural Language Dialog Systems and Intelligent Assistants. Springer, Cham. https://doi.org/10.1007/978-3-319-19291-8_27

  • DOI: https://doi.org/10.1007/978-3-319-19291-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19290-1

  • Online ISBN: 978-3-319-19291-8

  • eBook Packages: Computer Science (R0)
