Automatic Recognition of Kazakh Speech Using Deep Neural Networks

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11432)

Abstract

This article presents an automatic speech recognition system for the Kazakh language based on deep neural networks (DNNs), developed using the Kaldi speech recognition toolkit. The DNNs are initialized using restricted Boltzmann machines (RBMs) and trained with standard error backpropagation, using cross-entropy as the objective function. To achieve optimal results, the training has been adapted to the peculiarities of the Kazakh language. A 76-hour corpus was used for training. Results for two different sets of values are compared between classical models and various DNN settings.
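As a rough illustration of the training objective named in the abstract, the following is a minimal sketch of cross-entropy training with error backpropagation for a network with one hidden layer. It is not the authors' Kaldi setup: the toy data, layer sizes, learning rate, and all function names are assumptions made for this example, and plain random initialization stands in for the RBM pre-training described above.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, W1, b1, W2, b2):
    """Return hidden activations and output class posteriors."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    p = softmax([sum(w * hi for w, hi in zip(row, h)) + b
                 for row, b in zip(W2, b2)])
    return h, p


def train_step(x, label, W1, b1, W2, b2, lr):
    """One stochastic gradient step on the cross-entropy loss."""
    h, p = forward(x, W1, b1, W2, b2)
    loss = -math.log(p[label])
    # Gradient of cross-entropy w.r.t. the softmax inputs: p - one_hot(label).
    d_out = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
    # Backpropagate through W2 and the sigmoid (before W2 is updated).
    d_hid = [sum(W2[k][j] * d_out[k] for k in range(len(d_out)))
             * h[j] * (1.0 - h[j]) for j in range(len(h))]
    for k in range(len(W2)):
        for j in range(len(h)):
            W2[k][j] -= lr * d_out[k] * h[j]
        b2[k] -= lr * d_out[k]
    for j in range(len(W1)):
        for i in range(len(x)):
            W1[j][i] -= lr * d_hid[j] * x[i]
        b1[j] -= lr * d_hid[j]
    return loss


def train(epochs=200, lr=0.5, seed=0, hidden=3):
    """Train on hypothetical toy data; returns mean loss per epoch and parameters."""
    rng = random.Random(seed)
    # Toy data (assumption): the class label equals the first input coordinate.
    data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
    # Random initialization (assumption): stands in for RBM pre-training.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(2)]
    b2 = [0.0, 0.0]
    losses = []
    for _ in range(epochs):
        total = sum(train_step(x, y, W1, b1, W2, b2, lr) for x, y in data)
        losses.append(total / len(data))
    return losses, (W1, b1, W2, b2)


if __name__ == "__main__":
    losses, _ = train()
    print("first epoch loss:", losses[0])
    print("last epoch loss:", losses[-1])
```

In a real acoustic model the inputs would be frames of speech features and the outputs would be posteriors over tied HMM states, but the gradient computation has the same shape.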

Acknowledgements

This work was supported by the Ministry of Education and Science of the Republic of Kazakhstan under project IRN AP05131207, "Development of technologies for multilingual automatic speech recognition using deep neural networks".

Author information

Corresponding authors

Correspondence to Orken Mamyrbayev or Mussa Turdalyuly.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Mamyrbayev, O., Turdalyuly, M., Mekebayev, N., Alimhan, K., Kydyrbekova, A., Turdalykyzy, T. (2019). Automatic Recognition of Kazakh Speech Using Deep Neural Networks. In: Nguyen, N., Gaol, F., Hong, TP., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2019. Lecture Notes in Computer Science, vol 11432. Springer, Cham. https://doi.org/10.1007/978-3-030-14802-7_40

  • DOI: https://doi.org/10.1007/978-3-030-14802-7_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-14801-0

  • Online ISBN: 978-3-030-14802-7

  • eBook Packages: Computer Science, Computer Science (R0)
