Using Machine Learning Algorithms Combined with Deep Learning in Speech Recognition

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1500)

Abstract

Machine learning and deep learning are widely used in speech recognition. The authors combine several machine learning algorithms with deep learning to recognize speech for device control, and apply the approach to a robot that advises on education enrollment. The result is a three-step machine learning model: data preprocessing, speech recognition using neural networks, and answering questions based on the recognized keywords. In the data preprocessing step, the sound wave is converted into a spectral image. The speech recognition step uses a CNN for noise filtering and feature extraction, and an LSTM network for keyword recognition. Tests under varying voice speed and loudness, and in environments with different noise levels, have demonstrated the effectiveness of the proposed model and algorithms.
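The first and third steps of the pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, frame length, hop size, sample rate, and the example answer table are all hypothetical, and the CNN and LSTM stage is omitted because it would require a deep learning framework. The sketch uses only the Python standard library and a direct discrete Fourier transform for clarity; a real system would more likely rely on an FFT library.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Convert a waveform into a magnitude spectrogram.

    Returns a list of frames; each frame holds one magnitude per
    frequency bin (frame_len // 2 + 1 bins). All sizes are assumed
    values chosen for illustration only.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT of the windowed frame (slow but dependency-free).
        mags = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(mags)
    return frames


def answer_for(keywords, answers, default="Sorry, I did not understand."):
    """Return the answer for the first recognized keyword that has one."""
    for keyword in keywords:
        if keyword in answers:
            return answers[keyword]
    return default


# Hypothetical demo: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate)
          for t in range(512)]
spec = spectrogram(signal)

# The strongest bin of the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64

# Hypothetical answer table for an enrollment-advising robot.
answers = {"tuition": "Tuition details are available at the admissions office."}
reply = answer_for(["hello", "tuition"], answers)
```

In this sketch each row of `spec` is one time slice of the spectral image that a recognition network would consume, and `answer_for` stands in for the keyword-based answering step with a plain dictionary lookup.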

Notes

  1. https://commonvoice.mozilla.org

Author information

Corresponding author

Correspondence to Vu Thanh Nguyen.

Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nguyen, V.T. et al. (2021). Using Machine Learning Algorithms Combined with Deep Learning in Speech Recognition. In: Dang, T.K., Küng, J., Chung, T.M., Takizawa, M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2021. Communications in Computer and Information Science, vol 1500. Springer, Singapore. https://doi.org/10.1007/978-981-16-8062-5_35

  • DOI: https://doi.org/10.1007/978-981-16-8062-5_35

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-8061-8

  • Online ISBN: 978-981-16-8062-5

  • eBook Packages: Computer Science (R0)
