Using Machine Learning Algorithms Combined with Deep Learning in Speech Recognition

Nguyen, Vu Thanh; Tiep, Mai Viet; Huy, Phu Phuoc; Nho, Nguyen Thai; Dung, Luong The; Hien, Vu Thanh; Toan, Phan Thanh

doi:10.1007/978-981-16-8062-5_35

Conference paper
First Online: 14 November 2021

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1500)

Abstract

Machine learning and deep learning are widely used, especially in speech recognition. The authors combine several machine learning algorithms with deep learning to recognize speech for device control, and apply the approach to a robot that advises on education enrollment. The result is a three-step machine learning model: data preprocessing, speech recognition using neural networks, and answering questions based on the recognized keywords. In the data preprocessing step, the sound wave is converted into a spectral image. The speech recognition step uses a CNN for noise filtering and feature extraction, and an LSTM network for keyword recognition. Tests under different conditions, including varying voice speed and loudness and environments with different noise levels, demonstrate the effectiveness of the proposed model and algorithms.
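
The preprocessing step described above, turning a sound wave into a spectral image, can be illustrated with a short-time Fourier transform. The following is a minimal sketch in plain Python, not the authors' implementation: the function name `spectrogram`, the frame length, the hop size, the Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for illustration, since the abstract does not state the actual settings.

```python
import cmath
import math


def spectrogram(wave, frame_len=64, hop=32):
    """Return a log-magnitude spectrogram as a list of rows.

    Each row is one time frame; each column is one frequency bin.
    frame_len and hop are illustrative values, not the paper's settings.
    """
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    # Precompute the DFT basis once so that every frame can reuse it.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)]
             for k in range(n_bins)]
    rows = []
    for start in range(0, len(wave) - frame_len + 1, hop):
        frame = [wave[start + n] * window[n] for n in range(frame_len)]
        # log1p compresses the dynamic range, as is common for spectral images.
        row = [math.log1p(abs(sum(x * b for x, b in zip(frame, basis[k]))))
               for k in range(n_bins)]
        rows.append(row)
    return rows


# Synthetic input (an assumption): 0.1 s of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000.0 * n / sample_rate) for n in range(800)]
image = spectrogram(tone)
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
```

With these assumed settings each frequency bin spans 8000 / 64 = 125 Hz, so the 1 kHz tone peaks in bin 8 of every frame. In practice a numerical library would replace the hand-written DFT, and the resulting two-dimensional array would be the image passed to the CNN stage.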

Notes

1. https://commonvoice.mozilla.org


Author information

Authors and Affiliations

Ho Chi Minh City University of Food Industry, Ho Chi Minh City, Vietnam
Vu Thanh Nguyen
Academy of Cryptography Techniques, Ho Chi Minh City, Vietnam
Mai Viet Tiep & Luong The Dung
Military Information Technology Institute, Ho Chi Minh City, Vietnam
Phu Phuoc Huy
Saigon Technology University, Ho Chi Minh City, Vietnam
Nguyen Thai Nho
Ho Chi Minh City University of Technology (HUTECH), Ho Chi Minh City, Vietnam
Vu Thanh Hien
Posts and Telecommunications Institute of Technology, Ho Chi Minh City, Vietnam
Phan Thanh Toan

Corresponding author

Correspondence to Vu Thanh Nguyen.

Editor information

Editors and Affiliations

HCMC University of Technology (HCMUT), Ho Chi Minh City, Vietnam
Tran Khanh Dang
Johannes Kepler University of Linz, Linz, Austria
Josef Küng
Sungkyunkwan University, Suwon, Korea (Republic of)
Tai M. Chung
Hosei University, Tokyo, Japan
Makoto Takizawa

About this paper

Cite this paper

Nguyen, V.T. et al. (2021). Using Machine Learning Algorithms Combined with Deep Learning in Speech Recognition. In: Dang, T.K., Küng, J., Chung, T.M., Takizawa, M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2021. Communications in Computer and Information Science, vol 1500. Springer, Singapore. https://doi.org/10.1007/978-981-16-8062-5_35

DOI: https://doi.org/10.1007/978-981-16-8062-5_35
Published: 14 November 2021
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-8061-8
Online ISBN: 978-981-16-8062-5
eBook Packages: Computer Science, Computer Science (R0)