Exploration of English speech translation recognition based on the LSTM RNN algorithm

  • S.I.: Evolutionary Computation based Methods and Applications for Data Processing
Neural Computing and Applications

Abstract

In today’s information society, the demand for intelligent systems grows daily. English speech translation recognition based on the LSTM (long short-term memory) recurrent neural network (RNN) algorithm is an important manifestation of computer intelligence. In recent years, many scholars have studied speech translation recognition technology, including template matching and statistical pattern recognition, but each of these methods has its drawbacks. This paper discusses English speech recognition techniques based on the basic principles of RNNs and analyses their construction and application in practice, which may provide a useful reference for future researchers. Unlike traditional pattern recognition methods, the LSTM RNN simulates the information processing of the human brain and processes information in a distributed manner. It offers a variety of automatic recognition and extraction functions, such as storage, association, and retrieval, and is particularly well suited to speech translation and recognition problems that demand high perceptual ability. This new neural network recognition system is scientifically well founded and, like the human brain, can store sound information in a decentralized manner. The LSTM RNN has been widely used in speech recognition because of its excellent performance in extraction and classification. The study found that the recognition accuracy of the original RNN generally remained between 48% and 54%, with a relatively high data loss rate. In contrast, speech recognition based on the LSTM RNN reached an accuracy of up to 94% with high information storage efficiency, largely avoiding repetitive processing. Voice data processing completed in as little as 4.5 s, which is important for meeting public demand and the needs of social development.
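
The contrast the abstract draws between the original RNN and the LSTM RNN comes down to the gated memory cell. As a minimal sketch only, assuming the standard LSTM formulation rather than the authors' actual network (whose architecture, input features, and hyperparameters are not given here), the following pure-Python code runs one cell over a short sequence. The names lstm_step, init_params, and run_sequence, the random weights, and the three-number "frames" are hypothetical stand-ins for real acoustic features.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, params):
    """One standard LSTM time step.

    params maps a gate name to (W, b); each W acts on the
    concatenation [h_prev; x] and has one row per hidden unit.
    """
    z = h_prev + x  # list concatenation: previous hidden state, then input

    def gate(name, activation):
        W, b = params[name]
        return [activation(a + bias) for a, bias in zip(matvec(W, z), b)]

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much new information to write
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # proposed new cell content

    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def init_params(input_size, hidden_size, seed=0):
    # Small random weights and zero biases; illustrative only, not trained.
    rng = random.Random(seed)

    def layer():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_size + input_size)]
             for _ in range(hidden_size)]
        return W, [0.0] * hidden_size

    return {name: layer() for name in ("forget", "input", "output", "candidate")}


def run_sequence(frames, hidden_size, params):
    # Feed the frames through the cell one at a time, carrying h and c forward.
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in frames:
        h, c = lstm_step(x, h, c, params)
    return h, c


if __name__ == "__main__":
    toy_frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.5]]  # made-up feature vectors
    h, c = run_sequence(toy_frames, hidden_size=4, params=init_params(3, 4))
    print(h)
```

When the forget gate is near 1 and the input gate is near 0, the cell state passes through a step essentially unchanged. That additive path is what lets an LSTM retain information across long sequences where a plain RNN tends to lose it.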

Data availability statement

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Funding

This work was supported by the Shaoyang Science and Technology Planning Project (2021025ZD): Construction of College English online education Platform under the background of "Internet+".

Author information

Corresponding author

Correspondence to Yu Dai.

Ethics declarations

Conflict of interest

There are no potential competing interests in our paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yuan, Q., Dai, Y. & Li, G. Exploration of English speech translation recognition based on the LSTM RNN algorithm. Neural Comput & Applic 35, 24961–24970 (2023). https://doi.org/10.1007/s00521-023-08462-8

  • DOI: https://doi.org/10.1007/s00521-023-08462-8
