DOI: 10.1145/3591569.3591614
research-article

Build A Module for Improvement Real Time Speech enhancement using Long Short-term Memory Approach: Improvement Real Time Speech enhancement using Long Short-term Memory

Published: 13 July 2023

ABSTRACT

A good customer experience is essential for businesses today, and customer support as a service brings the right people and processes together. When designing an audio communication system, the influence of noise on the transmitted signal must be carefully considered. Improving the quality of phone calls in a smart virtual call center is essential for more effective customer care. This paper proposes a module for real-time speech enhancement of phone calls using long short-term memory (LSTM), a recurrent neural network architecture used in artificial intelligence and deep learning. LSTMs are designed to avoid the long-term dependency problem: remembering information for long periods is their default behavior. The data set used for this approach contains both English and Vietnamese speech, and the results show improvement on evaluation metrics such as PESQ, SI-SDR, and STOI.
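Of the metrics named in the abstract, SI-SDR (scale-invariant signal-to-distortion ratio) is simple enough to sketch directly. The snippet below is a minimal illustration of the commonly used definition, not the evaluation code of the paper: the function name `si_sdr`, the mean removal step, the small `eps` guard, and the synthetic test signals are all assumptions made for this example.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length signals (higher is better)."""
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    # Assumption: remove the mean so a constant offset does not affect the score.
    mean_e = sum(estimate) / len(estimate)
    mean_r = sum(reference) / len(reference)
    e = [x - mean_e for x in estimate]
    r = [x - mean_r for x in reference]
    # Project the estimate onto the reference: alpha = <e, r> / <r, r>.
    alpha = sum(a * b for a, b in zip(e, r)) / (sum(x * x for x in r) + eps)
    target = [alpha * x for x in r]
    # Everything in the estimate that the scaled reference does not explain.
    residual = [a - t for a, t in zip(e, target)]
    target_energy = sum(x * x for x in target)
    residual_energy = sum(x * x for x in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


# Synthetic example: a sine "speech" signal plus an orthogonal cosine "noise".
n = 100
reference = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
noise = [math.cos(2 * math.pi * 13 * k / n) for k in range(n)]
mild = [s + 0.1 * v for s, v in zip(reference, noise)]
heavy = [s + 0.5 * v for s, v in zip(reference, noise)]

print(round(si_sdr(mild, reference), 2))   # about 20 dB
print(round(si_sdr(heavy, reference), 2))  # about 6 dB
```

Because the reference is rescaled before the comparison, multiplying the enhanced signal by a constant gain leaves the score unchanged, which is why the metric suits enhancement systems whose output level differs from the clean recording.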

Published in

ICIIT '23: Proceedings of the 2023 8th International Conference on Intelligent Information Technology
February 2023
310 pages
ISBN: 9781450399616
DOI: 10.1145/3591569

Copyright © 2023 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
