Monaural Speech Enhancement Based On Two Stage Long Short-Term Memory Networks

Abstract:

The performance of deep neural network (DNN)-based monaural speech enhancement methods is still limited in real room environments, particularly in the speaker-independent case. Surface reflections and unseen speakers make it harder to estimate sources from reverberant noisy speech mixtures. To address these issues, we propose a two-stage approach using long short-term memory (LSTM) networks. In the first stage, a trained LSTM estimates the dereverberation mask (DM), which is used to dereverberate the noisy speech mixture. In the second stage, a second trained LSTM estimates the ideal ratio mask (IRM), which is used to separate the desired speech signal from the dereverberated speech mixture. Evaluation in terms of the signal-to-distortion ratio (SDR) shows the efficacy of the LSTMs over DNNs.
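The two-stage masking described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: oracle masks computed from known toy signals stand in for the two trained LSTMs, the function names are hypothetical, and the toy data assumes magnitudes add, which real spectrograms only approximate. The abstract does not define the DM, so it is assumed here to be a magnitude ratio between the dereverberated and reverberant mixtures. The IRM uses its common energy-ratio form.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero


def ratio_mask(target_mag, mixture_mag):
    """Magnitude ratio mask clipped to [0, 1] (assumed form of the DM)."""
    return np.clip(target_mag / (mixture_mag + EPS), 0.0, 1.0)


def ideal_ratio_mask(speech_mag, noise_mag):
    """IRM in its common energy-ratio form: sqrt(S^2 / (S^2 + N^2))."""
    s2, n2 = speech_mag ** 2, noise_mag ** 2
    return np.sqrt(s2 / (s2 + n2 + EPS))


def two_stage_enhance(mixture_mag, dm, irm):
    """Apply the DM first, then the IRM, to a magnitude spectrogram."""
    dereverberated = dm * mixture_mag   # stage 1: suppress reverberation
    enhanced = irm * dereverberated     # stage 2: separate the speech
    return dereverberated, enhanced


# Toy magnitude spectrograms, shape (frequency bins, time frames).
rng = np.random.default_rng(0)
shape = (257, 100)
speech = rng.random(shape) + 0.1
noise = rng.random(shape) + 0.1
reverb_tail = rng.random(shape) + 0.1

# Assumption: magnitudes add, which keeps the toy example simple.
mixture = speech + noise + reverb_tail

# Oracle masks stand in for the outputs of the two trained LSTMs.
dm = ratio_mask(speech + noise, mixture)
irm = ideal_ratio_mask(speech, noise)

stage1, enhanced = two_stage_enhance(mixture, dm, irm)
```

In a trained system each mask would be predicted frame by frame from features of the stage input, and the enhanced magnitude would be combined with the mixture phase before inverting back to a waveform.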

Published in: 2019 13th International Conference on Signal Processing and Communication Systems (ICSPCS)

Date of Conference: 16-18 December 2019

Date Added to IEEE Xplore: 27 February 2020

DOI: 10.1109/ICSPCS47537.2019.9008709

Conference Location: Gold Coast, QLD, Australia
