Abstract:
Traditional speech separation systems enhance the magnitude response of noisy speech. Recent studies, however, have shown that perceptual speech quality is significantly ...Show MoreMetadata
Abstract:
Traditional speech separation systems enhance the magnitude response of noisy speech. Recent studies, however, have shown that perceptual speech quality is significantly improved when magnitude and phase are both enhanced. These studies, however, have not determined if phase enhancement is beneficial in environments that contain reverberation as well as noise. In this paper, we present an approach that jointly enhances the magnitude and phase of reverberant and noisy speech. We use a deep neural network to estimate the real and imaginary components of the complex ideal ratio mask (cIRM), which results in clean and anechoic speech when applied to a reverberant-noisy mixture. Our results show that phase is important for dereverberation, and that complex ratio masking outperforms related methods.
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 05-09 March 2017
Date Added to IEEE Xplore: 19 June 2017
ISBN Information:
Electronic ISSN: 2379-190X