Abstract:
Speech enhancement is an important task in many applications, such as speech recognition. Conventional methods always require some principle by which to distinguish speech from noise, and the most successful enhancement requires strong models of both speech and noise. However, if the noise actually encountered differs significantly from the system's assumptions, performance declines rapidly. In this work, we propose an unsupervised speech enhancement system based on decomposing the frequency-time spectrogram into a sparse foreground (speech) and a low-rank background (noise), which makes few assumptions about the noise other than its limited spectral variation. An image-based masking step is also designed to compensate for the poor noise removal obtained when using spectrogram decomposition alone. Evaluations via PESQ and SegSNR show that the new approach improves signal-to-distortion ratio and PESQ in most cases when compared with several traditional speech enhancement algorithms.
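The abstract does not name the solver used for the sparse plus low-rank split. One common way to obtain such a decomposition is principal component pursuit (robust PCA), and the sketch below illustrates that technique on a magnitude spectrogram matrix. It is an assumption made for illustration, not the paper's implementation: the function names, the default weight `lam`, the step parameter `mu`, and the iteration settings are all hypothetical choices, and the image-based masking step is not reproduced here.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise shrinkage: pull every entry toward zero by tau."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Shrink the singular values of x by tau (promotes low rank)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * soft_threshold(s, tau)) @ vt


def sparse_lowrank_split(spec, lam=None, n_iter=500, tol=1e-7):
    """Split a spectrogram into low-rank and sparse parts.

    Illustrative robust PCA via an augmented Lagrangian iteration
    (an assumed solver, not taken from the paper).
    spec: 2-D array (frequency bins x time frames) of magnitudes.
    Returns (low_rank, sparse) with low_rank + sparse close to spec.
    """
    spec = np.asarray(spec, dtype=float)
    if lam is None:
        # Commonly used default weight for the sparse term (assumption).
        lam = 1.0 / np.sqrt(max(spec.shape))
    # Commonly used step parameter heuristic (assumption).
    mu = spec.size / (4.0 * np.abs(spec).sum())
    norm_spec = np.linalg.norm(spec)

    low_rank = np.zeros_like(spec)
    sparse = np.zeros_like(spec)
    dual = np.zeros_like(spec)
    for _ in range(n_iter):
        # Low-rank update: candidate for the slowly varying background.
        low_rank = svd_threshold(spec - sparse + dual / mu, 1.0 / mu)
        # Sparse update: candidate for the foreground speech energy.
        sparse = soft_threshold(spec - low_rank + dual / mu, lam / mu)
        residual = spec - low_rank - sparse
        dual = dual + mu * residual
        if np.linalg.norm(residual) <= tol * norm_spec:
            break
    return low_rank, sparse
```

In this reading, `sparse` would play the role of the foreground speech estimate and `low_rank` the background noise estimate. Converting the result back to a waveform would additionally require the phase of the noisy signal and an inverse short-time Fourier transform, which are outside this sketch.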
Date of Conference: 12-14 September 2014
Date Added to IEEE Xplore: 27 October 2014
Electronic ISBN: 978-1-4799-4219-0