Features for Masking-Based Monaural Speech Separation in Reverberant Conditions | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Monaural speech separation is a fundamental problem in speech and signal processing. This problem can be approached from a supervised learning perspective by predicting an ideal time-frequency mask from features of noisy speech. In reverberant conditions at low signal-to-noise ratios (SNRs), accurate mask prediction is challenging and can benefit from effective features. In this paper, we investigate an extensive set of acoustic-phonetic features extracted in adverse conditions. Deep neural networks are used as the learning machine, and separation performance is evaluated using standard objective speech intelligibility metrics. Separation performance is systematically evaluated in both nonspeech and speech interference, in a variety of SNRs, reverberation times, and direct-to-reverberant energy ratios. Considerable performance improvement is observed by using contextual information, likely due to temporal effects of room reverberation. In addition, we construct feature combination sets using a sequential floating forward selection algorithm, and combined features outperform individual ones. We also find that optimal feature sets in anechoic conditions are different from those in reverberant conditions.
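The sequential floating forward selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection criterion or the candidate features, so everything concrete here is an assumption: the feature names `f1` to `f4`, the gains, and the redundancy penalty in `toy_score` are hypothetical stand-ins. In a real experiment the score would presumably come from training a separation model on the candidate feature set and measuring an objective intelligibility metric.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def sffs(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    k: int,
) -> Tuple[float, List[str]]:
    """Simplified sequential floating forward selection (SFFS).

    Returns (best score, best subset) among the size-k subsets visited.
    """
    if not 1 <= k <= len(features):
        raise ValueError("k must be between 1 and the number of features")

    selected: List[str] = []
    best: Dict[int, Tuple[float, List[str]]] = {}  # subset size -> (score, subset)

    while len(selected) < k:
        # Forward step: add the single feature that raises the score the most.
        candidates = [f for f in features if f not in selected]
        to_add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [to_add]
        current = score(selected)
        size = len(selected)
        if size not in best or current > best[size][0]:
            best[size] = (current, list(selected))

        # Floating backward steps: drop a feature only if the reduced set
        # beats the best set previously recorded at that smaller size.
        while len(selected) > 2:
            to_drop = max(
                selected,
                key=lambda f: score([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != to_drop]
            reduced_score = score(reduced)
            if reduced_score > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (reduced_score, list(reduced))
            else:
                break

    return best[k]


# Hypothetical toy criterion: each feature has an individual gain, and
# f1 and f2 are treated as redundant, so using both incurs a penalty.
TOY_GAINS = {"f1": 3.0, "f2": 2.5, "f3": 2.0, "f4": 1.0}


def toy_score(subset: List[str]) -> float:
    total = sum(TOY_GAINS[f] for f in subset)
    if "f1" in subset and "f2" in subset:
        total -= 2.0  # assumed redundancy penalty, for illustration only
    return total


best_score, best_subset = sffs(list(TOY_GAINS), toy_score, 3)
```

In this toy setting the search skips `f2` despite its high individual gain, because it is redundant with `f1`. That is the kind of interaction that can make a combined feature set differ from a ranking of individual features.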
Page(s): 1085 - 1094
Date of Publication: 27 March 2017
