Cochannel speech separation is the task of separating two speech signals from a single mixture. The task becomes even more challenging if the speech mixture is further corrupted by background noise. In this study, we focus on a gender-dependent scenario, where the target speech comes from a male speaker and the interfering speech from a female speaker. We propose a two-stage separation strategy to address this problem in a noise-independent way. In the proposed system, denoising and cochannel separation are performed successively by two modules, both based on a newly introduced convolutional neural network for speech separation. The evaluation results demonstrate that the proposed system substantially outperforms one-stage baselines in terms of objective intelligibility and perceptual quality.
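The successive arrangement described above (denoise first, then separate the two talkers) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the frame representation, the function names `two_stage_separate`, `toy_denoise`, and `toy_separate`, and the toy arithmetic inside the stand-in stages. The paper's modules are trained convolutional networks, which this sketch does not attempt to reproduce; it only shows how the output of the first stage feeds the second.

```python
from typing import Callable, List, Sequence

# Hypothetical representation: one frame = list of non-negative magnitude features.
Frame = List[float]
Stage = Callable[[Sequence[Frame]], List[Frame]]


def two_stage_separate(noisy_mixture: Sequence[Frame],
                       denoise: Stage,
                       separate: Stage) -> List[Frame]:
    """Apply the two stages in succession: denoising, then cochannel separation."""
    # Stage 1: estimate the noise-free two-talker mixture.
    mixture_estimate = denoise(noisy_mixture)
    # Stage 2: estimate the target speaker from the denoised mixture.
    return separate(mixture_estimate)


def toy_denoise(frames: Sequence[Frame]) -> List[Frame]:
    """Stand-in for the denoising module (NOT the paper's network).

    Subtracts a fixed, assumed noise floor from every bin and clips at zero.
    """
    floor = 0.25
    return [[max(x - floor, 0.0) for x in frame] for frame in frames]


def toy_separate(frames: Sequence[Frame]) -> List[Frame]:
    """Stand-in for the separation module (NOT the paper's network).

    Keeps the first half of the bins in each frame and zeroes the rest,
    a crude fixed mask in place of a learned one.
    """
    separated = []
    for frame in frames:
        half = len(frame) // 2
        separated.append([x if i < half else 0.0 for i, x in enumerate(frame)])
    return separated


if __name__ == "__main__":
    noisy = [[1.0, 0.125, 0.75, 0.5]]
    print(two_stage_separate(noisy, toy_denoise, toy_separate))
```

Passing the stages in as callables keeps the pipeline independent of how each module is implemented, so a trained model could replace either stand-in without changing `two_stage_separate`.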
Cite as: Tan, K., Wang, D. (2018) A Two-Stage Approach to Noisy Cochannel Speech Separation with Gated Residual Networks. Proc. Interspeech 2018, 3484-3488, doi: 10.21437/Interspeech.2018-1406
@inproceedings{tan18b_interspeech,
  author={Ke Tan and DeLiang Wang},
  title={{A Two-Stage Approach to Noisy Cochannel Speech Separation with Gated Residual Networks}},
  year=2018,
  booktitle={Proc. Interspeech 2018},
  pages={3484--3488},
  doi={10.21437/Interspeech.2018-1406}
}