Abstract:
Speaker identification (SID) in cochannel speech, where two speakers are talking simultaneously over a single recording channel, is a challenging problem. Previous studie...Show MoreMetadata
Abstract:
Speaker identification (SID) in cochannel speech, where two speakers are talking simultaneously over a single recording channel, is a challenging problem. Previous studies address this problem in the anechoic environment under the Gaussian mixture model (GMM) framework. On the other hand, cochannel SID in reverberant conditions has not been addressed. This paper studies cochannel SID in both anechoic and reverberant conditions. We first investigate GMM-based approaches and propose a combined system that integrates two cochannel SID methods. Second, we explore deep neural networks (DNNs) for cochannel SID and propose a DNN-based recognition system. Evaluation results demonstrate that our proposed systems significantly improve SID performance over recent approaches in both anechoic and reverberant conditions and various target-to-interferer ratios.
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing ( Volume: 23, Issue: 11, November 2015)