The performance of speaker verification degrades significantly when the test speech is corrupted by interference from non-target speakers. Speaker diarization separates speakers well only when their speech does not overlap. When multiple talkers speak at the same time, a technique is needed to separate the speech in the spectral domain. In this paper, we study a way to extract the target speaker’s speech from overlapped multi-talker speech. Specifically, given some reference speech samples from the target speaker, the target speaker’s speech is first extracted from the overlapped multi-talker speech, and the extracted speech is then passed to the speaker verification system. Experimental results show that the proposed approach significantly improves the performance of overlapped multi-talker speaker verification, achieving a 64.4% relative reduction in equal error rate (EER) over the zero-effort baseline.
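The headline result above is a relative EER reduction. As a reading aid, the following minimal sketch shows one common way to estimate EER from verification trial scores and to compute a relative reduction between two systems. It is not the authors' evaluation code: the function names `compute_eer` and `relative_reduction`, the threshold sweep over observed scores, and the toy scores are all assumptions made for illustration.

```python
import numpy as np


def compute_eer(target_scores, nontarget_scores):
    """Approximate the equal error rate by sweeping thresholds over observed scores.

    Assumption: higher scores mean "more likely the target speaker".
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # False rejection rate: target trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: non-target trials scoring at or above the threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    # Take the threshold where the two error rates are closest and average them.
    idx = int(np.argmin(np.abs(far - frr)))
    return float((far[idx] + frr[idx]) / 2.0)


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction of a new system with respect to a baseline."""
    return (baseline_eer - new_eer) / baseline_eer


if __name__ == "__main__":
    # Toy scores, invented for illustration only (not data from the paper).
    toy_target = [0.9, 0.8, 0.7, 0.4]
    toy_nontarget = [0.6, 0.3, 0.2, 0.1]
    eer = compute_eer(toy_target, toy_nontarget)
    print(f"toy EER: {eer:.2%}")

    # Hypothetical EER values, used only to show the formula.
    print(f"relative reduction: {relative_reduction(0.20, 0.05):.1%}")
```

Under this definition, a 64.4% relative reduction means the EER of the proposed system is 35.6% of the baseline EER. The absolute EER values are not stated in the abstract and are not assumed here.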
Cite as: Rao, W., Xu, C., Chng, E.S., Li, H. (2019) Target Speaker Extraction for Multi-Talker Speaker Verification. Proc. Interspeech 2019, 1273-1277, doi: 10.21437/Interspeech.2019-1410
@inproceedings{rao19_interspeech,
  author={Wei Rao and Chenglin Xu and Eng Siong Chng and Haizhou Li},
  title={{Target Speaker Extraction for Multi-Talker Speaker Verification}},
  year=2019,
  booktitle={Proc. Interspeech 2019},
  pages={1273--1277},
  doi={10.21437/Interspeech.2019-1410}
}