ISCA Archive Interspeech 2016

Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification

Jianbo Ma, Saad Irtza, Kaavya Sriskandaraja, Vidhyasaharan Sethu, Eliathamby Ambikairajah

Text-dependent short-duration speaker verification involves two challenges. The primary challenge is verifying the speaker’s identity; a secondary challenge is often verifying the lexical content of the pass-phrase. In this paper, we propose using two sub-systems to handle these two tasks in parallel: one sub-system models speaker identity under the assumption that the lexical content is known, and the other models lexical content in a speaker-dependent manner. The text-dependent speaker verification sub-system is based on hidden Markov models, and the lexical content verification sub-system is based on models of speech segments that use a distinct Gaussian mixture model for each segment. Furthermore, a mixture selection method based on KL divergence was applied to refine the lexical content sub-system by making the models more discriminative. Experiments on part 1 of the RedDots database showed that the proposed combination of the two sub-systems outperformed the baseline system by 39.8%, 51.1% and 37.3% in terms of the ‘imposter_correct’, ‘target_wrong’ and ‘imposter_wrong’ metrics respectively.
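
The abstract does not spell out the mixture selection rule, so the following is only an illustrative sketch of how a KL-divergence-based selection could look, not the authors' method. It assumes diagonal-covariance Gaussian components and a hypothetical criterion: keep the components of one segment model that lie farthest (in KL divergence) from every component of the competing models. The function names `kl_diag_gauss` and `select_components` and the `keep` parameter are assumptions introduced for illustration.

```python
import math


def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for two Gaussians with diagonal covariances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def select_components(segment, others, keep):
    """Hypothetical selection rule (an assumption, not taken from the paper).

    `segment` and `others` are lists of (mean, variance) pairs, one pair per
    Gaussian component. Each component of `segment` is scored by its KL
    divergence to the nearest component in `others`; the `keep` components
    with the largest scores are retained as the most discriminative ones.
    Returns the retained component indices in ascending order.
    """
    scored = []
    for idx, (mu, var) in enumerate(segment):
        nearest = min(kl_diag_gauss(mu, var, mu_o, var_o) for mu_o, var_o in others)
        scored.append((nearest, idx))
    scored.sort(reverse=True)
    return sorted(idx for _, idx in scored[:keep])


# Toy example: component 0 coincides with the competing model, component 1 is
# far from it, and component 2 is only slightly shifted.
segment = [([0.0, 0.0], [1.0, 1.0]), ([5.0, 5.0], [1.0, 1.0]), ([0.1, 0.0], [1.0, 1.0])]
others = [([0.0, 0.0], [1.0, 1.0])]
kept = select_components(segment, others, keep=2)
```

In this toy setup the component that overlaps exactly with the competing model is discarded, which is the intuition behind making segment models more discriminative.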


doi: 10.21437/Interspeech.2016-825

Cite as: Ma, J., Irtza, S., Sriskandaraja, K., Sethu, V., Ambikairajah, E. (2016) Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification. Proc. Interspeech 2016, 435-439, doi: 10.21437/Interspeech.2016-825

@inproceedings{ma16_interspeech,
  author={Jianbo Ma and Saad Irtza and Kaavya Sriskandaraja and Vidhyasaharan Sethu and Eliathamby Ambikairajah},
  title={{Parallel Speaker and Content Modelling for Text-Dependent Speaker Verification}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={435--439},
  doi={10.21437/Interspeech.2016-825}
}