ISCA Archive Odyssey 2016

Compensation for phonetic nuisance variability in speaker recognition using DNNs

Themos Stafylakis, Patrick Kenny, Vishwa Gupta, Jahangir Alam, Marcel Kockmann

In this paper, a new way of using phonetic DNNs in text-independent speaker recognition is examined. Inspired by the Subspace GMM approach to speech recognition, we try to extract i-vectors that are invariant to the phonetic content of the utterance. We overcome the assumption of Gaussian-distributed senones by combining DNN posteriors with UBM posteriors, and we derive a complete EM algorithm for training and extracting phonetic-content-compensated i-vectors. A simplified version of the model is also presented, in which the phonetic-content and speaker subspaces are learned in a decoupled way. Covariance adaptation is also examined, in which the covariance matrices are re-estimated rather than copied from the UBM. A set of preliminary experimental results is reported on NIST-SRE 2010, showing a modest improvement when fused with standard i-vectors.


doi: 10.21437/Odyssey.2016-49

Cite as: Stafylakis, T., Kenny, P., Gupta, V., Alam, J., Kockmann, M. (2016) Compensation for phonetic nuisance variability in speaker recognition using DNNs. Proc. The Speaker and Language Recognition Workshop (Odyssey 2016), 340-345, doi: 10.21437/Odyssey.2016-49

@inproceedings{stafylakis16_odyssey,
  author={Themos Stafylakis and Patrick Kenny and Vishwa Gupta and Jahangir Alam and Marcel Kockmann},
  title={{Compensation for phonetic nuisance variability in speaker recognition using DNNs}},
  year=2016,
  booktitle={Proc. The Speaker and Language Recognition Workshop (Odyssey 2016)},
  pages={340--345},
  doi={10.21437/Odyssey.2016-49}
}