Abstract
In this paper, a novel cosine similarity metric learning method based on large margin nearest neighbor (LMNN) is proposed for an i-vector based speaker verification system. In such a system, the decision is generally based on the cosine distance between the test i-vector and the target i-vector. Metric learning methods are employed to reduce the within-class variation and maximize the between-class variation. In the proposed method, a cosine similarity large margin nearest neighbor (CSLMNN) metric is learned from the development data, and the test and target i-vectors are linearly transformed using the learned metric. The objective of learning the metric is to ensure that the k-nearest neighbors that belong to the same speaker are clustered together, while impostors are moved away by a large margin. Experiments conducted on the NIST-2008 and YOHO databases show improved performance compared to a speaker verification system that uses no learned metric.
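The scoring step and the learning objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact CSLMNN formulation, so everything below is an assumption made for illustration: the function names, the parameters `k`, `margin` and `c`, and the hinge-style loss (adapted from the standard LMNN pull/push structure, with cosine similarity in place of Euclidean distance). The matrix `L` stands in for a learned linear transform.

```python
import numpy as np

def cosine_score(w_target, w_test, L=None):
    """Cosine similarity between two i-vectors, optionally after a linear map L."""
    if L is not None:
        w_target = L @ w_target
        w_test = L @ w_test
    denom = np.linalg.norm(w_target) * np.linalg.norm(w_test)
    return float(w_target @ w_test / denom)

def verify(w_target, w_test, threshold, L=None):
    """Accept the trial if the (transformed) cosine score reaches the threshold."""
    return cosine_score(w_target, w_test, L) >= threshold

def lmnn_style_cosine_loss(X, labels, L, k=1, margin=0.2, c=1.0):
    """Hypothetical LMNN-style loss on cosine similarity (not the paper's exact objective).

    Pull term: (1 - similarity) to each of the k same-speaker target neighbours.
    Push term: hinge penalty when an impostor is not at least `margin` less
    similar than a target neighbour.
    """
    labels = np.asarray(labels)
    idx = np.arange(len(X))

    # Target neighbours are fixed in the original space, as in standard LMNN.
    X0 = X / np.linalg.norm(X, axis=1, keepdims=True)
    S0 = X0 @ X0.T

    # Similarities in the transformed space.
    Z = X @ L.T
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    S = Z @ Z.T

    loss = 0.0
    for i in idx:
        same = idx[(labels == labels[i]) & (idx != i)]
        impostors = idx[labels != labels[i]]
        targets = same[np.argsort(-S0[i, same])[:k]]
        for j in targets:
            loss += 1.0 - S[i, j]
            loss += c * np.maximum(0.0, margin + S[i, impostors] - S[i, j]).sum()
    return float(loss)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 4
    L = rng.standard_normal((dim, dim))  # placeholder for a learned transform
    target = rng.standard_normal(dim)
    test = target + 0.1 * rng.standard_normal(dim)
    print("baseline score:   ", cosine_score(target, test))
    print("transformed score:", cosine_score(target, test, L))
```

In practice `L` would be obtained by minimizing such a loss over the development data, for example by gradient descent; that optimization step is omitted here.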
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ahmad, W., Karnick, H., Hegde, R.M. (2014). Cosine Distance Metric Learning for Speaker Verification Using Large Margin Nearest Neighbor Method. In: Ooi, W.T., Snoek, C.G.M., Tan, H.K., Ho, CK., Huet, B., Ngo, CW. (eds) Advances in Multimedia Information Processing – PCM 2014. PCM 2014. Lecture Notes in Computer Science, vol 8879. Springer, Cham. https://doi.org/10.1007/978-3-319-13168-9_33
DOI: https://doi.org/10.1007/978-3-319-13168-9_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13167-2
Online ISBN: 978-3-319-13168-9
eBook Packages: Computer Science, Computer Science (R0)