Abstract
This paper presents an alternative agglomerative hierarchical clustering approach to speaker diarization based on dot scoring. Dot scoring is a simple and fast technique used in speaker verification that applies a linearized procedure to score test segments against target models. In our approach, speech segments are represented by MAP-adapted GMM zero- and first-order statistics, dot scoring is applied to compute a similarity measure between segments (or clusters), and an agglomerative clustering algorithm merges clusters until no pair exceeds a similarity threshold. The system was developed for the Albayzin 2010 Speaker Diarization Evaluation on broadcast news. Results show that the lowest error rate the clustering algorithm could attain on the evaluation set was around 20%, and that over-segmentation was the main source of degradation, owing to the lack of robustness in estimating statistics for short segments.
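The clustering loop described above can be illustrated with a minimal sketch. It is not the system evaluated in the paper: it assumes each segment has already been reduced to a single vector derived from its statistics, it uses a length-normalized dot product as a stand-in for the dot-scoring similarity, and it merges two clusters by summing their vectors (an assumption motivated by the additivity of sufficient statistics). The names `dot_score`, `agglomerate`, and `threshold` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def dot_score(u, v):
    """Length-normalized dot product of two vectors (an assumed similarity form)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def agglomerate(segment_vectors, threshold):
    """Merge the most similar pair of clusters until no pair scores above threshold.

    Returns a list of clusters, each a sorted list of segment indices.
    """
    # Each cluster keeps its member indices and the sum of its members' vectors.
    # Summing on merge is an assumption, not a detail taken from the paper.
    clusters = [([i], list(v)) for i, v in enumerate(segment_vectors)]
    while len(clusters) > 1:
        # Score every pair of clusters and keep the best-scoring one.
        best_score, (i, j) = max(
            (dot_score(clusters[i][1], clusters[j][1]), (i, j))
            for i, j in combinations(range(len(clusters)), 2)
        )
        if best_score <= threshold:
            break  # no pair exceeds the similarity threshold
        members = clusters[i][0] + clusters[j][0]
        merged = [a + b for a, b in zip(clusters[i][1], clusters[j][1])]
        del clusters[j]  # j > i, so index i is still valid
        clusters[i] = (members, merged)
    return [sorted(members) for members, _ in clusters]


if __name__ == "__main__":
    # Four toy "segments": two point mostly along one axis, two along the other.
    toy = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.0, 0.8]]
    print(agglomerate(toy, threshold=0.8))
```

In this sketch a threshold set too high leaves every segment in its own cluster, which is the over-segmentation behaviour the abstract identifies as the main source of degradation.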
This work has been supported by the University of the Basque Country under Grant GIU10/18 and the Government of the Basque Country, under program SAIOTEK (project S-PE10UN87), and the Spanish MICINN, under Plan Nacional de I+D+i (project TIN2009-07446, partially financed by FEDER funds).
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Diez, M., Penagarikano, M., Varona, A., Rodriguez-Fuentes, L.J., Bordel, G. (2011). On the Use of Dot Scoring for Speaker Diarization. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_76
DOI: https://doi.org/10.1007/978-3-642-21257-4_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21256-7
Online ISBN: 978-3-642-21257-4
eBook Packages: Computer Science, Computer Science (R0)