On the Use of Dot Scoring for Speaker Diarization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6669)

Abstract

In this paper, an alternative agglomerative hierarchical clustering approach to speaker diarization, based on dot scoring, is presented. Dot scoring is a simple and fast technique used in speaker verification that relies on a linearized procedure to score test segments against target models. In our speaker diarization approach, speech segments are represented by MAP-adapted GMM zero- and first-order statistics, dot scoring is applied to compute a similarity measure between segments (or clusters), and an agglomerative clustering algorithm is then applied until no pair of clusters exceeds a similarity threshold. This diarization system was developed for the Albayzin 2010 Speaker Diarization Evaluation on broadcast news. Results show that the lowest error rate the clustering algorithm could attain on the evaluation set was around 20%, and that over-segmentation was the main source of degradation, owing to the lack of robustness in the estimation of statistics for short segments.
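The pipeline outlined in the abstract (statistics per segment, a dot-product similarity, and threshold-stopped agglomerative merging) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the relevance factor of 16, the symmetrized and length-normalized score, and the diagonal-covariance background model are all assumptions made for illustration, and any channel compensation or score normalization used in the actual system is omitted.

```python
import numpy as np


def sufficient_stats(frames, ubm_means, ubm_vars, ubm_weights):
    """Zero- and first-order statistics of frames (T x D) against a diagonal GMM.

    Returns n (C,) and f (C x D); f is centered on the background means and
    scaled by the background standard deviations.
    """
    diff = frames[:, None, :] - ubm_means[None, :, :]            # T x C x D
    log_lik = -0.5 * np.sum(diff ** 2 / ubm_vars
                            + np.log(2 * np.pi * ubm_vars), axis=2)
    log_post = np.log(ubm_weights) + log_lik                     # T x C
    log_post -= log_post.max(axis=1, keepdims=True)              # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                      # component posteriors
    n = post.sum(axis=0)
    f = np.einsum("tc,tcd->cd", post, diff) / np.sqrt(ubm_vars)
    return n, f


def dot_score(stats_a, stats_b, relevance=16.0):
    """Symmetrized linear score between two segments or clusters.

    Assumption: each side acts once as the "model" (MAP-style mean offset
    f / (n + relevance)) and once as the "test" (first-order statistics
    divided by the frame count).
    """
    n_a, f_a = stats_a
    n_b, f_b = stats_b
    offset_a = f_a / (n_a[:, None] + relevance)
    offset_b = f_b / (n_b[:, None] + relevance)
    a_vs_b = np.sum(offset_a * f_b) / n_b.sum()
    b_vs_a = np.sum(offset_b * f_a) / n_a.sum()
    return 0.5 * (a_vs_b + b_vs_a)


def cluster(stats, threshold):
    """Merge the most similar pair until no pair scores above threshold."""
    clusters = [([i], s) for i, s in enumerate(stats)]
    while len(clusters) > 1:
        best, pair = -np.inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = dot_score(clusters[i][1], clusters[j][1])
                if score > best:
                    best, pair = score, (i, j)
        if best <= threshold:
            break
        i, j = pair
        (members_a, (n_a, f_a)), (members_b, (n_b, f_b)) = clusters[i], clusters[j]
        # Statistics are additive, so a merged cluster needs no re-estimation.
        clusters[i] = (members_a + members_b, (n_a + n_b, f_a + f_b))
        del clusters[j]
    return [sorted(members) for members, _ in clusters]
```

Because the statistics are sums over frames, merging two clusters only adds their `n` and `f` arrays. The same additivity suggests why short segments are fragile in a scheme like this: with few frames, `n` is small and `f` is noisy, so the score between two short segments of the same speaker can fall below the threshold and leave them unmerged.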

This work has been supported by the University of the Basque Country under Grant GIU10/18 and the Government of the Basque Country, under program SAIOTEK (project S-PE10UN87), and the Spanish MICINN, under Plan Nacional de I+D+i (project TIN2009-07446, partially financed by FEDER funds).

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Diez, M., Penagarikano, M., Varona, A., Rodriguez-Fuentes, L.J., Bordel, G. (2011). On the Use of Dot Scoring for Speaker Diarization. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_76

  • DOI: https://doi.org/10.1007/978-3-642-21257-4_76

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21256-7

  • Online ISBN: 978-3-642-21257-4

  • eBook Packages: Computer Science, Computer Science (R0)
