
The Diarization System for an Unknown Number of Speakers

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Abstract

This paper presents a speaker diarization system that can be used when the number of speakers is unknown. The proposed system is based on agglomerative clustering in conjunction with factor analysis, the Total Variability approach, and linear discriminant analysis. We present the results of the proposed diarization system. The results demonstrate that the system can be used both when an answering machine or a handset transfer is present in telephone recordings and in the case of a summed channel in telephone or meeting recordings.
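
To illustrate how agglomerative clustering can cope with an unknown number of speakers, the following is a minimal sketch and not the authors' implementation. It assumes each speech segment has already been reduced to a fixed-length vector (the abstract names factor analysis, Total Variability and linear discriminant analysis for that stage; here synthetic vectors stand in for them). The cosine distance, average linkage, the `stop_threshold` value and all function names are assumptions made for the example. Merging stops once the closest pair of clusters is farther apart than the threshold, so the number of speakers falls out of the data rather than being fixed in advance.

```python
import numpy as np


def cosine_distance_matrix(x):
    """Pairwise cosine distances between the rows of x."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T


def agglomerative_diarize(vectors, stop_threshold=0.5):
    """Average-linkage agglomerative clustering with a distance stop rule.

    vectors: (n_segments, dim) array of per-segment representations.
    stop_threshold: hypothetical tuning value; merging stops when the
        closest pair of clusters is farther apart than this.
    Returns one integer speaker label per segment.
    """
    dist = cosine_distance_matrix(np.asarray(vectors, dtype=float))
    clusters = [[i] for i in range(len(dist))]  # start: one cluster per segment

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > stop_threshold:
            break  # remaining clusters are treated as distinct speakers
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = np.empty(len(dist), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


# Synthetic stand-in data: three "speakers", four noisy segments each.
rng = np.random.default_rng(0)
centers = np.eye(3)
segments = np.repeat(centers, 4, axis=0) + 0.05 * rng.standard_normal((12, 3))
labels = agglomerative_diarize(segments, stop_threshold=0.5)
print("estimated number of speakers:", len(set(labels.tolist())))
```

In practice the threshold would have to be tuned on held-out recordings, since it directly controls how many speakers are reported.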




Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Kudashev, O., Kozlov, A. (2013). The Diarization System for an Unknown Number of Speakers. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_45


  • DOI: https://doi.org/10.1007/978-3-319-01931-4_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01930-7

  • Online ISBN: 978-3-319-01931-4

  • eBook Packages: Computer Science, Computer Science (R0)
