Abstract
This paper presents a speaker diarization system that can be used when the number of speakers is unknown. The proposed system is based on agglomerative clustering combined with factor analysis, the Total Variability approach, and linear discriminant analysis. We present the results of the proposed diarization system. They demonstrate that the system can be used both when an answering machine or handset transfer is present in telephone recordings and when telephone or meeting recordings are provided as a summed channel.
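The abstract does not give implementation details, so the following is only a minimal sketch of how agglomerative clustering can handle an unknown number of speakers, not the authors' method. It assumes each speech segment has already been mapped to a fixed-length, non-zero vector (for example, a Total Variability vector after an LDA projection). The cosine distance, the average linkage, the function names, and the threshold value of 0.5 are all assumptions made for illustration. Merging stops once no pair of clusters is closer than the threshold, so the number of speakers falls out of the stopping rule instead of being set in advance.

```python
import numpy as np


def cosine_distance_matrix(x):
    """Pairwise cosine distances between the rows of x (rows must be non-zero)."""
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T


def agglomerative_diarization(vectors, threshold=0.5):
    """Cluster segment vectors bottom-up and return one speaker label per segment.

    Hypothetical helper: `threshold` is an assumed stopping value, not one
    taken from the paper.
    """
    x = np.asarray(vectors, dtype=float)
    dist = cosine_distance_matrix(x)
    # Start with every segment in its own cluster.
    clusters = [[i] for i in range(len(x))]

    while len(clusters) > 1:
        best_pair = None
        best_dist = threshold
        # Find the closest pair of clusters under average linkage.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best_dist:
                    best_dist = d
                    best_pair = (a, b)
        # Stop when no pair is closer than the threshold.
        if best_pair is None:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    labels = np.empty(len(x), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels
```

Calling `agglomerative_diarization(segment_vectors)` returns an integer label per segment, and the number of distinct labels is the estimated speaker count. In practice the threshold would have to be tuned on held-out data.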
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Kudashev, O., Kozlov, A. (2013). The Diarization System for an Unknown Number of Speakers. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_45
DOI: https://doi.org/10.1007/978-3-319-01931-4_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)