The Diarization System for an Unknown Number of Speakers

Kudashev, Oleg; Kozlov, Alexander

doi:10.1007/978-3-319-01931-4_45


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Included in the following conference series:

International Conference on Speech and Computer


Abstract

This paper presents a speaker diarization system that can be used when the number of speakers is unknown. The proposed system is based on agglomerative clustering in conjunction with factor analysis, the Total Variability approach, and linear discriminant analysis. We report results for the proposed system, which demonstrate that it can be used both when an answering machine or handset transfer is present in telephone recordings and in the case of a summed channel in telephone or meeting recordings.
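
The clustering idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each speech segment has already been reduced to a fixed-length vector (the toy values below are hypothetical stand-ins for such vectors), uses cosine distance between cluster means as the merge criterion, and stops merging at an arbitrary threshold, so the number of speakers is an output rather than an input. The factor analysis, Total Variability, and linear discriminant analysis steps are not shown.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerative_cluster(segment_vectors, stop_threshold):
    """Merge the closest pair of clusters until no pair is within stop_threshold.

    Returns a list of clusters, each a sorted list of segment indices.
    The stopping rule (an assumption of this sketch) is what lets the
    number of clusters, i.e. speakers, remain unknown in advance.
    """
    clusters = [[i] for i in range(len(segment_vectors))]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            ci = mean_vector([segment_vectors[k] for k in clusters[i]])
            for j in range(i + 1, len(clusters)):
                cj = mean_vector([segment_vectors[k] for k in clusters[j]])
                d = cosine_distance(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > stop_threshold:
            break  # remaining clusters are treated as distinct speakers
        _, i, j = best
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


# Hypothetical per-segment vectors: two pairs of similar segments and one outlier.
segments = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.1, 1.0],
]
clusters = agglomerative_cluster(segments, stop_threshold=0.3)
print(len(clusters), clusters)
```

With these toy vectors the procedure stops at three clusters. In a real system the threshold would have to be tuned on held-out data, and the quality of the segment vectors matters far more than the merge loop itself.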



Author information

Authors and Affiliations

Mechanics and Optics, National Research University of Information Technologies, St. Petersburg, Russia
Oleg Kudashev
STC-innovations Ltd., St. Petersburg, Russia
Alexander Kozlov


Editor information

Editors and Affiliations

Faculty of Applied Sciences, Department of Cybernetics, University of West Bohemia, Univerzitní 8, 306 14, Plzeň, Czech Republic
Miloš Železný
University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal
Speech and Multimodal Interfaces Laboratory, St. Petersburg Institute of Informatics and Automation for the Russian Academy of Sciences, 14-th line, 39, 199178, St. Petersburg, Russia
Andrey Ronzhin

About this paper

Cite this paper

Kudashev, O., Kozlov, A. (2013). The Diarization System for an Unknown Number of Speakers. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_45


DOI: https://doi.org/10.1007/978-3-319-01931-4_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)
