Abstract
Differences between the acoustic characteristics of speakers can be used to segment conversational speech. This paper presents an unsupervised speech segmentation algorithm and compares two distance measures: the Euclidean distance and a distance based on the GLR (Generalized Likelihood Ratio) combined with a duration model. The latter measure uses the likelihood ratio to describe the similarity between segments, and a text-independent two-speaker verification system shows that it is effective at verifying segment points because it is sensitive to speaker changes.
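The two distance measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so the following assumes each segment is modelled by a single diagonal-covariance Gaussian fitted by maximum likelihood, takes the Euclidean distance between segment means, and omits the duration model entirely. The function names and the variance floor are hypothetical choices made for this illustration only.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[float]  # one acoustic feature vector (e.g. cepstral coefficients)


def mean_and_var(frames: List[Frame], floor: float = 1e-8) -> Tuple[List[float], List[float]]:
    """Per-dimension ML mean and variance; variances are floored (assumption)."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    var = [
        max(sum((f[k] - mean[k]) ** 2 for f in frames) / n, floor)
        for k in range(dim)
    ]
    return mean, var


def log_det(var: List[float]) -> float:
    """Log-determinant of a diagonal covariance matrix."""
    return sum(math.log(v) for v in var)


def euclidean_distance(x: List[Frame], y: List[Frame]) -> float:
    """Euclidean distance between the mean vectors of two segments."""
    mx, _ = mean_and_var(x)
    my, _ = mean_and_var(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mx, my)))


def glr_distance(x: List[Frame], y: List[Frame]) -> float:
    """GLR distance under the single-Gaussian assumption.

    Compares "x and y come from one model" against "x and y come from
    two separate models". With ML Gaussian estimates this reduces to
    0.5 * (N_z log|S_z| - N_x log|S_x| - N_y log|S_y|), where z is the
    concatenation of x and y. Larger values suggest a speaker change.
    """
    _, vx = mean_and_var(x)
    _, vy = mean_and_var(y)
    _, vz = mean_and_var(list(x) + list(y))
    nx, ny = len(x), len(y)
    return 0.5 * ((nx + ny) * log_det(vz) - nx * log_det(vx) - ny * log_det(vy))


if __name__ == "__main__":
    # Toy one-dimensional "features" for two clearly different segments.
    seg_a = [[0.0], [0.1], [-0.1], [0.05]]
    seg_b = [[5.0], [5.1], [4.9], [5.05]]
    print("Euclidean:", euclidean_distance(seg_a, seg_b))
    print("GLR:", glr_distance(seg_a, seg_b))
```

In a segmentation setting, a distance like this would typically be evaluated between adjacent analysis windows, with peaks treated as candidate speaker-change points. Unlike the Euclidean distance between means, the GLR form also reacts to differences in variance, which is one plausible reason for its sensitivity to speaker changes.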
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chen, Y., Wang, Q. (2007). A Speaker Based Unsupervised Speech Segmentation Algorithm Used in Conversational Speech. In: Zhang, Z., Siekmann, J. (eds) Knowledge Science, Engineering and Management. KSEM 2007. Lecture Notes in Computer Science, vol 4798. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76719-0_39
DOI: https://doi.org/10.1007/978-3-540-76719-0_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76718-3
Online ISBN: 978-3-540-76719-0
eBook Packages: Computer Science, Computer Science (R0)