Error Prediction-Based Semi-automatic Segmentation of Speech Databases

Szymański, Marcin; Grocholewski, Stefan

doi:10.1007/978-3-540-87391-4_69

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5246)

Abstract

Manual segmentation of speech databases still outperforms automatic segmentation algorithms, while the quality of the resulting synthetic voice depends on the accuracy of the phonetic segmentation. In this paper we describe a semi-automatic speech segmentation procedure in which a human expert manually places selected boundaries before the rest of the corpus is segmented automatically. A segmentation error predictor is designed, estimated, and then used to generate a sequence of manual annotations to be made by an expert. The resulting error response curves are significantly better than those obtained with random segmentation strategies. Results are presented for two different Polish corpora.

Author information

Authors and Affiliations

Laboratory of Speech and Language Technology, Adam Mickiewicz University Foundation, ul. Rubież 46, 61–612, Poznań, Poland
Marcin Szymański
Institute of Computing Science, Poznan University of Technology, ul. Piotrowo 2, 60–965, Poznań, Poland
Marcin Szymański & Stefan Grocholewski

Editor information

Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala

Cite this paper

Szymański, M., Grocholewski, S. (2008). Error Prediction-Based Semi-automatic Segmentation of Speech Databases. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_69

DOI: https://doi.org/10.1007/978-3-540-87391-4_69
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science, Computer Science (R0)
