Error Prediction-Based Semi-automatic Segmentation of Speech Databases

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5246)

Abstract

Manual segmentation of speech databases still outperforms automatic segmentation algorithms, and the quality of the resulting synthetic voice depends on the accuracy of the phonetic segmentation. In this paper we describe a semi-automatic speech segmentation procedure in which a human expert manually places selected boundaries before the rest of the corpus is segmented automatically. A segmentation error predictor is designed, estimated, and then used to generate the sequence of manual annotations to be made by the expert. The resulting error response curves are significantly better than those obtained with random segmentation strategies. Results are presented for two different Polish corpora.
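The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an error response curve is the mean remaining boundary error after each manual correction, that a manually corrected boundary has zero error afterwards, and that the predictor's output is already available as a list of numbers. The function names and the toy values are hypothetical, and the sketch ignores any effect that manual boundaries may have on the automatic segmentation of the remaining corpus.

```python
import random


def error_response_curve(true_errors, order):
    """Mean remaining absolute error after each manual correction.

    true_errors: absolute error of each automatic boundary (e.g. in ms).
    order: boundary indices in the order the expert corrects them.
    Assumption: a corrected boundary contributes zero error afterwards.
    """
    remaining = sum(true_errors)
    n = len(true_errors)
    curve = [remaining / n]  # error before any manual work
    for idx in order:
        remaining -= true_errors[idx]
        curve.append(remaining / n)
    return curve


def predicted_order(predicted_errors):
    """Boundary indices sorted by predicted error, largest first."""
    return sorted(
        range(len(predicted_errors)),
        key=lambda i: predicted_errors[i],
        reverse=True,
    )


def random_order(n, seed=0):
    """Baseline: boundaries corrected in a random order."""
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    return order


# Toy values invented for illustration only (not from the paper).
true_errors = [2.0, 15.0, 4.0, 30.0, 1.0]
predicted_errors = [3.0, 12.0, 5.0, 25.0, 2.0]

guided = error_response_curve(true_errors, predicted_order(predicted_errors))
baseline = error_response_curve(true_errors, random_order(len(true_errors)))
```

In this toy case the predicted ranking matches the true ranking, so the guided curve lies on or below the random baseline at every step. With an imperfect predictor the gap would be smaller.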

Editor information

Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Szymański, M., Grocholewski, S. (2008). Error Prediction-Based Semi-automatic Segmentation of Speech Databases. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_69

  • DOI: https://doi.org/10.1007/978-3-540-87391-4_69

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87390-7

  • Online ISBN: 978-3-540-87391-4

  • eBook Packages: Computer Science, Computer Science (R0)
