Abstract
Building unit-selection speech synthesisers requires precise annotation of large speech corpora. Manual segmentation of speech is very laborious, hence the need for automatic segmentation algorithms. Because the common HMM-based method was observed to be prone to systematic errors, boundary refinement approaches such as boundary-specific correction were introduced.
Last year, a dynamic programming fine-tuning approach was proposed that combined two sources of information: the boundary error distribution and statistical models of boundary MFCCs. In this paper we verify the usefulness of incorporating additional sources of information: boundary energy dynamics models and signal periodicity information.
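The general idea of dynamic programming boundary fine-tuning can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate window, the minimum-gap constraint, the Gaussian score terms, and the names refine_boundaries, make_log_score and gaussian_log_pdf are all assumptions made for illustration. Each initial boundary may shift within a small window, each candidate position receives a log-score summed over the available information sources, and a Viterbi-style search picks the jointly best placement that keeps the boundaries in order.

```python
import math
from typing import Callable, List, Sequence


def gaussian_log_pdf(x: float, mean: float, std: float) -> float:
    """Log-density of a normal distribution (stand-in for any trained model)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def make_log_score(initial, acoustic_peaks, shift_std=2.0, acoustic_std=1.0):
    """Hypothetical score: shift prior plus an acoustic boundary term.

    Both terms are toy Gaussians; real systems would plug in trained models.
    """
    def log_score(i: int, pos: int) -> float:
        prior = gaussian_log_pdf(pos - initial[i], 0.0, shift_std)
        acoustic = gaussian_log_pdf(pos, acoustic_peaks[i], acoustic_std)
        return prior + acoustic
    return log_score


def refine_boundaries(
    initial: Sequence[int],
    offsets: Sequence[int],
    log_score: Callable[[int, int], float],
    min_gap: int = 1,
) -> List[int]:
    """Choose one shifted position per boundary, maximising the total log-score
    subject to consecutive boundaries being at least min_gap frames apart."""
    n = len(initial)
    if n == 0:
        return []
    cands = [[b + d for d in offsets] for b in initial]
    best = [[log_score(0, p) for p in cands[0]]]
    back = [[-1] * len(cands[0])]
    for i in range(1, n):
        row, ptr = [], []
        for p in cands[i]:
            top, arg = -math.inf, -1
            # Best predecessor position that respects the ordering constraint.
            for j, q in enumerate(cands[i - 1]):
                if q + min_gap <= p and best[i - 1][j] > top:
                    top, arg = best[i - 1][j], j
            row.append(top + log_score(i, p) if arg >= 0 else -math.inf)
            ptr.append(arg)
        best.append(row)
        back.append(ptr)
    k = max(range(len(cands[-1])), key=lambda j: best[-1][j])
    if best[-1][k] == -math.inf:
        raise ValueError("no ordering-consistent boundary placement exists")
    path = []
    for i in range(n - 1, -1, -1):  # follow back-pointers to recover the path
        path.append(cands[i][k])
        k = back[i][k]
    return path[::-1]
```

Further information sources, such as an energy dynamics model or a periodicity measure, would enter this sketch simply as extra terms added inside log_score; how the sources are actually weighted and combined in the paper is not stated in the abstract.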
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Szymański, M., Grocholewski, S. (2006). Post-processing of Automatic Segmentation of Speech Using Dynamic Programming. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2006. Lecture Notes in Computer Science, vol 4188. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846406_66
DOI: https://doi.org/10.1007/11846406_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-39090-9
Online ISBN: 978-3-540-39091-6
eBook Packages: Computer Science (R0)