The Sound Database Formation for the Allophone-Based Model for English Concatenative Speech Synthesis

Evgrafova, Karina

doi:10.1007/11551874_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3658)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper describes the development of the sound database for the allophone-based model for English concatenative speech synthesis. The procedure for constructing the sound unit inventory is described and its main results are presented. At present, the optimized sound unit inventory of the allophonic database contains 1200 elements (1000 vowel allophones and 200 consonant allophones). The smoothness of the junctions between allophones indicates that the segmentation is of high quality. The reduction in the number of database components as a result of optimization does not affect the quality of the synthesized speech, which at the segmental level can be evaluated as fairly high in terms of both naturalness and intelligibility.
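As a rough illustration of what allophone-based concatenation means in general, the sketch below looks up stored units in an inventory and joins them end to end without any smoothing. It is not the paper's implementation: the key format (phoneme, left context, right context), the function name `synthesize`, and the toy sample values are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical key: (phoneme, left context, right context). The paper's
# actual allophone classification is not given in the abstract.
AllophoneKey = Tuple[str, str, str]


def synthesize(keys: List[AllophoneKey],
               inventory: Dict[AllophoneKey, List[float]]) -> List[float]:
    """Concatenate stored allophone waveforms in order, with no smoothing."""
    samples: List[float] = []
    for key in keys:
        if key not in inventory:
            raise KeyError(f"allophone not in inventory: {key}")
        samples.extend(inventory[key])
    return samples


# Toy inventory with made-up sample values (not real speech data).
inventory = {
    ("k", "#", "ae"): [0.0, 0.1],
    ("ae", "k", "t"): [0.3, 0.5, 0.3],
    ("t", "ae", "#"): [0.1, 0.0],
}
word = [("k", "#", "ae"), ("ae", "k", "t"), ("t", "ae", "#")]
signal = synthesize(word, inventory)
```

Because the units are joined without crossfading in this sketch, any discontinuity at a junction would come directly from where the units were cut, which is why junction smoothness can serve as a check on segmentation quality.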

Author information

Authors and Affiliations

Department of Phonetics, St-Petersburg State University, Universitetskaya emb. 11, Saint-Petersburg, Russia
Karina Evgrafova

Editor information

Editors and Affiliations

Department of Computer Science, University of West Bohemia in Pilsen, Univerzitni 8, 30614, Plzen, Czech Republic
Václav Matoušek, Pavel Mautner & Tomáš Pavelka

About this paper

Cite this paper

Evgrafova, K. (2005). The Sound Database Formation for the Allophone-Based Model for English Concatenative Speech Synthesis. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_28

DOI: https://doi.org/10.1007/11551874_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0
eBook Packages: Computer Science, Computer Science (R0)
