Hungarian Speech Synthesis Using a Phase Exact HNM Approach

Kovács, Kornél; Kocsor, András; Tóth, László

doi:10.1007/3-540-36137-5_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2540)

Included in the following conference series:

International Conference on Current Trends in Theory and Practice of Computer Science

Abstract

Unnatural-sounding speech prevents listeners from recognizing the message of the signal. In this paper we demonstrate how a precise approximation of the initial phases can improve the naturalness of artificially generated speech. We use the Harmonic plus Noise Model (HNM) of Stylianou as the framework for a Hungarian speech synthesizer and show that the system can easily be extended with exact initial phases. The proposed method turns out to preserve sound characteristics and quality better than the original method.
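The abstract does not give the model equations, so the following is only a rough illustration and not the authors' implementation. It assumes the usual sinusoidal form of a harmonic part, a sum of cosines at integer multiples of the fundamental frequency, each with its own amplitude and initial phase. The function name `synthesize_harmonics` and all numeric values are hypothetical. The sketch shows that two frames with identical harmonic amplitudes but different initial phases produce different waveforms, which is why the choice of initial phase matters.

```python
import math


def synthesize_harmonics(f0, amplitudes, phases, sample_rate, num_samples):
    """Sum harmonics k*f0 (k = 1, 2, ...) with per-harmonic amplitude and initial phase."""
    signal = []
    for n in range(num_samples):
        t = n / sample_rate
        sample = sum(
            a * math.cos(2.0 * math.pi * k * f0 * t + phi)
            for k, (a, phi) in enumerate(zip(amplitudes, phases), start=1)
        )
        signal.append(sample)
    return signal


# Hypothetical frame: 100 Hz fundamental, three harmonics, 8 kHz sampling,
# 80 samples (one pitch period).
f0, sample_rate, num_samples = 100.0, 8000, 80
amplitudes = [1.0, 0.5, 0.25]
measured_phases = [0.3, -1.2, 2.0]  # made-up values standing in for analysed phases
zero_phases = [0.0, 0.0, 0.0]       # phases discarded

with_phase = synthesize_harmonics(f0, amplitudes, measured_phases, sample_rate, num_samples)
without_phase = synthesize_harmonics(f0, amplitudes, zero_phases, sample_rate, num_samples)

# Same amplitude spectrum, different waveform shape.
max_diff = max(abs(x - y) for x, y in zip(with_phase, without_phase))
print(f"largest sample difference: {max_diff:.3f}")
```

In this toy setting both signals have the same harmonic amplitudes, and only the phases differ. Whether such a difference is audible for a given voice is exactly the kind of question the paper addresses.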

References

Allen, J.: Overview of Text-to-Speech systems, in S. Furui and M. Sondhi, editors, Advances in Speech Signal Processing, pp. 741–790, 1991.
Dutoit, T.: High quality text-to-speech synthesis: A comparison of four candidate algorithms, Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 565–568, 1994.
Dutoit, T., Leich, H.: Text-To-Speech synthesis based on a MBE re-synthesis of the segments database, Speech Communication 13, pp. 435–440, 1993.
Gimenez de los Galanes, F.M., Savoji, M.H., Pardo, J.M.: New algorithm for spectral smoothing and envelope modification for LP-PSOLA synthesis, Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 573–576, 1994.
Klatt, D.R.: Review of text-to-speech conversion for English, J. Acoust. Soc. Am. 82(3), pp. 737–793, September 1987.
Kocsor, A., Tóth, L., Bálint, I.: On the Optimal Parameters of a Sinusoidal Representation of Signals, Acta Cybernetica 14, pp. 315–330, 1999.
McAulay, R.J., Quatieri, T.F.: Speech Analysis/Synthesis based on a sinusoidal representation, IEEE Trans. Acoust., Speech, Signal Processing ASSP-34(4), pp. 744–754, August 1986.
Moulines, E., Charpentier, F.: Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Communication 9(5/6), pp. 453–467, December 1990.
Rabiner, L.R.: Applications of Voice Processing to Telecommunications, Proc. IEEE 82(2), pp. 199–228, February 1994.
Serra, X.: A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition, PhD thesis, Stanford University, Stanford, CA, 1989.
Stylianou, Y.: Harmonic plus Noise Model for Speech, combined with Statistical Methods, for Speech and Speaker Modification, PhD thesis, 1996.

Author information

Authors and Affiliations

Research Group on Artificial Intelligence of the Hungarian Academy of Sciences, Hungary
Kornél Kovács, András Kocsor & László Tóth
University of Szeged, H-6720 Szeged, Aradi vértanúk tere 1, Hungary
Kornél Kovács, András Kocsor & László Tóth

Editor information

Editors and Affiliations

Department of Computer and Information Science, University of Michigan - Dearborn, 4901 Evergreen Road, Dearborn, 48128, Michigan, USA
William I. Grosky
Department of Software Engineering School of Computer Science, Charles University, Malostranské nám. 25, 118 00, Prague, Czech Republic
František Plášil

About this paper

Cite this paper

Kovács, K., Kocsor, A., Tóth, L. (2002). Hungarian Speech Synthesis Using a Phase Exact HNM Approach. In: Grosky, W.I., Plášil, F. (eds) SOFSEM 2002: Theory and Practice of Informatics. SOFSEM 2002. Lecture Notes in Computer Science, vol 2540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36137-5_12

DOI: https://doi.org/10.1007/3-540-36137-5_12
Published: 27 November 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00145-4
Online ISBN: 978-3-540-36137-4
eBook Packages: Springer Book Archive
