Abstract
Unnatural-sounding speech prevents listeners from recognizing the message carried by the signal. In this paper we demonstrate how a precise approximation of the initial phases can improve the naturalness of artificially generated speech. With the Harmonic plus Noise Model (HNM) of Stylianou as the framework for a Hungarian speech synthesis system, the extension of the system to exact initial phases is easy to carry out. The proposed method turns out to preserve sound characteristics and quality more effectively than the original model.
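To illustrate why the initial phases matter, here is a minimal sketch of the harmonic part of a sinusoidal model, written as a sum of cosines at multiples of the fundamental frequency. The abstract gives no formulas, so this is the generic textbook form and not the authors' implementation: the function names, the amplitudes, and the "measured" phase values are all hypothetical, and the noise part of the model is omitted. The sketch compares a frame synthesized with the given initial phases against one synthesized with all phases set to zero.

```python
import math


def harmonic_frame(f0, amps, phases, sample_rate, n_samples):
    """Sum of harmonics k*f0 with amplitude amps[k-1] and initial phase phases[k-1]."""
    frame = []
    for n in range(n_samples):
        t = n / sample_rate
        frame.append(sum(
            a * math.cos(2.0 * math.pi * k * f0 * t + phi)
            for k, (a, phi) in enumerate(zip(amps, phases), start=1)
        ))
    return frame


def harmonic_magnitudes(frame, f0, sample_rate, n_harmonics):
    """Magnitude at each harmonic by direct projection.

    Assumes the frame spans a whole number of fundamental periods.
    """
    n = len(frame)
    mags = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / sample_rate
        re = sum(x * math.cos(w * i) for i, x in enumerate(frame))
        im = sum(x * math.sin(w * i) for i, x in enumerate(frame))
        mags.append(2.0 * math.hypot(re, im) / n)
    return mags


# Hypothetical analysis result for one voiced frame (illustrative values only).
f0 = 100.0            # fundamental frequency in Hz
sample_rate = 8000    # samples per second
n_samples = 80        # exactly one fundamental period at these settings
amps = [1.0, 0.5, 0.25]
measured_phases = [0.3, -1.2, 2.0]   # assumed "exact" initial phases
zero_phases = [0.0, 0.0, 0.0]        # phases discarded

exact = harmonic_frame(f0, amps, measured_phases, sample_rate, n_samples)
approx = harmonic_frame(f0, amps, zero_phases, sample_rate, n_samples)

mags_exact = harmonic_magnitudes(exact, f0, sample_rate, len(amps))
mags_approx = harmonic_magnitudes(approx, f0, sample_rate, len(amps))
max_waveform_diff = max(abs(a - b) for a, b in zip(exact, approx))
```

Both frames have the same harmonic magnitudes, yet their waveforms differ sample by sample. This shows that a magnitude-only description does not determine the waveform shape, which is the information an exact initial phase estimate keeps.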
References
Allen, J.: Overview of Text-to-Speech systems, In S. Furui and M. Sondhi, editors, Advances in Speech Signal Processing, pp. 741–790, 1991.
Dutoit, T.: High quality text-to-speech synthesis: A comparison of four candidate algorithms, Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 565–568, 1994.
Dutoit, T., Leich, H.: Text-To-Speech synthesis based on a MBE re-synthesis of the segments database, Speech Communication 13, pp. 435–440, 1993.
Gimenez de los Galanes, F.M., Savoji, M.H., Pardo, J.M.: New algorithm for spectral smoothing and envelope modification for LP-PSOLA synthesis, Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, pp. 573–576, 1994.
Klatt, D.R.: Review of text-to-speech conversion for English, J. Acoust. Soc. Am. 82(3), pp. 737–793, September 1987.
Kocsor, A., Tóth, L., Bálint, I.: On the Optimal Parameters of a Sinusoidal Representation of Signals, Acta Cybernetica 14, pp. 315–330, 1999.
McAulay, R.J., Quatieri, T.F.: Speech Analysis/Synthesis based on a sinusoidal representation, IEEE Trans. Acoust., Speech, Signal Processing ASSP-34(4), pp. 744–754, August 1986.
Moulines, E., Charpentier, F.: Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, Speech Communication 9(5/6), pp. 453–467, December 1990.
Rabiner, L.R.: Applications of Voice Processing to Telecommunications, Proc. IEEE 82(2), pp. 199–228, February 1994.
Serra, X.: A System for Sound Analysis/Transformation/Synthesis Based on a Deterministic Plus Stochastic Decomposition, PhD thesis, Stanford University, Stanford, CA, 1989.
Stylianou, Y.: Harmonic plus Noise Model for Speech, combined with Statistical Methods, for Speech and Speaker Modification, PhD thesis, 1996.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kovács, K., Kocsor, A., Tóth, L. (2002). Hungarian Speech Synthesis Using a Phase Exact HNM Approach. In: Grosky, W.I., Plášil, F. (eds) SOFSEM 2002: Theory and Practice of Informatics. SOFSEM 2002. Lecture Notes in Computer Science, vol 2540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36137-5_12
DOI: https://doi.org/10.1007/3-540-36137-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00145-4
Online ISBN: 978-3-540-36137-4
eBook Packages: Springer Book Archive