Abstract
In this paper we present a speech analysis/synthesis coder that combines linear prediction with nonlinear modeling of the residual using a regularized radial basis function (RBF) network. The model has been applied to the synthesis of sustained vowel signals and has been found to preserve the dynamics and spectra of the original speech signal. While several nonlinear speech models reportedly suffer from high-frequency losses in the synthesized speech due to system-inherent low-pass behavior, our approach achieves good speech signal reproduction even in the higher frequency ranges. The decomposition of the speech signal by linear prediction analysis supports processing during synthesis, such as pitch modifications, while the nonlinear modeling provides the means for adequate reproduction of the fine-grained dynamic characteristics of speech.
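The two-stage decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the autocorrelation-method LP analysis, the one-step prediction on a delay embedding of the residual, the Gaussian basis functions, the plain ridge (weight-decay) regularization, and all function names and parameter values (LP order 8, embedding dimension 4, 50 centers, `lam=1e-3`) are assumptions chosen for illustration, and a synthetic harmonic signal stands in for a sustained vowel.

```python
import numpy as np

def lp_coefficients(x, order):
    """Autocorrelation-method LP; x[n] is predicted by sum_k a[k] * x[n-1-k]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[idx] + 1e-9 * r[0] * np.eye(order)  # Toeplitz matrix, light diagonal loading
    return np.linalg.solve(R, r[1:])

def lp_residual(x, a):
    """Inverse (analysis) filter: e[n] = x[n] - sum_k a[k] * x[n-1-k]."""
    h = np.concatenate(([1.0], -np.asarray(a)))
    return np.convolve(x, h)[: len(x)]

def lp_synthesis(e, a):
    """All-pole synthesis filter, the inverse of lp_residual (zero initial state)."""
    p = len(a)
    x = np.zeros(len(e) + p)
    for n in range(len(e)):
        # x[n:n+p] holds the p most recent outputs; reverse to newest-first
        x[n + p] = e[n] + np.dot(a, x[n : n + p][::-1])
    return x[p:]

def delay_embed(e, dim):
    """Rows are [e[n-1], ..., e[n-dim]]; targets are e[n]."""
    X = np.column_stack([e[dim - 1 - k : len(e) - 1 - k] for k in range(dim)])
    return X, e[dim:]

def rbf_design(X, centers, width):
    """Gaussian RBF activations, one column per center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf_ridge(X, y, centers, width, lam):
    """Ridge-regularized least squares for the RBF output weights."""
    Phi = rbf_design(X, centers, width)
    A = Phi.T @ Phi + lam * np.eye(len(centers))
    return np.linalg.solve(A, Phi.T @ y)

# Synthetic stand-in for a sustained vowel (assumption: three harmonics plus noise).
rng = np.random.default_rng(0)
t = np.arange(2000)
x = sum(np.sin(2 * np.pi * h * t / 80) / h for h in (1, 2, 3))
x = x + 0.01 * rng.standard_normal(len(t))

a = lp_coefficients(x, order=8)        # linear stage
e = lp_residual(x, a)                  # signal handed to the nonlinear stage
X, y = delay_embed(e, dim=4)
centers = X[::40]                      # centers taken from the data (assumption)
width = np.std(e)
w = fit_rbf_ridge(X, y, centers, width, lam=1e-3)
mse = np.mean((rbf_design(X, centers, width) @ w - y) ** 2)
x_rec = lp_synthesis(e, a)             # exact residual in, original signal out
```

In this sketch, feeding the exact residual back through `lp_synthesis` recovers the input signal, which is the property that lets a separately modeled residual be turned back into speech; how closely a model-generated residual preserves the dynamics is the question the paper itself addresses.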
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rank, E., Kubin, G. (2001). Nonlinear Synthesis of Vowels in the LP Residual Domain with a Regularized RBF Network. In: Mira, J., Prieto, A. (eds) Bio-Inspired Applications of Connectionism. IWANN 2001. Lecture Notes in Computer Science, vol 2085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45723-2_90
DOI: https://doi.org/10.1007/3-540-45723-2_90
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42237-2
Online ISBN: 978-3-540-45723-7
eBook Packages: Springer Book Archive