Nonlinear Synthesis of Vowels in the LP Residual Domain with a Regularized RBF Network

Rank, Erhard; Kubin, Gernot

doi:10.1007/3-540-45723-2_90

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2085)

Included in the following conference series:

International Work-Conference on Artificial Neural Networks

Abstract

In this paper we present a speech analysis/synthesis coder that combines linear prediction with nonlinear modeling of the residual by a regularized radial basis function (RBF) network. The model has been applied to the synthesis of sustained vowel signals and was found to preserve the dynamics and spectra of the original speech signal. While several nonlinear speech models reportedly suffer from high-frequency losses in the synthesized speech due to system-inherent low-pass behavior, our approach achieves good speech signal reproduction even in the higher frequency ranges. The decomposition of the speech signal by linear prediction analysis supports processing during synthesis, such as pitch modification, while the nonlinear modeling provides the means for adequate reproduction of the fine-grained dynamic characteristics of speech.
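
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the LP order, the embedding dimension, the placement of RBF centers, or the form of regularization. Everything below is therefore an assumption made for illustration, including the synthetic test signal, the ridge (weight-decay) penalty `lam`, the Gaussian kernel width, and all function names. The sketch removes the linear part of a signal with an LP inverse filter, fits a regularized RBF one-step predictor to the residual, runs that predictor as a free-running oscillator, and passes the generated residual through the LP synthesis filter.

```python
import numpy as np

def lpc_coefficients(x, order):
    """LP coefficients a[1..order] from the autocorrelation normal equations."""
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def lp_residual(x, a):
    """Inverse filter: e[n] = x[n] - sum_k a[k] * x[n-k], zero initial state."""
    e = x.copy()
    for k, ak in enumerate(a, start=1):
        e[k:] -= ak * x[:-k]
    return e

def lp_synthesis(e, a):
    """All-pole filter: y[n] = e[n] + sum_k a[k] * y[n-k], zero initial state."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        y[n] = e[n] + sum(ak * y[n - k] for k, ak in enumerate(a, 1) if n >= k)
    return y

def embed(e, dim):
    """Delay vectors [e[n-1], ..., e[n-dim]] and their targets e[n]."""
    X = np.array([e[n - dim:n][::-1] for n in range(dim, len(e))])
    return X, e[dim:]

def rbf_design(X, centers, width):
    """Gaussian basis function activations, one row per input vector."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_rbf(Phi, t, lam):
    """Ridge-regularized least squares for the output weights (assumed form)."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ t)

def synthesize_residual(state, centers, width, w, n_samples):
    """Free-running oscillator: each prediction is fed back into the state."""
    state = np.array(state, dtype=float)
    out = np.empty(n_samples)
    for n in range(n_samples):
        out[n] = (rbf_design(state[None, :], centers, width) @ w)[0]
        state = np.concatenate(([out[n]], state[:-1]))
    return out

# Synthetic stand-in for a sustained vowel (assumption, not speech data):
# five harmonics of 125 Hz at 8 kHz plus a little seeded noise.
rng = np.random.default_rng(0)
fs, f0 = 8000, 125
n = np.arange(800)
x = sum(np.sin(2 * np.pi * f0 * h * n / fs) / h for h in range(1, 6))
x = x + 0.01 * rng.standard_normal(len(n))

a = lpc_coefficients(x, order=8)          # analysis: linear part
e = lp_residual(x, a)                     # analysis: residual
X, t = embed(e, dim=4)                    # delay embedding of the residual
centers, width = X[::20], np.std(e)       # illustrative center/width choice
Phi = rbf_design(X, centers, width)
w = fit_rbf(Phi, t, lam=1e-3)             # nonlinear model of the residual
e_syn = synthesize_residual(X[-1], centers, width, w, n_samples=200)
y = lp_synthesis(e_syn, a)                # synthesis: back to the signal domain
```

Because the LP filter and the residual oscillator are separate stages, either one can be altered independently during synthesis, which is the property the abstract credits for supporting modifications such as pitch changes. The sketch makes no claim about output quality on real speech.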

Author information

Authors and Affiliations

Erhard Rank: Institute of Communications and Radio-Frequency Engineering, Vienna University of Technology, Gusshausstrasse 25/E389, A-1040 Vienna, Austria
Gernot Kubin: Institute of Communications and Wave Propagation, Graz University of Technology, Inffeldgasse 16c, A-8010 Graz, Austria

Editor information

Editors and Affiliations

Departamento de Inteligencia Artificial, Universidad Nacional de Educación a Distancia, Senda del Rey, s/n., 28040, Madrid, Spain
José Mira
Departamento de Arquitectura y Tecnología de Computadores, Universidad de Granada, Campus Fuentenueva, 18071, Granada, Spain
Alberto Prieto

Cite this paper

Rank, E., Kubin, G. (2001). Nonlinear Synthesis of Vowels in the LP Residual Domain with a Regularized RBF Network. In: Mira, J., Prieto, A. (eds) Bio-Inspired Applications of Connectionism. IWANN 2001. Lecture Notes in Computer Science, vol 2085. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45723-2_90

DOI: https://doi.org/10.1007/3-540-45723-2_90
Published: 12 June 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42237-2
Online ISBN: 978-3-540-45723-7
eBook Packages: Springer Book Archive
