Abstract
In this paper we give a brief overview of nonlinear predictive models, with special emphasis on neural nets. Several well-known strategies are discussed, such as multi-start random weight initialization, regularization, early stopping with validation, committees of neural nets, and different architectures. Although the paper is devoted to ADPCM speech coding (scalar and vectorial schemes), this study offers a good opportunity to deal with nonlinear predictors as a first step towards more sophisticated applications. Thus, our main purpose is to state new possibilities for speech coding.
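The training strategies named in the abstract can be illustrated with a minimal sketch. It assumes a one-hidden-layer MLP used as a one-step-ahead predictor, trained by plain gradient descent with an L2 weight penalty (regularization), stopped on a validation set (early stopping), restarted from several random seeds (multi-start), and combined by averaging the member outputs (committee). All function names, hyperparameters, and the synthetic test signal are hypothetical choices made for illustration; they are not taken from the paper, and no quantizer or ADPCM loop is modelled.

```python
import numpy as np

def make_frames(signal, order):
    """Inputs are the `order` previous samples; the target is the next sample."""
    n = len(signal) - order
    X = np.stack([signal[i:i + n] for i in range(order)], axis=1)
    return X, signal[order:]

def forward(params, X):
    """One tanh hidden layer, linear output. Returns (prediction, hidden activations)."""
    W1, b1, w2, b2 = params
    H = np.tanh(X @ W1 + b1)
    return H @ w2 + b2, H

def train_one(X_tr, y_tr, X_va, y_va, hidden, seed,
              epochs=300, lr=0.05, weight_decay=1e-4, patience=20):
    """Gradient descent on MSE + L2 penalty, with early stopping on validation MSE."""
    rng = np.random.default_rng(seed)              # one random start per seed
    params = [rng.normal(0.0, 0.5, (X_tr.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0.0, 0.5, hidden), 0.0]
    best_err, best_params, wait = np.inf, None, 0
    n = len(y_tr)
    for _ in range(epochs):
        pred, H = forward(params, X_tr)
        err = pred - y_tr
        dH = np.outer(err, params[2]) * (1.0 - H ** 2)   # backprop through tanh
        grads = [X_tr.T @ dH * (2.0 / n) + 2.0 * weight_decay * params[0],
                 dH.sum(axis=0) * (2.0 / n),
                 H.T @ err * (2.0 / n) + 2.0 * weight_decay * params[2],
                 2.0 * err.mean()]
        params = [p - lr * g for p, g in zip(params, grads)]
        va_err = np.mean((forward(params, X_va)[0] - y_va) ** 2)
        if va_err < best_err:                      # keep the best weights seen so far
            best_err, best_params, wait = va_err, [np.copy(p) for p in params], 0
        else:
            wait += 1
            if wait >= patience:                   # validation error stopped improving
                break
    return best_params, best_err

def committee_predict(nets, X):
    """Committee output: the average of the member predictions."""
    return np.mean([forward(p, X)[0] for p in nets], axis=0)

# Synthetic stand-in signal (an assumption; not speech data).
t = np.arange(400)
signal = np.sin(0.07 * t) * np.sin(0.011 * t)
X, y = make_frames(signal, order=4)
X_tr, y_tr, X_va, y_va = X[:300], y[:300], X[300:], y[300:]

# Multi-start: train the same architecture from five random initializations.
nets = [train_one(X_tr, y_tr, X_va, y_va, hidden=6, seed=s)[0] for s in range(5)]
member_mse = [np.mean((forward(p, X_va)[0] - y_va) ** 2) for p in nets]
committee_mse = np.mean((committee_predict(nets, X_va) - y_va) ** 2)
```

Because squared error is convex, the committee's mean squared error can never exceed the average of its members' errors, which is one reason averaging several randomly initialized nets is attractive when a single training run may end in a poor local minimum.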
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Faundez-Zanuy, M. (2005). Nonlinear Speech Processing: Overview and Possibilities in Speech Coding. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_2
DOI: https://doi.org/10.1007/11520153_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27441-4
Online ISBN: 978-3-540-31886-6
eBook Packages: Computer Science, Computer Science (R0)