Nonlinear Speech Processing: Overview and Possibilities in Speech Coding

  • Conference paper
Nonlinear Speech Modeling and Applications (NN 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3445)

Abstract

In this paper we give a brief overview of nonlinear predictive models, with special emphasis on neural nets. Several well-known strategies are discussed, such as multi-start random weight initialization, regularization, early stopping with validation, committees of neural nets, and different architectures. Although the paper is devoted to ADPCM speech coding (scalar and vectorial schemes), this study offers a good opportunity to work with nonlinear predictors, as a first step towards more sophisticated applications. Thus, our main purpose is to state new possibilities for speech coding.
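
As a rough illustration of the setting the abstract describes, the following minimal sketch shows a closed-loop ADPCM codec in which the predictor is a pluggable function, so a linear predictor can be swapped for a nonlinear one. It is not the paper's scheme: the function names (`encode`, `decode`), the 3-bit uniform quantizer, the step-size multipliers, and the single-neuron `tanh` predictor with its weights are all assumptions chosen for illustration.

```python
import math

def _reconstruct(code, pred, step):
    # Mid-rise uniform quantizer: reconstruct at the centre of the cell.
    return pred + (code + 0.5) * step

def _adapt(code, step, levels):
    # Step-size adaptation driven only by the transmitted code.
    # The multipliers 1.6 and 0.9 are illustrative assumptions.
    mag = code if code >= 0 else -code - 1
    if mag == levels - 1:
        step *= 1.6      # outermost cell: likely overload, widen the step
    elif mag == 0:
        step *= 0.9      # innermost cell: narrow the step
    return min(max(step, 1e-4), 10.0)

def encode(signal, predictor, order=2, bits=3, step=0.1):
    """Return (codes, reconstruction) for a list of samples."""
    levels = 2 ** (bits - 1)
    history = [0.0] * order          # past reconstructed samples, newest last
    codes, recon = [], []
    for x in signal:
        pred = predictor(history)    # predict from reconstructed samples only
        code = math.floor((x - pred) / step)
        code = max(-levels, min(levels - 1, code))
        y = _reconstruct(code, pred, step)
        codes.append(code)
        recon.append(y)
        history = history[1:] + [y]
        step = _adapt(code, step, levels)
    return codes, recon

def decode(codes, predictor, order=2, bits=3, step=0.1):
    """Rebuild the signal from the codes alone, mirroring the encoder."""
    levels = 2 ** (bits - 1)
    history = [0.0] * order
    recon = []
    for code in codes:
        pred = predictor(history)
        y = _reconstruct(code, pred, step)
        recon.append(y)
        history = history[1:] + [y]
        step = _adapt(code, step, levels)
    return recon

# Toy input: a sinusoid standing in for a speech frame.
signal = [math.sin(0.2 * n) for n in range(200)]

# Linear predictor: the exact two-tap recursion for this sinusoid.
linear = lambda h: 2 * math.cos(0.2) * h[-1] - h[-2]

# Nonlinear predictor: a single tanh unit with hypothetical, untrained weights.
nonlinear = lambda h: math.tanh(1.5 * h[-1] - 0.6 * h[-2])

codes_lin, recon_lin = encode(signal, linear)
codes_nl, recon_nl = encode(signal, nonlinear)
```

Because both the prediction and the step-size update depend only on previously transmitted codes, `decode(codes_nl, nonlinear)` reproduces the encoder's reconstruction without any side information. In a trained system the `tanh` unit would be replaced by a neural net fitted to the speech data.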

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Faundez-Zanuy, M. (2005). Nonlinear Speech Processing: Overview and Possibilities in Speech Coding. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_2

  • DOI: https://doi.org/10.1007/11520153_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27441-4

  • Online ISBN: 978-3-540-31886-6

  • eBook Packages: Computer Science, Computer Science (R0)
