Nonlinear Predictive Models: Overview and Possibilities in Speaker Recognition

Faundez-Zanuy, Marcos; Chetouani, Mohamed

doi:10.1007/978-3-540-71505-4_10

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4391)

Abstract

This paper gives a brief overview of speaker recognition, with special emphasis on nonlinear predictive models based on neural networks. It describes the main challenges and possibilities for nonlinear feature extraction and reports experimental results for several strategies. The paper is intended as a starting point for work on nonlinear models for speaker recognition.

Author information

Authors and Affiliations

Escola Universitària Politècnica de Mataró (Barcelona), Spain; Université Pierre et Marie Curie, Paris VI, France
Marcos Faundez-Zanuy & Mohamed Chetouani

Editor information

Yannis Stylianou, Marcos Faundez-Zanuy, Anna Esposito

About this chapter

Cite this chapter

Faundez-Zanuy, M., Chetouani, M. (2007). Nonlinear Predictive Models: Overview and Possibilities in Speaker Recognition. In: Stylianou, Y., Faundez-Zanuy, M., Esposito, A. (eds) Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science, vol 4391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71505-4_10

DOI: https://doi.org/10.1007/978-3-540-71505-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71503-0
Online ISBN: 978-3-540-71505-4
eBook Packages: Computer Science, Computer Science (R0)
