Abstract
In this paper we give a brief overview of speaker recognition, with special emphasis on nonlinear predictive models based on neural networks. The main challenges and possibilities for nonlinear feature extraction are described, and experimental results for several strategies are provided. The paper is intended as a starting point for work on nonlinear models for speaker recognition.
Copyright information
© 2007 Springer Berlin Heidelberg
About this chapter
Cite this chapter
Faundez-Zanuy, M., Chetouani, M. (2007). Nonlinear Predictive Models: Overview and Possibilities in Speaker Recognition. In: Stylianou, Y., Faundez-Zanuy, M., Esposito, A. (eds) Progress in Nonlinear Speech Processing. Lecture Notes in Computer Science, vol 4391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71505-4_10
DOI: https://doi.org/10.1007/978-3-540-71505-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71503-0
Online ISBN: 978-3-540-71505-4
eBook Packages: Computer Science (R0)