Spoken-digit recognition using self-organizing maps with perceptual pre-processing

  • Neural Networks for Perception
  • Conference paper
Biological and Artificial Computation: From Neuroscience to Technology (IWANN 1997)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1240))

Abstract

One of the current challenges in pattern recognition is the treatment of speech. The task is complex mainly because of the great statistical variability among speakers and the temporal structure of the speech signal itself. The current literature offers many discriminative methods and coding algorithms that claim slight improvements in recognition rates; these may be considered important advances, since the field is reaching a threshold that is difficult to cross. In this experiment, the decimal digits spoken in English (from one to nine) are represented in two-dimensional self-organizing maps. This representation has been checked with data from the TIMIT database, starting from a prior coding based on Perceptual Linear Prediction (PLP) coefficients. Subsequently, a heuristic algorithm for recognition has been defined. Applying this algorithm to both a training data set and a test data set produces acceptable recognition rates, even for low-dimension maps, which have the benefit of reduced computational cost. The basic methodology and the results are presented and discussed.
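The abstract does not spell out the map training or the recognition heuristic, so the following is only a minimal sketch of the general idea: train a small two-dimensional self-organizing map on feature vectors, label each node by majority vote over the training samples it wins, and classify a new vector by its nearest labeled node. Everything here is an assumption made for illustration: the function names, the linear decay schedules, the majority-vote labeling (which is not necessarily the authors' heuristic), and the toy 3-dimensional vectors that stand in for PLP coefficients. No TIMIT data is used.

```python
import math
import random
from collections import Counter


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(weights,
               key=lambda n: sum((a - b) ** 2 for a, b in zip(weights[n], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols map; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # Initialise every node from a randomly chosen training sample.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # assumed linear decay
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # assumed linear decay
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood measured on the 2-D grid.
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def label_nodes(weights, data, labels):
    """Give each winning node the majority label of the samples it wins."""
    votes = {}
    for x, y in zip(data, labels):
        votes.setdefault(best_matching_unit(weights, x), Counter())[y] += 1
    return {node: c.most_common(1)[0][0] for node, c in votes.items()}


def classify(weights, node_labels, x):
    """Label x with the label of its nearest labeled node."""
    labeled = {n: weights[n] for n in node_labels}
    return node_labels[best_matching_unit(labeled, x)]


def make_toy_data(n_per_class=30, seed=0):
    """Two well-separated clusters standing in for two digit classes."""
    rng = random.Random(seed)
    data, labels = [], []
    for centre, name in ((0.0, "one"), (5.0, "two")):
        for _ in range(n_per_class):
            data.append([centre + rng.uniform(-0.3, 0.3) for _ in range(3)])
            labels.append(name)
    return data, labels


if __name__ == "__main__":
    data, labels = make_toy_data()
    som = train_som(data, rows=3, cols=3)
    node_labels = label_nodes(som, data, labels)
    hits = sum(classify(som, node_labels, x) == y for x, y in zip(data, labels))
    print(f"training accuracy: {hits / len(data):.2f}")
```

A 3 x 3 map is used only to echo the abstract's point that small maps keep the computation cheap; each update touches every node, so cost grows with the number of nodes.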

Editor information

José Mira, Roberto Moreno-Díaz, Joan Cabestany

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Díaz, F., Ferrández, J.M., Gómez, P., Rodellar, V., Nieto, V. (1997). Spoken-digit recognition using self-organizing maps with perceptual pre-processing. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032580

  • DOI: https://doi.org/10.1007/BFb0032580

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63047-0

  • Online ISBN: 978-3-540-69074-0

  • eBook Packages: Springer Book Archive
