Abstract
This paper examines how the filter shape and the Hertz-Bark transformation affect the word error rate (WER) of a telephone-based speech recognition application that uses the Perceptual Linear Predictive (PLP) parameterization. Five filter shapes (rectangular, narrow and wide trapezium, triangular, and the classical PLP filter shape [1]) were compared, and the effect of a nonlinear frequency transformation between the Hertz axis and a generalized Bark axis was explored. Experiments were performed with 100 speakers and a vocabulary of 475 words. Only a zero-gram language model was used in all experiments, so that the influence of each variable on the WER could be seen more clearly.
Support for this work was provided by the Ministry of Education of the Czech Republic, project No. MSM234200004, and by the Grant Agency of the Czech Republic, project No. 102/96/K087.
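To make the quantities named in the abstract concrete, the following minimal Python sketch shows the standard Hertz-to-Bark warping and the classical critical-band filter curve as defined in [1], together with simple rectangular, trapezium, and triangular shapes on the Bark axis. It is an illustration under stated assumptions, not the authors' implementation: the function names, the default widths of the three simple shapes, and the use of the non-generalized warping from [1] are all hypothetical choices, since the abstract specifies neither the generalized Bark axis nor the exact filter dimensions.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """Hertz-to-Bark warping from [1]: 6 * asinh(f / 600).

    Assumption: this is the standard (non-generalized) form; the
    generalized Bark axis studied in the paper is not given in the abstract.
    """
    return 6.0 * math.asinh(f_hz / 600.0)


def bark_to_hz(bark: float) -> float:
    """Inverse of hz_to_bark."""
    return 600.0 * math.sinh(bark / 6.0)


def plp_filter(d: float) -> float:
    """Classical PLP critical-band curve from [1].

    d is the distance in Bark from the filter centre.
    """
    if d < -1.3 or d > 2.5:
        return 0.0
    if d <= -0.5:
        return 10.0 ** (2.5 * (d + 0.5))   # rising low-frequency skirt
    if d < 0.5:
        return 1.0                          # flat top
    return 10.0 ** (-1.0 * (d - 0.5))       # falling high-frequency skirt


def rectangular_filter(d: float, half_width: float = 0.5) -> float:
    """Rectangular shape; half_width (Bark) is a hypothetical default."""
    return 1.0 if abs(d) <= half_width else 0.0


def triangular_filter(d: float, half_width: float = 1.0) -> float:
    """Triangular shape; half_width (Bark) is a hypothetical default."""
    return max(0.0, 1.0 - abs(d) / half_width)


def trapezium_filter(d: float, flat_half_width: float = 0.5,
                     slope_width: float = 0.5) -> float:
    """Symmetric trapezium; both widths (Bark) are hypothetical defaults.

    A narrow or wide trapezium can be obtained by changing flat_half_width.
    """
    a = abs(d)
    if a <= flat_half_width:
        return 1.0
    if a >= flat_half_width + slope_width:
        return 0.0
    return 1.0 - (a - flat_half_width) / slope_width


if __name__ == "__main__":
    # Sample each shape at a few offsets around a filter centre.
    for d in (-1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
        print(d, plp_filter(d), rectangular_filter(d),
              triangular_filter(d), trapezium_filter(d))
```

Each shape is written as a function of the Bark-scale distance from the filter centre, so the same weighting can be evaluated at warped frequencies `hz_to_bark(f)` or, to mimic omitting the Hertz-Bark transformation, at distances measured on a linear frequency axis instead.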
References
[1] Hermansky, H.: Perceptual linear predictive (PLP) analysis of speech. J. Acoust. Soc. Amer. 87 (1990), pp. 1738–1752.
[2] Hermansky, H.: Should recognisers have ears? Speech Communication 25 (1998), pp. 3–27.
[3] Müller, L., Psutka, J., Šmídl, L.: Design of Speech Recognition Engine. In: Text, Speech and Dialogue. The 3rd International Workshop on TSD 2000. Springer-Verlag, Berlin, Heidelberg (2000), pp. 259–264.
[4] Psutka, J., Müller, L., Psutka, J.V.: Comparison of MFCC and PLP Parameterisations in the Speaker Independent Continuous Speech Recognition Task (prepared for EUROSPEECH 2001).
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Psutka, J., Müller, L., Psutka, J.V. (2001). The Influence of a Filter Shape in Telephone-Based Recognition Module Using PLP Parameterization. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_29
DOI: https://doi.org/10.1007/3-540-44805-5_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive