Abstract
In this article, we describe an automatic speech recognizer developed for Portuguese telephone speech. We used the Portuguese SpeechDat database, which we describe in detail, covering its recording conditions, speaker characteristics, and content categories. The recognizer is a state-of-the-art HMM/MLP hybrid system that employs several kinds of robust acoustic features. Training and testing were carried out on the clean digits and numbers portion of the database. The recognition results are competitive with those of similar systems developed for other languages.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hagen, A., Neto, J.P. (2003). HMM/MLP Hybrid Speech Recognizer for the Portuguese Telephone SpeechDat Corpus. In: Mamede, N.J., Trancoso, I., Baptista, J., das Graças Volpe Nunes, M. (eds) Computational Processing of the Portuguese Language. PROPOR 2003. Lecture Notes in Computer Science, vol 2721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45011-4_19
DOI: https://doi.org/10.1007/3-540-45011-4_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40436-1
Online ISBN: 978-3-540-45011-5
eBook Packages: Springer Book Archive