HMM/MLP Hybrid Speech Recognizer for the Portuguese Telephone SpeechDat Corpus

Hagen, Astrid; Neto, João P.

doi:10.1007/3-540-45011-4_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2721)

Included in the following conference series:

International Workshop on Computational Processing of the Portuguese Language

Abstract

In this article, we describe an automatic speech recognizer developed for Portuguese telephone speech. It is built on the Portuguese SpeechDat database, which we describe in detail, including its recording conditions, speaker characteristics, and content categories. The recognizer is a state-of-the-art HMM/MLP hybrid system employing different kinds of robust acoustic features. Training and testing were carried out on the clean digits and numbers part of the database. The recognition results are competitive with those of similar systems developed for other languages.

References

Center for Spoken Language Understanding, Department of Computer Science and Engineering, Oregon Graduate Institute. Numbers Corpus, Release 1.0, 1995.
H. Bourlard and N. Morgan. Connectionist Speech Recognition. A Hybrid Approach. Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061 USA, 1994.
S. Greenberg and B.E.D. Kingsbury. The modulation spectrogram: In pursuit of an invariant representation of speech. Proc. Int. Conf. on Acoustics, Speech and Signal Processing, pages 1647–1650, 1997.
Astrid Hagen. Robust speech recognition based on multi-stream processing. PhD thesis, Département d’informatique, École Polytechnique Fédérale de Lausanne, Switzerland, 2001.
H. Hermansky. Perceptual linear predictive (PLP) analysis of speech. Journal of the Acoustical Society of America, 87(4):1738–1752, April 1990.
H. Hermansky, N. Morgan, A. Bayya, and P. Kohn. RASTA-PLP speech analysis technique. IEEE Trans. on Signal Processing, 1:121–124, 1992.
N. Morgan and H. Bourlard. Continuous speech recognition. IEEE Trans. on Signal Processing, pages 25–41, 1995.
SPEECHDAT. European speech databases for telephone applications (EU-project LRE-633140). In Proc. Int. Conf. on Acoustics, Speech and Signal Processing, 1997.
S.L. Wu, B. Kingsbury, N. Morgan, and S. Greenberg. Incorporating information from syllable-length time scales into automatic speech recognition. Proc. Int. Conf. on Acoustics, Speech and Signal Processing, 1:721–724, 1998.

Author information

Authors and Affiliations

L2F Spoken Language Systems Lab, INESC-ID, Rua Alves Redol 9, Lisbon, Portugal
Astrid Hagen & João P. Neto
Instituto Superior Técnico, Portugal
João P. Neto

Editor information

Editors and Affiliations

L2F, INESC-ID Lisboa, Technical University of Lisbon, Rua Alves Redol, 9, 1000-029, Lisbon, Portugal
Nuno J. Mamede & Isabel Trancoso
Faculty of Humanities and Social Sciences, University of Algarve, Campus de Gambelas, 8005-139, Faro, Portugal
Jorge Baptista
NILC, ICMC-USP São-Carlos, Av. do Trabalhador São-Carlense, 400, 13560-970, São Carlos, SP, Brazil
Maria das Graças Volpe Nunes

About this paper

Cite this paper

Hagen, A., Neto, J.P. (2003). HMM/MLP Hybrid Speech Recognizer for the Portuguese Telephone SpeechDat Corpus. In: Mamede, N.J., Trancoso, I., Baptista, J., das Graças Volpe Nunes, M. (eds) Computational Processing of the Portuguese Language. PROPOR 2003. Lecture Notes in Computer Science, vol 2721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45011-4_19

DOI: https://doi.org/10.1007/3-540-45011-4_19
Published: 18 June 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40436-1
Online ISBN: 978-3-540-45011-5
eBook Packages: Springer Book Archive
