A Baseline System for Continuous Speech Recognition of Brazilian Portuguese Using the West Point Brazilian Portuguese Speech Corpus

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6001)

Abstract

Although several speech corpora are available for building automatic speech recognition systems, only a few exist for Brazilian Portuguese (BP). This scarcity hinders extensive, in-depth research on continuous speech recognition for BP. In this work, we present a baseline system for continuous speech recognition of BP and report its results on the West Point Brazilian Portuguese Speech Corpus. In addition to the results, the resources developed to build the system are made available to support further research on such systems for BP.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

dos Santos, F.W., Barone, D.A.C., Adami, A.G. (2010). A Baseline System for Continuous Speech Recognition of Brazilian Portuguese Using the West Point Brazilian Portuguese Speech Corpus. In: Pardo, T.A.S., Branco, A., Klautau, A., Vieira, R., de Lima, V.L.S. (eds) Computational Processing of the Portuguese Language. PROPOR 2010. Lecture Notes in Computer Science, vol 6001. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12320-7_18

  • DOI: https://doi.org/10.1007/978-3-642-12320-7_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12319-1

  • Online ISBN: 978-3-642-12320-7

  • eBook Packages: Computer Science, Computer Science (R0)
