LIUM and CRIM ASR System Combination for the REPERE Evaluation Campaign

  • Conference paper
Text, Speech and Dialogue (TSD 2014)

Abstract

This paper describes the ASR system submitted by the SODA consortium to the ASR task of the French REPERE evaluation campaign. The official REPERE test corpus is composed of TV shows. The final system combines two ASR systems, each built by a different member of the consortium. Each system has its own distinguishing feature: one uses i-vector-based speaker adaptation of deep neural networks for acoustic modeling, while the other rescores word lattices with continuous space language models. The combined system won the ASR task of the REPERE evaluation campaign, reaching a word error rate of 13.5% on the REPERE test corpus.
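
For readers unfamiliar with the metric, the word error rate quoted above is conventionally defined as (substitutions + deletions + insertions) divided by the number of reference words. The following is a minimal sketch of that standard definition, not the campaign's official scoring tool, which this abstract does not describe. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, with no
    text normalization of the kind an evaluation campaign would apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a word error rate of 13.5% corresponds to roughly 13.5 word-level errors per 100 reference words.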

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Rousseau, A., Boulianne, G., Deléglise, P., Estève, Y., Gupta, V., Meignier, S. (2014). LIUM and CRIM ASR System Combination for the REPERE Evaluation Campaign. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science(), vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_53

  • DOI: https://doi.org/10.1007/978-3-319-10816-2_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10815-5

  • Online ISBN: 978-3-319-10816-2

  • eBook Packages: Computer Science, Computer Science (R0)
