Abstract
This paper describes the ASR system submitted by the SODA consortium to the ASR task of the French REPERE evaluation campaign. The official REPERE test corpus is composed of TV shows. The submitted system combines two ASR systems, each built by a different member of the consortium. Each has its own specificity: one uses i-vector-based speaker adaptation of deep neural networks for acoustic modeling, while the other rescores word lattices with continuous space language models. The combined system won the ASR task of the REPERE evaluation campaign, reaching a word error rate of 13.5% on the REPERE test corpus.
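For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a system hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the official REPERE scoring procedure (text normalization, alignment tool) is not described here and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenization is plain whitespace splitting, with no
    text normalization of the kind an evaluation campaign would apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("le chat dort ici", "le chien dort"))
```

Under this definition, a word error rate of 13.5% corresponds to about 13.5 word errors per 100 reference words.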
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Rousseau, A., Boulianne, G., Deléglise, P., Estève, Y., Gupta, V., Meignier, S. (2014). LIUM and CRIM ASR System Combination for the REPERE Evaluation Campaign. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_53
DOI: https://doi.org/10.1007/978-3-319-10816-2_53
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science, Computer Science (R0)