Automatic Correction of ASR Outputs by Using Machine Translation

D’Haro, Luis Fernando; Banchs, Rafael E.

doi:10.21437/Interspeech.2016-299

One of the main challenges when working with domain-independent automatic speech recognizers (ASR) is correctly transcribing rare or out-of-vocabulary words that are either missing from the language model or whose probabilities are underestimated. Although the common solution is to adapt the language models and pronunciation vocabularies, this is not possible in some conditions, such as when using free online recognizers, so post-recognition corrections must be applied instead. In this paper, we propose an automatic correction procedure that applies a phrase-based machine translation system, trained on word and phonetic-encoding representations, to the n-best lists generated by the ASR. Our experiments on two different datasets, human-computer interfaces for robots and human-to-human dialogs about tourism information, show that the proposed methodology can provide a quick and robust mechanism for improving ASR performance by reducing the word error rate (WER) and character error rate (CER).
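The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the conventional definitions (Levenshtein edit distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER); the function names and the whitespace tokenization are illustrative choices, not the authors' evaluation setup.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate, assuming whitespace tokenization (an assumption)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, counting spaces as characters (an assumption)."""
    if not reference:
        raise ValueError("reference must contain at least one character")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Under these definitions, a correction step helps when the rewritten hypothesis has a lower `wer` or `cer` against the reference transcript than the original ASR output does.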

Cite as: D’Haro, L.F., Banchs, R.E. (2016) Automatic Correction of ASR Outputs by Using Machine Translation. Proc. Interspeech 2016, 3469-3473, doi: 10.21437/Interspeech.2016-299

@inproceedings{dharo16_interspeech,
  author={Luis Fernando D’Haro and Rafael E. Banchs},
  title={{Automatic Correction of ASR Outputs by Using Machine Translation}},
  year=2016,
  booktitle={Proc. Interspeech 2016},
  pages={3469--3473},
  doi={10.21437/Interspeech.2016-299}
}