ISCA Archive Interspeech 2022

On Adaptive Weight Interpolation of the Hybrid Autoregressive Transducer

Ehsan Variani, Michael Riley, David Rybach, Cyril Allauzen, Tongzhou Chen, Bhuvana Ramabhadran

This paper explores rescoring strategies to improve a two-pass speech recognition system in which the first pass is a hybrid autoregressive transducer model and the second pass is a neural language model. The main focus is on the scores provided by each of these models: their quantitative analysis, how to improve them, and the best way to integrate them to achieve better recognition accuracy. Several analyses are presented to emphasize the importance of the choice of integration weights for combining the first-pass and second-pass scores. A sequence-level combination weight estimation model, along with four training criteria, is proposed, which allows adaptive integration of the scores per acoustic sequence. The effectiveness of this algorithm is demonstrated by constructing and analyzing models on the Librispeech data set.
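
To make the role of the integration weight concrete, the following is a minimal sketch of n-best rescoring with a log-linear score combination. It is not the method proposed in the paper: the function name `rescore`, the hypothesis tuples, and the numeric scores are all hypothetical and chosen only for illustration. The abstract does not specify how the combination is computed, so a simple weighted sum of log-probabilities is assumed here.

```python
from typing import List, Tuple

# Each hypothesis: (text, first-pass log-score, second-pass LM log-score).
# All values below are made up for illustration.
Hypothesis = Tuple[str, float, float]


def rescore(hyps: List[Hypothesis], lm_weight: float) -> str:
    """Return the hypothesis text with the highest combined score.

    Assumed combination (not taken from the paper):
        combined = first_pass_score + lm_weight * lm_score
    """
    best = max(hyps, key=lambda h: h[1] + lm_weight * h[2])
    return best[0]


nbest = [
    ("the cat sat", -4.0, -9.0),
    ("the cat sad", -3.5, -12.0),
]

# With no LM contribution the first-pass ranking is kept.
print(rescore(nbest, lm_weight=0.0))
# With a larger weight the LM score can change the top hypothesis.
print(rescore(nbest, lm_weight=0.5))
```

In this toy setup the selected hypothesis depends on `lm_weight`, which is the sensitivity the abstract refers to. A sequence-level adaptive scheme would supply a different `lm_weight` for each utterance instead of one global constant.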


doi: 10.21437/Interspeech.2022-4

Cite as: Variani, E., Riley, M., Rybach, D., Allauzen, C., Chen, T., Ramabhadran, B. (2022) On Adaptive Weight Interpolation of the Hybrid Autoregressive Transducer. Proc. Interspeech 2022, 1646-1650, doi: 10.21437/Interspeech.2022-4

@inproceedings{variani22_interspeech,
  author={Ehsan Variani and Michael Riley and David Rybach and Cyril Allauzen and Tongzhou Chen and Bhuvana Ramabhadran},
  title={{On Adaptive Weight Interpolation of the Hybrid Autoregressive Transducer}},
  year=2022,
  booktitle={Proc. Interspeech 2022},
  pages={1646--1650},
  doi={10.21437/Interspeech.2022-4}
}