Abstract:
In this work, multiple hierarchical language modeling strategies for a zero out-of-vocabulary (OOV) rate large vocabulary continuous speech recognition (LVCSR) system are investigated. In our previously proposed hierarchical approach, a full-word language model (LM) and a context-independent character-level LM (CLM) are directly used during search. The novelty of this work is to jointly model the character-level prior and the pronunciation probabilities, to introduce across-word context into the character-level LM, and to properly normalize the character-level LM using prefix-tree based normalization for the hierarchical approach. Significant reductions in terms of word error rates (WER) on the best full-word Quaero Polish LVCSR system are reported.
Published in: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 19-24 April 2015
Date Added to IEEE Xplore: 06 August 2015
Electronic ISBN: 978-1-4673-6997-8