Abstract:
Unsupervised acoustic model adaptation for large vocabulary speech recognition is typically accomplished by using an estimated transcription of the adaptation data. The effectiveness of the technique is limited by errors in the estimated transcription. Previous work has mitigated this negative effect by using only those sections of the adaptation data which are transcribed with relatively high confidence. In this work, phoneme correctness predictions are integrated into a discriminative unsupervised acoustic model adaptation procedure. Small but significant performance improvements (over the equivalent maximum likelihood adaptation technique) are observed when using unsupervised discriminative adaptation in combination with support vector machines to predict phoneme correctness.
Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume: 20, Issue: 10, December 2012)