Abstract:
We describe the Arabic broadcast transcription system fielded by IBM in the GALE Phase 4 machine translation evaluation. Key advances over our Phase 3.5 system include im...Show MoreMetadata
Abstract:
We describe the Arabic broadcast transcription system fielded by IBM in the GALE Phase 4 machine translation evaluation. Key advances over our Phase 3.5 system include improvements to context-dependent modeling in vowelized Arabic acoustic models; the use of neural-network features provided by the International Computer Science Institute; Model M language models; a neural network language model that uses syntactic and morphological features; and improvements to our system combination strategy. These advances were instrumental in achieving a word error rate of 8.9% on the Phase 4 evaluation set, and an absolute improvement of 1.6% word error rate over our 2008 system on the unsequestered Phase 3.5 evaluation data.
Published in: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 22-27 May 2011
Date Added to IEEE Xplore: 11 July 2011
ISBN Information: