Abstract:
The heuristic estimates of conditional phrase translation probabilities are based on frequency counts in a word-aligned parallel corpus. Earlier attempts at more principled estimation using Expectation-Maximization (EM) underperform this heuristic. This paper shows that a recently introduced novel estimator based on smoothing might provide a good alternative. When all phrase pairs are estimated (no length cut-off), this estimator slightly outperforms the heuristic estimator.
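As a rough illustration of the frequency-count heuristic mentioned in the abstract, the sketch below computes relative-frequency estimates from a list of already-extracted phrase pairs. The function name, the toy data, and the conditioning direction (target given source) are assumptions made for illustration; the phrase extraction step and the smoothing-based estimator studied in the paper are not shown.

```python
from collections import Counter


def relative_frequency_estimates(phrase_pairs):
    """Estimate p(target | source) as count(source, target) / count(source).

    `phrase_pairs` is a list of (source_phrase, target_phrase) tuples,
    assumed to have been extracted from a word-aligned parallel corpus.
    """
    pair_counts = Counter(phrase_pairs)
    source_counts = Counter(src for src, _ in phrase_pairs)
    return {
        (src, tgt): count / source_counts[src]
        for (src, tgt), count in pair_counts.items()
    }


# Hypothetical toy data, not taken from the paper.
extracted = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("das haus", "the home"),
    ("haus", "house"),
]

probs = relative_frequency_estimates(extracted)
for (src, tgt), p in sorted(probs.items()):
    print(f"p({tgt!r} | {src!r}) = {p:.3f}")
```

By construction, the estimates for each source phrase sum to one, since every count is divided by the total number of extracted pairs that share that source phrase.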
Published in: 2008 IEEE Spoken Language Technology Workshop
Date of Conference: 15-19 December 2008
Date Added to IEEE Xplore: 06 February 2009