Abstract:
In machine translation evaluation, conventional practice measures a model's generalization ability in an average sense, for example by using corpus BLEU. However, corpus BLEU statistics cannot provide a comprehensive understanding or a fine-grained analysis of a model's generalization ability. As a remedy, this paper attempts to understand NMT at a fine-grained level by detecting contextual barriers within an unseen input sentence that degrade the model's translation quality. It proposes a principled definition of source contextual barriers, together with a modified version that is computationally tractable and operates at the word level. Based on the modified definition, three simple methods are proposed for barrier detection via search-aware risk estimation through counterfactual generation. Extensive analyses are conducted on the detected contextual barrier words on both Zh⇔En NIST benchmarks. Potential uses motivated by barrier words are also discussed.
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing (Volume: 29)