Abstract
The purpose of this paper is to provide guidelines for building a word alignment evaluation scheme. The notion of word alignment quality depends on the application; here we review standard scoring metrics for full-text alignment and explain how to use them more effectively. We discuss strategies for building a reference corpus and show that the ratio of ambiguous to unambiguous links in the reference has a strong impact on the scores these metrics produce. In particular, depending on the value of this ratio, the metrics can favour automatically computed alignments with either higher precision or higher recall. Finally, we suggest a strategy for building a reference corpus that is particularly suited to applications in which recall plays a significant role, such as machine translation. We also describe the manually aligned reference corpus we built for the Spanish-English European Parliament corpus, which is freely available.
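To make the role of ambiguous and unambiguous reference links concrete, the following is a minimal sketch of how precision, recall, and alignment error rate are often computed against such a reference. It assumes the Sure/Possible formulation that is common in the word alignment literature, where unambiguous links are "sure" and ambiguous links are "possible"; the abstract itself does not spell out these formulas, and the function name `alignment_scores` and the toy link sets are purely illustrative.

```python
def alignment_scores(hypothesis, sure, possible):
    """Score a set of (source_index, target_index) links against a reference.

    Assumed definitions (not taken from the abstract):
      precision = |A & P| / |A|
      recall    = |A & S| / |S|
      AER       = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    where A is the hypothesis, S the sure links, and P the possible links,
    with every sure link also counted as possible.
    """
    possible = possible | sure  # enforce S as a subset of P
    hits_sure = len(hypothesis & sure)
    hits_possible = len(hypothesis & possible)

    precision = hits_possible / len(hypothesis) if hypothesis else 0.0
    recall = hits_sure / len(sure) if sure else 0.0
    denominator = len(hypothesis) + len(sure)
    aer = 1.0 - (hits_sure + hits_possible) / denominator if denominator else 0.0
    return precision, recall, aer


# Toy reference: two unambiguous (sure) links, two ambiguous (possible) links.
sure_links = {(0, 0), (1, 1)}
possible_links = {(2, 1), (2, 2)}

# Toy automatic alignment: one sure hit, one possible hit, one wrong link.
hypothesis_links = {(0, 0), (2, 2), (3, 3)}

precision, recall, aer = alignment_scores(hypothesis_links, sure_links, possible_links)
print(f"precision={precision:.3f} recall={recall:.3f} AER={aer:.3f}")
```

Under these assumed definitions, possible links can only raise precision and never affect recall, so a reference with many ambiguous links tends to reward cautious, high-precision alignments, while a reference made mostly of sure links puts more weight on recall.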
Cite this article
Lambert, P., De Gispert, A., Banchs, R. et al. Guidelines for Word Alignment Evaluation and Manual Alignment. Lang Resources & Evaluation 39, 267–285 (2005). https://doi.org/10.1007/s10579-005-4822-5
DOI: https://doi.org/10.1007/s10579-005-4822-5