Improvement of Machine Translation Evaluation by Simple Linguistically Motivated Features

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Adopting the regression SVM framework, this paper proposes a linguistically motivated feature engineering strategy to develop an MT evaluation metric that correlates better with human assessments. In contrast to the current practice of “greedy” combination of all available features, six features are proposed according to human intuitions about translation quality. The contribution of each linguistic feature is then examined and analyzed via a hill-climbing strategy. Experiments indicate that, compared with either the SVM-ranking model or previous attempts using exhaustive linguistic features, the regression SVM model with six linguistically informed features generalizes better across different datasets, and that augmenting these linguistic features with suitable non-linguistic metrics yields additional improvements.
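
The hill-climbing analysis mentioned in the abstract can be illustrated with a minimal sketch of greedy forward feature selection. Everything here is an assumption for illustration: the abstract does not name the six features, so the feature names and toy scores are hypothetical, and the scorer is a simple stand-in (Pearson correlation between the averaged selected features and the human scores) rather than the regression SVM used in the paper. A trained regressor could be passed in through the `score` argument instead.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def score_subset(features, subset, human):
    """Stand-in scorer (NOT the paper's SVM): average the selected
    feature values per sentence and correlate with human scores."""
    combined = [mean(features[name][i] for name in subset)
                for i in range(len(human))]
    return pearson(combined, human)


def hill_climb(features, human, score=score_subset):
    """Greedy forward selection: repeatedly add the single feature that
    most improves the score; stop when no addition improves it."""
    selected, best = [], float("-inf")
    remaining = set(features)
    while remaining:
        cand, cand_score = max(
            ((f, score(features, selected + [f], human))
             for f in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if cand_score <= best:
            break  # local optimum reached
        selected.append(cand)
        best = cand_score
        remaining.remove(cand)
    return selected, best


# Hypothetical per-sentence feature values and human scores (toy data).
human = [1, 2, 3, 4, 5]
features = {
    "lexical_overlap": [1, 2, 3, 4, 6],
    "syntax_match": [1, 2, 3, 4, 4],
    "length_ratio": [3, 1, 4, 1, 5],
}

selected, best = hill_climb(features, human)
print(selected, round(best, 4))
```

The order in which features enter `selected` gives a rough picture of each feature's marginal contribution, which is the kind of analysis the abstract describes; the stopping rule means the result is a local optimum, not necessarily the best subset overall.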

Author information

Corresponding author

Correspondence to Mu-Yun Yang.

Additional information

Supported by the National Natural Science Foundation of China under Grant Nos. 60773066 and 60736014, the National High Technology Development 863 Program of China under Grant No. 2006AA010108, and the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under Grant No. HIT.NSFIR.20009070.

This fact partially reflects the difficulty of getting the rich linguistics even for the researchers.

About this article

Cite this article

Yang, MY., Sun, SQ., Zhu, JG. et al. Improvement of Machine Translation Evaluation by Simple Linguistically Motivated Features. J. Comput. Sci. Technol. 26, 57–67 (2011). https://doi.org/10.1007/s11390-011-9415-8

  • DOI: https://doi.org/10.1007/s11390-011-9415-8
