Using target-language information to train part-of-speech taggers for machine translation

Sánchez-Martínez, Felipe; Pérez-Ortiz, Juan Antonio; Forcada, Mikel L.

doi:10.1007/s10590-008-9044-3

Published: 25 November 2008

Volume 22, pages 29–66 (2008)
Machine Translation

Abstract

Although corpus-based approaches to machine translation (MT) are attracting growing interest, they cannot be applied to less-resourced language pairs for which no parallel corpora are available; in those cases, the rule-based approach is the only applicable solution. Most rule-based MT systems use part-of-speech (PoS) taggers to resolve PoS ambiguities in the source-language (SL) texts to be translated, and they require accurate PoS taggers to produce reliable translations in the target language (TL). The standard statistical approach to PoS ambiguity resolution (or tagging) uses hidden Markov models (HMMs) trained either in a supervised way from hand-tagged corpora, an expensive resource that is not always available, or in an unsupervised way through the Baum-Welch expectation-maximization algorithm; both methods use information only from the language being tagged. However, when tagging is an intermediate task in the translation procedure, that is, when the PoS tagger is to be embedded as a module within an MT system, information from the TL can be used in the training phase, without supervision, to increase the translation quality of the whole MT system. This paper presents a method to train HMM-based PoS taggers for use in MT; the new method uses not only information from the SL, as general-purpose methods do, but also information from the TL and from the remaining modules of the MT system in which the PoS tagger is to be embedded. We find that the translation quality of the MT system embedding a PoS tagger trained in an unsupervised manner through this new method is clearly better than that of the same MT system embedding a PoS tagger trained through the Baum-Welch algorithm, and comparable to that obtained by embedding a PoS tagger trained in a supervised way from hand-tagged corpora.
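
As a rough illustration of what HMM-based PoS ambiguity resolution involves, the following minimal sketch runs Viterbi decoding over a toy model. It is not the authors' tagger and does not show the TL-driven training method: the tag set, the probability tables (START, TRANS, EMIT) and the function name viterbi are all invented for this example. A practical tagger would also work with log-probabilities and smoothed estimates.

```python
# Toy HMM: every number below is an assumption made up for illustration.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.2, "VERB": 0.2}           # P(first tag)
TRANS = {                                                  # P(next tag | tag)
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
EMIT = {                                                   # P(word | tag)
    "DET": {"the": 1.0},
    "NOUN": {"book": 0.5, "flight": 0.5},                  # "book" is ambiguous
    "VERB": {"book": 1.0},
}


def viterbi(words, tags, start, trans, emit):
    """Return the most probable tag sequence for words under the HMM."""
    if not words:
        return []
    # best[i][t]: probability of the best tag path ending in tag t at word i
    best = [{t: start[t] * emit[t].get(words[0], 0.0) for t in tags}]
    back = []  # back[i][t]: best predecessor of tag t at word i + 1
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] * trans[p][t])
            scores[t] = best[-1][prev] * trans[prev][t] * emit[t].get(word, 0.0)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))


print(viterbi(["book", "the", "flight"], TAGS, START, TRANS, EMIT))
# prints ['VERB', 'DET', 'NOUN']
```

In this toy model the word "book" can be a noun or a verb; the transition probabilities decide between the two readings, which is the kind of ambiguity a tagger embedded in an MT system has to resolve before translation.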

Author information

Authors and Affiliations

Dept. de Llenguatges i Sistemes Informàtics, Universitat d’Alacant, 03071, Alacant, Spain
Felipe Sánchez-Martínez, Juan Antonio Pérez-Ortiz & Mikel L. Forcada

Corresponding author

Correspondence to Felipe Sánchez-Martínez.

About this article

Cite this article

Sánchez-Martínez, F., Pérez-Ortiz, J.A. & Forcada, M.L. Using target-language information to train part-of-speech taggers for machine translation. Machine Translation 22, 29–66 (2008). https://doi.org/10.1007/s10590-008-9044-3

Received: 28 January 2008
Accepted: 27 October 2008
Published: 25 November 2008
Issue Date: March 2008
DOI: https://doi.org/10.1007/s10590-008-9044-3
