Abstract
We describe an implementation of a non-structural example-based translation system that translates sentences from Arabic to English using a bilingual parallel corpus. Each new input sentence is fragmented into phrases, and those phrases are matched to example patterns using several levels of morphological data. We study the effect of restricting the system to match only fragments that do not split a base phrase, and the results for small corpora are encouraging.
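To make the base-phrase restriction concrete, the following is a minimal sketch, not the chapter's implementation. It assumes the input sentence has already been tokenized and chunked with BIO-style tags (for example `B-NP`, `I-NP`, `O`), and it enumerates only those fragments whose boundaries coincide with base-phrase boundaries. The function name `chunk_safe_fragments`, the `min_len` and `max_len` parameters, and the placeholder tokens are all hypothetical.

```python
from typing import List, Tuple


def chunk_safe_fragments(
    tokens: List[str],
    tags: List[str],
    min_len: int = 1,
    max_len: int = 4,
) -> List[Tuple[int, int]]:
    """Return half-open spans (start, end) that do not split a base phrase.

    Assumption: `tags` uses BIO chunk labels, where a tag starting with
    "I-" marks a token that continues the chunk begun by an earlier token.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    n = len(tokens)
    spans: List[Tuple[int, int]] = []
    for start in range(n):
        # A fragment may not begin in the middle of a base phrase.
        if tags[start].startswith("I-"):
            continue
        for end in range(start + min_len, min(n, start + max_len) + 1):
            # A fragment may not end while the current base phrase continues.
            if end < n and tags[end].startswith("I-"):
                continue
            spans.append((start, end))
    return spans


# Placeholder tokens stand in for a chunked Arabic sentence.
tokens = ["w1", "w2", "w3", "w4"]
tags = ["B-NP", "I-NP", "B-VP", "O"]
spans = chunk_safe_fragments(tokens, tags)
```

In this toy input, `w1 w2` forms a single noun phrase, so the span covering only `w1` and every span starting at `w2` are excluded. The remaining fragments would then be the candidates passed to the matching step against the example patterns.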
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bar, K., Choueka, Y., Dershowitz, N. (2014). Matching Phrases for Arabic-to-English Example-Based Translation System. In: Dershowitz, N., Nissan, E. (eds) Language, Culture, Computation. Computational Linguistics and Linguistics. Lecture Notes in Computer Science, vol 8003. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45327-4_4
DOI: https://doi.org/10.1007/978-3-642-45327-4_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45326-7
Online ISBN: 978-3-642-45327-4
eBook Packages: Computer Science, Computer Science (R0)