Abstract
Translation of multi-word expressions (MWEs) is one of the most challenging tasks for a machine translation (MT) system. In this paper, we present an innovative technique for dealing with MWEs in the context of MT. The technique lets bilingual speakers supply translations of MWEs in the form of patterns, without requiring any linguistic training. The patterns are interpreted by a dynamic machine learning algorithm, which allows the main rule-based MT system to keep operating on linguistic rules. Thus, the bilingual patterns (which carry no explicit linguistic input) are used in conjunction with the main linguistic system. This is made possible by learning pathway templates. Trained linguists need to prepare these templates only once; after that, the templates help to process a potentially large number of patterns.
The implemented system is being used with a large-scale rule-based MT system to improve its performance. This framework can also be extended to help example-based or statistical MT systems to deal with MWEs.
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bharati, A., Sangal, R., Mishra, D., Venkatapathy, S., Reddy, T.P. (2004). Handling Multi-word Expressions Without Explicit Linguistic Rules in an MT System. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_5
DOI: https://doi.org/10.1007/978-3-540-30120-2_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive