Handling Multi-word Expressions Without Explicit Linguistic Rules in an MT System

  • Conference paper
Text, Speech and Dialogue (TSD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Abstract

Translating multi-word expressions (MWEs) is one of the most challenging tasks for a machine translation (MT) system. In this paper, we present a technique for dealing with MWEs in the context of MT. The technique lets bilingual speakers supply translations of MWEs as patterns, without requiring any linguistic training. The patterns are interpreted by a dynamic machine learning algorithm, so the main rule-based MT system can continue to operate on linguistic rules. The bilingual patterns, which carry no explicit linguistic input, are thus used in conjunction with the main linguistic system. This is made possible by learning pathway templates, which trained linguists need to prepare only once. After that, the templates can help process a potentially large number of patterns.

The implemented system is being used with a large-scale rule-based MT system to improve its performance. This framework can also be extended to help example-based or statistical MT systems to deal with MWEs.
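To make the idea of a bilingual pattern concrete, here is a minimal sketch in Python. It is not the authors' system: the pattern notation (a single uppercase letter such as `X` marking a variable slot), the function names `compile_pattern` and `apply_pattern`, and the English-to-Spanish example are all assumptions chosen for illustration. The sketch only does surface matching and slot substitution. It leaves out the learning algorithm and the linguist-prepared templates that the abstract describes, which is where morphology and agreement would presumably be handled.

```python
import re
from typing import Optional


def compile_pattern(source_pattern: str) -> re.Pattern:
    """Turn a pattern such as 'take X into account' into a regex.

    Assumption: any single uppercase letter is a variable slot;
    every other token must match literally.
    """
    parts = []
    for token in source_pattern.split():
        if len(token) == 1 and token.isupper():
            # Lazy match so the slot stops at the next literal token.
            parts.append(rf"(?P<{token}>.+?)")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def apply_pattern(sentence: str, source_pattern: str,
                  target_pattern: str) -> Optional[str]:
    """Rewrite the first MWE match in `sentence` using the target pattern.

    Returns None when the source pattern does not occur.
    """
    match = compile_pattern(source_pattern).search(sentence)
    if match is None:
        return None
    slots = match.groupdict()
    # Slot names in the target are replaced by the captured source text;
    # all other target tokens are copied as given by the bilingual speaker.
    translated = " ".join(slots.get(tok, tok) for tok in target_pattern.split())
    return sentence[:match.start()] + translated + sentence[match.end():]


if __name__ == "__main__":
    print(apply_pattern("we take the weather into account",
                        "take X into account",
                        "tener en cuenta X"))
```

In this toy version the slot content and the rest of the sentence are left untranslated. In a full pipeline they would be passed on to the main MT system, and the verb in the target pattern would still need to be inflected.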

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bharati, A., Sangal, R., Mishra, D., Venkatapathy, S., Reddy, T.P. (2004). Handling Multi-word Expressions Without Explicit Linguistic Rules in an MT System. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_5

  • DOI: https://doi.org/10.1007/978-3-540-30120-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23049-6

  • Online ISBN: 978-3-540-30120-2

  • eBook Packages: Springer Book Archive
