Syntactic Structure Transfer in a Tamil to Hindi MT System – A Hybrid Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6008)

Abstract

We describe syntactic structure transfer, a central design question in machine translation, between Tamil (source) and Hindi (target), which belong to two different language families, Dravidian and Indo-Aryan respectively. Tamil and Hindi differ extensively at the level of clausal construction, which makes transferring the structure difficult. The transfer described here is a hybrid approach: we use Conditional Random Fields (CRFs) to identify clause boundaries in the source language, Transformation-Based Learning (TBL) to extract the transfer rules, and a semantic classification of postpositions (PSP) to choose the semantically appropriate structure in constructions where there is a one-to-many mapping into the target language. We have evaluated the system using web data and the results are encouraging.
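To make the TBL component named in the abstract more concrete, the following is a minimal sketch of greedy transformation-based learning on a plain label sequence. It is not the rule format or the templates used in the paper, which the abstract does not specify. The single rule template (rewrite a label when the preceding label matches), the function names `apply_rule` and `learn_rules`, and the toy labels `A`, `B`, `C` are all assumptions made for illustration.

```python
from collections import Counter


def apply_rule(labels, rule):
    """Apply one rule at every matching position, reading from a snapshot.

    A rule is a hypothetical triple (prev, old, new): rewrite `old` to `new`
    wherever the preceding position currently carries `prev`.
    """
    prev, old, new = rule
    out = list(labels)
    for i in range(1, len(labels)):
        if labels[i - 1] == prev and labels[i] == old:
            out[i] = new
    return out


def learn_rules(initial, gold, max_rules=10):
    """Greedy TBL loop: keep adding the rule with the best net error reduction."""
    current = list(initial)
    rules = []
    for _ in range(max_rules):
        scores = Counter()
        # Propose one candidate rule per remaining error and count its fixes.
        for i in range(1, len(current)):
            if current[i] != gold[i]:
                scores[(current[i - 1], current[i], gold[i])] += 1
        # Penalise each candidate for every correct label it would break.
        for prev, old, new in list(scores):
            for i in range(1, len(current)):
                if current[i - 1] == prev and current[i] == old and gold[i] == old:
                    scores[(prev, old, new)] -= 1
        if not scores:
            break  # no errors left that this template can address
        best, gain = max(scores.items(), key=lambda kv: kv[1])
        if gain <= 0:
            break  # no rule improves the labelling any further
        current = apply_rule(current, best)
        rules.append(best)
    return rules, current


# Toy data (illustrative only): the baseline labelling never emits "C".
gold = list("ABCABCAB")
initial = list("ABBABBAB")
rules, final = learn_rules(initial, gold)
print(rules)
print(final == gold)
```

The learned rules form an ordered list that can be replayed on new input with `apply_rule`. In a transfer setting the labels would presumably stand for structural categories rather than single letters, but that mapping is an assumption and is not described in the abstract.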


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lalitha Devi, S., Ram R, V.S., Pralayankar, P., T, B. (2010). Syntactic Structure Transfer in a Tamil to Hindi MT System – A Hybrid Approach. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_37

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)