Systematic Processing of Long Sentences in Rule Based Portuguese-Chinese Machine Translation

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6008)

Abstract

Translation quality and parsing efficiency are often disappointing when Rule based Machine Translation systems deal with long sentences. Because of the complicated syntactic structure of the language, many ambiguous parse trees can be generated during the translation process, and it is not easy to select the most suitable one for generating the correct translation. This paper presents an approach to parsing and translating long sentences efficiently, applied to Rule based Portuguese-Chinese Machine Translation. To improve parsing performance, long sentences are systematically broken into shorter segments based on patterns, clauses, conjunctions, and punctuation. In addition, Constraint Synchronous Grammar is used to model the source and target languages simultaneously at the parsing stage, further reducing ambiguities and improving parsing efficiency.
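The idea of shortening a sentence before parsing can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a two-stage heuristic (split at punctuation, then at conjunctions when a piece is still long), and the conjunction list, the `max_words` threshold, and the function name `split_long_sentence` are all hypothetical choices made for illustration. The pattern-based and clause-based steps, and the Constraint Synchronous Grammar parsing itself, are not covered.

```python
import re

# Hypothetical, deliberately tiny list of Portuguese conjunctions (assumption).
CONJUNCTIONS = ("mas", "porque", "e", "ou")

# Match the whitespace that immediately precedes a whole-word conjunction.
# The lookahead keeps the conjunction at the start of the next fragment.
_CONJ_BOUNDARY = re.compile(r"\s+(?=(?:%s)\s)" % "|".join(CONJUNCTIONS))


def split_long_sentence(sentence, max_words=8):
    """Split a sentence into shorter fragments.

    Stage 1 splits at commas, semicolons and colons.
    Stage 2 splits at conjunctions, but only inside pieces that
    still contain more than `max_words` words (assumed threshold).
    """
    pieces = [p.strip() for p in re.split(r"[,;:]", sentence) if p.strip()]
    fragments = []
    for piece in pieces:
        if len(piece.split()) <= max_words:
            fragments.append(piece)
        else:
            parts = _CONJ_BOUNDARY.split(piece)
            fragments.extend(s.strip() for s in parts if s.strip())
    return fragments


if __name__ == "__main__":
    text = ("O governo aprovou a nova lei, mas os deputados da oposição "
            "votaram contra a proposta e pediram uma revisão completa do texto.")
    for fragment in split_long_sentence(text):
        print(fragment)
```

Each fragment could then be parsed separately, which is where a shorter input would be expected to yield fewer competing parse trees; how the fragment translations are recombined is left open here.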

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oliveira, F., Wong, F., Hong, IS. (2010). Systematic Processing of Long Sentences in Rule Based Portuguese-Chinese Machine Translation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_35

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)
