Machine Translation Method Based on Non-compositional Semantics (Word-Level Sentence-Pattern-Based MT)

  • Conference paper
Computational Linguistics (PACLING 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 593)

Abstract

To overcome the limitations of conventional machine translation methods, Ikehara et al. proposed a machine translation scheme based on non-compositional semantics. This scheme requires many sentence patterns that preserve the semantics of the expression structure. To apply it to Japanese-English machine translation, a compound and complex sentence pattern dictionary called “ToribankSPD” has been developed. This dictionary has three levels of sentence patterns: “word-level”, “phrase-level”, and “clause-level”. In this paper, following this scheme, we implemented a Japanese-English sentence-pattern-based machine translation method using the word-level sentence patterns of ToribankSPD. In our experiments, the pattern matching rate was low (about 10%). However, 72 of the 100 evaluated sentences used sentence patterns that had an appropriate expression structure, and the translation accuracy of 55 sentences was high.
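The general idea of sentence-pattern-based translation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the pattern notation (`N1`, `N2` as one-word variables), the romanized example sentences, the tiny lexicon, and the function names `match`, `translate`, and `matching_rate` are hypothetical and are not taken from ToribankSPD or from the authors' implementation. The sketch only shows how a whole source structure is mapped to a whole target structure, with variables filled in from a word dictionary, and why a sentence that fits no pattern yields no output.

```python
# Toy illustration only: the pattern notation, the lexicon, and all names
# below are hypothetical and are not taken from ToribankSPD.

# A word-level pattern pairs a source token sequence with a target template.
# Tokens such as "N1" are variables that bind exactly one source word.
PATTERNS = [
    (["N1", "wa", "N2", "ga", "suki", "da"], "{N1} likes {N2}"),
    (["N1", "wa", "N2", "da"], "{N1} is {N2}"),
]

# Hypothetical word dictionary used to translate bound variables.
LEXICON = {"kare": "he", "inu": "dogs", "gakusei": "a student"}


def match(pattern, tokens):
    """Return variable bindings if the tokens fit the pattern, else None."""
    if len(pattern) != len(tokens):
        return None
    bindings = {}
    for p, t in zip(pattern, tokens):
        if p[0] == "N" and p[1:].isdigit():  # variable slot
            bindings[p] = t
        elif p != t:  # literal tokens must match exactly
            return None
    return bindings


def translate(tokens):
    """Translate with the first matching pattern; None if nothing matches."""
    for source, target in PATTERNS:
        bindings = match(source, tokens)
        if bindings is None:
            continue
        if not all(word in LEXICON for word in bindings.values()):
            continue
        return target.format(**{v: LEXICON[w] for v, w in bindings.items()})
    return None


def matching_rate(sentences):
    """Fraction of sentences for which some pattern produced a translation."""
    hits = sum(translate(s) is not None for s in sentences)
    return hits / len(sentences)
```

In this toy setting, `translate("kare wa inu ga suki da".split())` fills the first pattern, while a sentence such as `"ame ga furu"` matches no pattern and returns `None`. Because literal tokens must match exactly, coverage depends heavily on how many patterns are available, which is consistent in spirit with a low matching rate being reported alongside good output for the sentences that do match.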

References

  1. Ikehara, S., Tokuhisa, M., Murakami, J., Saraki, M., Miyazaki, M., Ikeda, N.: Pattern dictionary development based on non-compositional language model for Japanese compound and complex sentences. In: Matsumoto, Y., Sproat, R.W., Wong, K.-F., Zhang, M. (eds.) ICCPOL 2006. LNCS (LNAI), vol. 4285, pp. 509–519. Springer, Heidelberg (2006)

  2. Ikehara, S., Tokuhisa, M., Murakami, J.: Non-compositional language model and pattern dictionary development for Japanese compound and complex sentences. In: Proceedings of the 22nd International Conference on Computational Linguistics, vol. 1, pp. 343–360, Manchester (2008)

  3. Tori-Bank (2007). http://www.unicorn/toribank

  4. Tokuhisa, M., Murakami, J., Ikehara, S.: Pattern search by structural matching from Japanese compound and complex sentence pattern dictionary (in Japanese). IPSJ SIG Technical report, 2006-NL-176, pp. 9–16 (2006)

  5. Yin, D., Zhang, D.: Construct chunk-level templates for improving rule-based machine translation. J. Comput. Inf. Syst. 9(14), 5505–5512 (2013)

  6. Shapiro, S.C.: Generalized augmented transition network grammars for generation from semantic networks. Am. J. Comput. Linguist. Arch. 8(1), 12–25 (1982)

  7. Ikehara, S., Miyazaki, M., Shirai, S., Yokoo, A., Nakaiwa, H., Ogura, K., Ooyama, Y., Hayashi, Y.: Goi-Taikei: A Japanese Lexicon (in Japanese). Iwanami Shoten (1997)

  8. Murakami, J., Hujinami, S.: Japanese-English parallel sentences collection from digital media (in Japanese). JCL Workshop 2012 (2012)

  9. Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., Dyer, C., Bojar, O., Constantin, A., Herbst, E.: Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics on Interactive Poster and Demonstration Sessions, pp. 177–180, Prague (2007)

  10. Papineni, K., Roukos, S., Ward, T., Zhu, W.-J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318, Philadelphia, Pennsylvania (2002)

  11. Snover, M., Dorr, B., Schwartz, R., Micciulla, L., Makhoul, J.: A study of translation edit rate with targeted human annotation. In: Proceedings of the 7th Conference of the Association for Machine Translation in the Americas, pp. 223–231, Cambridge (2006)

  12. Banerjee, S., Lavie, A.: METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In: Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics on Intrinsic and Extrinsic Evaluation Measures for MT and/or Summarization, pp. 65–72, Ann Arbor (2005)

  13. Isozaki, H., Hirao, T., Duh, K., Sudoh, K., Tsukada, H.: Automatic evaluation of translation quality for distant language pairs. In: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 944–952, Massachusetts (2010)

  14. Koehn, P., Monz, C.: Manual and automatic evaluation of machine translation between European languages. In: Proceedings of the Workshop on Statistical Machine Translation, pp. 102–121, New York (2006)

Acknowledgments

The development of a compound and complex sentence pattern dictionary, called “ToribankSPD”, was supported under the CREST program of the Japan Science and Technology Agency.

Author information

Corresponding author

Correspondence to Jun Sakata.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Sakata, J., Murakami, J., Tokuhisa, M., Murata, M. (2016). Machine Translation Method Based on Non-compositional Semantics (Word-Level Sentence-Pattern-Based MT). In: Hasida, K., Purwarianti, A. (eds) Computational Linguistics. PACLING 2015. Communications in Computer and Information Science, vol 593. Springer, Singapore. https://doi.org/10.1007/978-981-10-0515-2_16

  • DOI: https://doi.org/10.1007/978-981-10-0515-2_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-0514-5

  • Online ISBN: 978-981-10-0515-2

  • eBook Packages: Computer Science, Computer Science (R0)