DOI: 10.1145/1967486.1967548
research-article

Automatic linguistic knowledge acquisition for web-based translation and language learning

Published: 08 November 2010

Abstract

In this paper we present a new approach to the automatic acquisition of linguistic knowledge for machine translation, based on parallel corpora and bilingual lexica. We have implemented a first prototype of a Web-based Japanese-English translation system called JETCAT and built a Firefox extension that analyzes Japanese Web pages and translates sentences via Ajax. In addition, we visualize lexical and translation knowledge to offer a useful tool for Web-based language learning. Finally, the user can correct translation results and update the knowledge base, resulting in a fully customizable personal translation assistant.


Published In

iiWAS '10: Proceedings of the 12th International Conference on Information Integration and Web-based Applications & Services
November 2010
895 pages
ISBN:9781450304214
DOI:10.1145/1967486
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • IIWAS: International Organization for Information Integration
  • Web-b: Web-b

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. alignment
  2. language tools
  3. parallel corpora
  4. web-based language learning
  5. web-based machine translation
