Parallel Machine Translation for gLite Based Grid Infrastructures

  • Conference paper
ICT Innovations 2010

Part of the book series: Communications in Computer and Information Science (CCIS, volume 83)

Abstract

Statistical machine translation is often criticized for slow decoding. We address this issue by presenting a new tool that enables Moses, a state-of-the-art machine translation system, to run on gLite-based Grid infrastructures. The tool implements a workflow model that distributes the decoding task equally among several worker nodes in a cluster. We report experimental results on the achievable speed-ups and outline how natural language processing researchers can use existing Grid infrastructures to address processing, storage and collaboration issues.
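The idea of distributing the decoding task equally among worker nodes can be illustrated with a minimal Python sketch. It is not the tool described in the paper: the names `split_evenly`, `translate_in_chunks` and the `decode` callback are hypothetical, and `decode` merely stands in for whatever a worker node would run on its share of the input. The sketch only shows the split, per-chunk processing, and in-order merge.

```python
from typing import Callable, List


def split_evenly(sentences: List[str], workers: int) -> List[List[str]]:
    """Split sentences into `workers` contiguous chunks whose sizes differ by at most one."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    base, extra = divmod(len(sentences), workers)
    chunks: List[List[str]] = []
    start = 0
    for i in range(workers):
        # The first `extra` chunks each take one additional sentence.
        size = base + (1 if i < extra else 0)
        chunks.append(sentences[start:start + size])
        start += size
    return chunks


def translate_in_chunks(
    sentences: List[str],
    workers: int,
    decode: Callable[[List[str]], List[str]],
) -> List[str]:
    """Apply `decode` to each chunk and concatenate the results in input order.

    `decode` is a hypothetical placeholder for the per-node decoding step;
    here the chunks are processed sequentially rather than on real nodes.
    """
    chunks = split_evenly(sentences, workers)
    translated = [decode(chunk) for chunk in chunks]
    return [line for chunk in translated for line in chunk]
```

Because the chunks are contiguous and merged in the same order, the output lines stay aligned with the input sentences, which is what makes the per-sentence decoding task straightforward to parallelize.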

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Stolić, M., Mišev, A. (2011). Parallel Machine Translation for gLite Based Grid Infrastructures. In: Gusev, M., Mitrevski, P. (eds) ICT Innovations 2010. ICT Innovations 2010. Communications in Computer and Information Science, vol 83. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19325-5_14

  • DOI: https://doi.org/10.1007/978-3-642-19325-5_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-19324-8

  • Online ISBN: 978-3-642-19325-5

  • eBook Packages: Computer Science, Computer Science (R0)
