
Abstract

Many scientific experiments involve data-intensive applications and the orchestration of computational workflow activities. These can benefit from data parallelism, exploited in parallel systems to minimize execution time. Owing to their complexity, robustness and efficiency in exploiting data parallelism, grid infrastructures are widely used in some e-Science areas such as bioinformatics. Workflow techniques are very important to in-silico bioinformatics experiments, allowing the e-scientist to describe and enact experimental processes in a structured, repeatable and verifiable way. The main purpose of this paper is to describe our experience with the Taverna Workbench and PeDRo, which are part of the myGrid project. Taverna provides a workflow toolset and enactor, allowing the specification of processing units, data transfer and execution constraints. As a data entry tool, PeDRo provides a model, a controlled vocabulary and field validations for Web Service descriptions, leveraging the knowledge associated with the workflows. The main contribution of this work is a summary of considerations drawn from our experience with these tools, emphasizing their advantages and drawbacks, together with proposals for future improvements.



Editor information

Michel Daydé, José M. L. M. Palma, Álvaro L. G. A. Coutinho, Esther Pacitti, João Correia Lopes

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Ruberg, N. et al. (2007). Experiencing Data Grids. In: Daydé, M., Palma, J.M.L.M., Coutinho, Á.L.G.A., Pacitti, E., Lopes, J.C. (eds) High Performance Computing for Computational Science - VECPAR 2006. VECPAR 2006. Lecture Notes in Computer Science, vol 4395. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71351-7_56

  • DOI: https://doi.org/10.1007/978-3-540-71351-7_56

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71350-0

  • Online ISBN: 978-3-540-71351-7

  • eBook Packages: Computer Science, Computer Science (R0)
