An End-to-end Workflow Pipeline for Large-scale Grid Computing

Journal of Grid Computing

Abstract

In this paper we describe a service-based software architecture that enables end-to-end, high-level workflow processing in a large-scale Grid environment consisting of many heterogeneous resources. The architecture is essentially a pipeline that extends from the abstract specification of an application, through its deployment and execution, to the return of results to the user. It caters for flexible deployment, performance, reliability and charging for resource usage, and addresses each of these at the specification level as well as at the realisation (brokering) and execution levels. The proposed architecture is derived from previous work at the London e-Science Centre (LeSC), which produced the ICENI pipeline, and from our experience with e-Science projects such as GENIE, e-Protein and RealityGrid, from which we derive a set of key requirements.

Corresponding author

Correspondence to A. Stephen McGough.

Cite this article

McGough, A.S., Cohen, J., Darlington, J. et al. An End-to-end Workflow Pipeline for Large-scale Grid Computing. J Grid Computing 3, 259–281 (2005). https://doi.org/10.1007/s10723-005-9014-4

  • DOI: https://doi.org/10.1007/s10723-005-9014-4
