DOI: 10.1145/2534248.2534255
research-article

Understanding workflows for distributed computing: nitty-gritty details

Published: 17 November 2013

ABSTRACT

Scientific workflow management is heavily used in our organization. After six years, a large number of workflows are available and regularly used to run biomedical data analysis experiments on distributed infrastructures, mostly on grids. In this paper we present our first efforts to better understand and characterize these workflows. We start with a set of considerations previously proposed in the literature (workflow dimensions and motifs) and revise them to describe more closely what we observe in our workflows. We conclude that workflow characteristics can be categorized at two levels: first, the features that characterize the distributed application and how to implement it as a workflow; second, workflow motifs that depend on the features of the selected workflow management system. These characteristics could help in the future to understand a larger set of workflows and to identify functional requirements for the further development of workflow management systems.


    • Published in

      WORKS '13: Proceedings of the 8th Workshop on Workflows in Support of Large-Scale Science
      November 2013
      133 pages
      ISBN: 9781450325028
      DOI: 10.1145/2534248

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      WORKS '13 paper acceptance rate: 13 of 16 submissions, 81%. Overall acceptance rate: 30 of 54 submissions, 56%.
