DOI: 10.1145/2110497.2110500
research-article

Workflow overhead analysis and optimizations

Published: 14 November 2011

ABSTRACT

The execution of scientific workflows often suffers from a variety of overheads in distributed environments. It is essential to identify the different overheads and to evaluate how optimization methods help reduce them and improve runtime performance. In this paper, we present an overhead analysis for a set of workflow runs on cloud and grid platforms. We present the overhead distributions and conclude that they follow an exponential or uniform distribution. We compare three methods to calculate the cumulative sum of the overheads based on how they overlap. In addition, we indicate how experimental parameters impact the overhead and thereby the overall workflow performance. We then show how popular optimization methods improve runtime performance by reducing some or all types of overheads.
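The abstract does not spell out the three cumulative-sum methods, so the following is only an illustrative sketch of why overlap matters when adding up overheads. It assumes each overhead is recorded as a hypothetical `(start, end)` interval in seconds and contrasts two generic ways of totaling them: a plain sum, which counts overlapped time more than once, and the length of the union, which counts it once. The function names and the interval representation are assumptions made for this example, not the paper's definitions.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: one overhead = (start, end) in seconds.
Interval = Tuple[float, float]


def simple_sum(intervals: List[Interval]) -> float:
    """Add every overhead duration; overlapped time is counted repeatedly."""
    return sum(end - start for start, end in intervals)


def union_length(intervals: List[Interval]) -> float:
    """Wall-clock time covered by at least one overhead (overlap counted once)."""
    total = 0.0
    cur_start: Optional[float] = None
    cur_end: Optional[float] = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap before this interval: close the merged run so far.
            if cur_start is not None and cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # This interval overlaps the current run: extend it if needed.
            cur_end = max(cur_end, end)
    if cur_start is not None and cur_end is not None:
        total += cur_end - cur_start
    return total


if __name__ == "__main__":
    overheads = [(0.0, 4.0), (2.0, 6.0), (10.0, 12.0)]
    print(simple_sum(overheads))    # 10.0
    print(union_length(overheads))  # 8.0
```

In this made-up example the first two overheads overlap for two seconds, so the plain sum (10 s) exceeds the union (8 s). Which accounting is appropriate depends on whether the goal is to attribute time to each overhead type or to measure its effect on wall-clock runtime.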


    Published in

      WORKS '11: Proceedings of the 6th workshop on Workflows in support of large-scale science
      November 2011
      154 pages
      ISBN: 9781450311007
      DOI: 10.1145/2110497

      Copyright © 2011 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Acceptance Rates

      Overall Acceptance Rate: 30 of 54 submissions, 56%
