ABSTRACT
Scientific workflows executed in distributed environments often suffer from a variety of overheads. It is essential to identify these overheads and to evaluate how optimization methods reduce them and improve runtime performance. In this paper, we present an overhead analysis for a set of workflow runs on cloud and grid platforms. We present the overhead distributions and conclude that they follow either an exponential or a uniform distribution. We compare three methods for calculating the cumulative overhead, which differ in how they treat overheads that overlap in time. In addition, we show how experimental parameters affect the overheads and thereby overall workflow performance. We then show how popular optimization methods improve runtime performance by reducing some or all types of overhead.
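To illustrate why overlap matters when overheads are accumulated, the sketch below contrasts three ways of totalling overhead intervals. It is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not define its three methods, so the function names (`plain_sum`, `merged_length`, `exclusive_length`) and the sample intervals are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) timestamps in seconds


def plain_sum(intervals: List[Interval]) -> float:
    """Add up every duration; overlapping time is counted more than once."""
    return sum(end - start for start, end in intervals)


def merged_length(intervals: List[Interval]) -> float:
    """Length of the union of the intervals; overlapping time counts once."""
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap found: close the current merged interval and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def exclusive_length(intervals: List[Interval], others: List[Interval]) -> float:
    """Time covered by `intervals` but not by `others` (|A u B| - |B|)."""
    return merged_length(intervals + others) - merged_length(others)


# Hypothetical example: two overlapping delays of one overhead type,
# plus one interval belonging to a different overhead type.
queue_delays = [(0.0, 10.0), (5.0, 15.0)]
other_overhead = [(12.0, 20.0)]

total_plain = plain_sum(queue_delays)
total_merged = merged_length(queue_delays)
total_exclusive = exclusive_length(queue_delays, other_overhead)
```

In this example the plain sum counts the interval from 5 to 10 twice, the merged length counts it once, and the exclusive length additionally discards the time shared with the other overhead type. The three totals therefore differ for the same trace.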
Index Terms
- Workflow overhead analysis and optimizations