DOI: 10.1145/1851476.1851522
research-article

Morco: middleware framework for long-running multi-component applications on batch grids

Published: 21 June 2010

Abstract

While computational grids with multiple batch systems (batch grids) have been used for efficient execution of loosely coupled and workflow-based parallel applications, they can also be powerful infrastructures for executing long-running multi-component parallel applications. In this paper, we present a generic middleware framework for executing long-running multi-component applications whose execution times far exceed the execution time limits of batch queues. Our framework coordinates the distribution, execution, migration and restart of the application's components across multiple queues, where the component jobs in different queues can have different queue waiting and startup times. We have used our framework with a leading long-running multi-component application for climate modeling, the Community Climate System Model (CCSM). We performed real multi-site CCSM runs for 6.5 days of wallclock time, spanning three sites with four queues and emulated external workloads. Our experiments indicate that multi-site execution can yield good application throughput.
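To make the time-limit constraint concrete, the following is a minimal sketch and not Morco's actual scheduling logic. It only shows why a run that outlasts a queue's execution time limit has to be split into a chain of checkpoint-and-restart jobs. The names `Segment` and `plan_segments` and the 24-hour queue limit are hypothetical. The 156-hour run length simply borrows the 6.5-day figure from the abstract as an illustration. Queue waiting time, startup time, checkpoint overhead and multi-queue placement are all ignored.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    index: int       # position of this job in the restart chain
    start_hour: int  # hours of the run already completed when this job starts
    run_hours: int   # hours this job runs before it checkpoints or finishes


def plan_segments(total_hours: int, queue_limit_hours: int) -> list[Segment]:
    """Split a long run into jobs that each fit within one queue time limit.

    Hypothetical helper for illustration only; it assumes a single queue
    and zero cost for checkpointing and restarting.
    """
    if total_hours < 0 or queue_limit_hours <= 0:
        raise ValueError("total_hours must be >= 0 and queue_limit_hours > 0")

    segments: list[Segment] = []
    done = 0
    while done < total_hours:
        # Each job runs until the queue limit or until the work is finished.
        run = min(queue_limit_hours, total_hours - done)
        segments.append(Segment(len(segments), done, run))
        done += run
    return segments


if __name__ == "__main__":
    # 156 hours = 6.5 days; the 24-hour limit is an assumed value.
    for seg in plan_segments(156, 24):
        print(seg)
```

Under these assumptions the run needs seven jobs, the last of which covers the remaining 12 hours. A framework such as the one described additionally has to decide which queue each restart goes to and to cope with differing waiting and startup times, which this sketch leaves out.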

Published In

HPDC '10: Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
June 2010
911 pages
ISBN: 9781605589428
DOI: 10.1145/1851476

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. batch systems
  2. check-pointing
  3. climate models
  4. migration
  5. multi-component applications
  6. rescheduling

Qualifiers

  • Research-article

Conference

HPDC '10

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%
