ABSTRACT
The PerCo performance control framework is capable of managing the distributed execution of scientific coupled models using migration, for example, in response to changes in an execution environment. PerCo monitors execution times and reacts according to an adaptive performance control strategy whenever serious changes of behaviour occur. A computationally cheap technique is used per model to smooth the series of monitored execution times and to provide a short-term forecast for future execution times on currently assigned resources. Where this short-term forecast fails to be achieved, the system analyses whether migration would improve matters. For models that are candidates for migration, more accurate but computationally expensive techniques are used to form a longer-term prediction of future execution times on various candidate resources. Based on the predicted gain, a migration decision is made taking account of the expected cost of migration. Experimental results for small real scientific coupled models show that the performance control strategy behaves effectively in scenarios in which the ambient load is varied during execution.
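The control loop described above can be illustrated with a minimal sketch. It is not PerCo's implementation: the abstract does not name the smoothing technique, so single exponential smoothing is assumed here as one computationally cheap option, and the names `Smoother`, `forecast_missed`, and `choose_migration`, the `alpha` and `tolerance` values, and the linear gain formula are all hypothetical. The longer-term per-resource predictions are taken as given inputs rather than computed.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Smoother:
    """Single exponential smoothing of monitored execution times (assumed technique)."""
    alpha: float = 0.3              # smoothing weight; illustrative value only
    level: Optional[float] = None   # current smoothed level, None until first sample

    def update(self, observed: float) -> float:
        """Fold one observation into the smoothed level and return it as the forecast."""
        if self.level is None:
            self.level = observed
        else:
            self.level = self.alpha * observed + (1 - self.alpha) * self.level
        return self.level


def forecast_missed(forecast: float, observed: float, tolerance: float = 0.2) -> bool:
    """True when the observed time exceeds the forecast by more than a relative tolerance."""
    return observed > forecast * (1 + tolerance)


def choose_migration(current_time: float,
                     predicted_times: Dict[str, float],
                     remaining_steps: int,
                     migration_cost: float) -> Optional[str]:
    """Return the candidate resource with the largest positive net gain, or None.

    Net gain (assumed form) = per-step saving * remaining steps - migration cost.
    """
    best, best_gain = None, 0.0
    for resource, predicted in predicted_times.items():
        gain = (current_time - predicted) * remaining_steps - migration_cost
        if gain > best_gain:
            best, best_gain = resource, gain
    return best
```

In this sketch a model would call `Smoother.update` after every timestep, and only when `forecast_missed` fires would the more expensive per-resource predictions be produced and passed to `choose_migration`, which declines to migrate when no candidate recovers the migration cost.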
Index Terms
- Adaptive performance control for distributed scientific coupled models