Abstract

Cloud computing now offers an attractive alternative to building large-scale distributed computing environments: resources are no longer hosted by the scientists' own computational facilities, but leased from specialised data centres only when, and for as long as, they are needed. This new class of Cloud resources raises interesting research questions in the fields of resource management, scheduling, fault tolerance, and quality of service, which require hundreds to thousands of experiments to find valid solutions. To enable such research, a scalable simulation framework is typically required for early prototyping, extensive testing, and validation of results before the real deployment is performed. The scope of this paper is twofold. In the first part, we present GroudSim, a Grid and Cloud simulation toolkit for scientific computing based on a scalable simulation-independent discrete-event engine. GroudSim provides a comprehensive set of features for complex simulation scenarios, ranging from simple job executions on leased computing resources to file transfers, calculation of costs, and background load on resources. Simulations can be parameterised and are easily extended with probability distribution packages that model the failures which normally occur in complex distributed environments. Experimental results demonstrate the improved scalability of GroudSim compared to a related process-based simulation approach. In the second part, we use the GroudSim simulator to analyse the problem of dynamically provisioning Cloud resources to scientific workflows for which the available Grid resources are insufficient to meet their computational demands. We propose and study four strategies for provisioning and releasing Cloud resources that take into account the general leasing model encountered in today's commercial Cloud environments, which is based on resource bulks, fuzzy descriptions, and hourly payment intervals. We study the impact of our techniques on the overall execution time, the overall cost, and the cost per unit of saved time with respect to the various instance types offered by Amazon EC2.