Abstract
The Cloud, with its abundant on-demand processor, storage, and bandwidth capacities and its elastic billing models, has been emerging as a promising platform for scientific workflow computations. In practice, however, because provisioned resources are often used ineffectively or are subject to practical constraints, the best-effort model of allocating as many resources as possible from Clouds is not always cost-effective or feasible for cloud users computing their workflow applications. To address this problem, we study the effective use of a virtual cluster with a shared finite storage system to improve the performance of workflow scheduling. Since the concurrent execution of multiple instances of a workflow is subject to storage capacity constraints, deadlock resolution is our major concern in the performance optimization. To this end, we propose an effective admission control scheme (ACS) that integrates a set of deadlock resolution algorithms to admit workflow instances to the system based on the available storage capacity. With ACS, we can reduce contention for the finite storage and minimize the adverse impact of deadlock. We show the benefits of ACS via extensive simulation studies of the performance changes of a set of selected benchmark workflows. Our results demonstrate that the proposed ACS is a cost-effective way to fully utilize the provisioned storage resources for workflow scheduling in cloud virtual clusters.
Notes
A virtual cluster is a collection of virtual machines that have been configured to act like a traditional HPC cluster. This typically involves installing and configuring job management software, such as a batch scheduler, and a shared storage system (e.g., network/distributed file system).
These are the virtual block-based storage devices that provide virtual machines (VMs) with access to physical storage on local disk drives.
Storage units can be allocated from the shared file or storage system in the virtual cluster.
We do not count the memory cost as it can be attributed to the processor cost.
The critical path time of a workflow instance is informally defined as the total time of the most time-consuming sequence of jobs that must be carried out sequentially even if there are infinite resources for parallelism. It is therefore the minimum time that the workflow instance must take.
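As an illustration of this definition, the following minimal sketch computes the critical path time of a small workflow modelled as a directed acyclic graph. The function name `critical_path_time`, the job names, and the runtimes are hypothetical and chosen only for the example. It assumes that each job has a fixed runtime and that data transfer time is ignored.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_time(runtime, deps):
    """Return the length of the longest dependency chain in a workflow DAG.

    runtime: dict mapping each job to its execution time (assumed fixed).
    deps:    dict mapping a job to the set of jobs it must wait for.
    """
    graph = {job: set(deps.get(job, ())) for job in runtime}
    finish = {}
    # static_order() yields every job after all of its predecessors.
    for job in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in graph[job]), default=0)
        finish[job] = start + runtime[job]
    return max(finish.values(), default=0)


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d
runtime = {"a": 3, "b": 2, "c": 4, "d": 1}
deps = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(critical_path_time(runtime, deps))  # 8, along the chain a -> c -> d
```

Even with unlimited processors, job `d` cannot start before `c` finishes, so no schedule of this instance can complete in less than 8 time units.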
Dilworth’s theorem is equivalent to König’s theorem on bipartite matching, and the latter theorem can lead to an algorithm.
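One way to see how the matching formulation gives an algorithm is sketched below. It is an illustrative sketch and not necessarily the procedure used in the paper. It computes the width of the precedence order of a workflow DAG, that is, the size of its largest set of mutually independent jobs. It uses the standard fact that the minimum number of chains covering a partial order equals the number of elements minus a maximum matching in the bipartite graph of comparable pairs. The function names `reachable` and `poset_width` are hypothetical, and the matching is found with simple augmenting paths.

```python
def reachable(succ, src):
    """Return all jobs reachable from src by following successor edges."""
    seen, stack = set(), list(succ[src])
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(succ[v])
    return seen


def poset_width(succ):
    """Size of the largest antichain of a DAG given as job -> successors.

    Every job must appear as a key of succ (with an empty list if it has
    no successors).
    """
    # Comparable pairs (u, v) with u strictly before v: the transitive closure.
    closure = {u: reachable(succ, u) for u in succ}
    match_right = {}  # right-side copy of a job -> left-side job matched to it

    def try_augment(u, visited):
        # Search for an augmenting path starting at left-side job u.
        for v in closure[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_right or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    matching = sum(try_augment(u, set()) for u in succ)
    # Minimum chain cover = n - maximum matching = width (Dilworth).
    return len(succ) - matching


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(poset_width(diamond))  # 2: jobs b and c are independent of each other
```

In a workflow setting, the width bounds the number of jobs that can ever run concurrently within one instance.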
Cite this article
Wang, Y., Hu, M. & Kent, K.B. ACS: an effective admission control scheme with deadlock resolutions for workflow scheduling in clouds. Computing 97, 379–402 (2015). https://doi.org/10.1007/s00607-014-0409-6
DOI: https://doi.org/10.1007/s00607-014-0409-6