ACS: an effective admission control scheme with deadlock resolutions for workflow scheduling in clouds

Computing

Abstract

The Cloud, with its abundant on-demand processor, storage, and bandwidth capacities and its elastic billing models, has emerged as a promising platform for scientific workflow computations. In practice, however, because provisioned resources are often used ineffectively or are subject to practical constraints, the best-effort model of allocating as many resources as possible from Clouds is not always cost-effective or feasible for cloud users computing their workflow applications. To address this problem, we study the effective use of a virtual cluster with a shared, finite storage system to improve the performance of workflow scheduling. Since the concurrent execution of multiple instances of a workflow is subject to storage capacity constraints, deadlock resolution is our major concern in the performance optimization. To this end, we propose an effective admission control scheme (ACS) that integrates a set of deadlock resolution algorithms and admits workflow instances to the system based on the available storage capacity. ACS reduces contention for the finite storage and minimizes the adverse impact of deadlock. We show the benefits of ACS through intensive simulation studies of the performance changes of a set of selected benchmark workflows. Our results demonstrate that the proposed ACS is a cost-effective way to fully utilize the provisioned storage resources for workflow scheduling in cloud virtual clusters.
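
To make the idea of admitting workflow instances against a finite storage budget concrete, the following is a minimal sketch and not the ACS algorithm itself, whose details are in the full text. It assumes that each instance's peak storage demand is known in advance and reserves that peak at admission time. The class `StorageAdmissionController` and its methods are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class StorageAdmissionController:
    """Hypothetical admission gate over a shared, finite storage capacity.

    Assumption (not from the paper): every workflow instance declares its
    peak storage demand up front, and that peak is reserved on admission.
    """
    capacity: int
    reserved: Dict[str, int] = field(default_factory=dict)

    def available(self) -> int:
        # Storage units not yet reserved by any admitted instance.
        return self.capacity - sum(self.reserved.values())

    def try_admit(self, instance_id: str, peak_demand: int) -> bool:
        # Admit only if the whole peak demand fits in what is left.
        if instance_id in self.reserved:
            raise ValueError(f"instance {instance_id!r} is already admitted")
        if peak_demand <= self.available():
            self.reserved[instance_id] = peak_demand
            return True
        return False

    def release(self, instance_id: str) -> None:
        # Called when an instance finishes and its files are removed.
        self.reserved.pop(instance_id, None)


controller = StorageAdmissionController(capacity=10)
first_admitted = controller.try_admit("instance-a", 6)   # fits: 6 <= 10
second_admitted = controller.try_admit("instance-b", 5)  # rejected: 5 > 4
controller.release("instance-a")
retry_admitted = controller.try_admit("instance-b", 5)   # fits after release
```

Reserving the full peak is deliberately conservative: admitted instances can never block one another on storage, but concurrency is lower than a scheme that admits more aggressively and resolves deadlocks when they occur, which is the trade-off the abstract points to.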

Notes

  1. A virtual cluster is a collection of virtual machines that have been configured to act like a traditional HPC cluster. This typically involves installing and configuring job management software, such as a batch scheduler, and a shared storage system (e.g., network/distributed file system).

  2. Virtual block-based storage devices provide virtual machines (VMs) with access to physical storage on local disk drives.

  3. Storage units can be allocated from the shared file or storage system in the virtual cluster.

    Fig. 1

    Two configurations for computing an \(m\)-stage pipeline workload in clouds. Each circle represents a job, and the number beside it is the number of processors (VCPUs) that the job requires. The first job writes the output file “foo”, which is the input of the second job in the pipeline workflow

  4. We do not count the memory cost as it can be attributed to the processor cost.

  5. The critical path time of a workflow instance is informally defined as the total time of the most time-consuming sequence of jobs that must be carried out sequentially, even if unlimited resources are available for parallelism. It is therefore the minimum time that the workflow instance must take.

  6. Dilworth’s theorem is equivalent to König’s theorem on bipartite matching, and the latter theorem leads to an algorithm.
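
The connection in Note 6 can be sketched in code. How the paper applies the theorem is not visible in this excerpt, so the following is only the standard textbook construction: the width of a job dependency DAG (the size of its largest set of mutually independent jobs) equals the number of jobs minus the size of a maximum matching in a bipartite graph built from the DAG's transitive closure. The function names and the integer job labelling `0..n-1` are assumptions made for this illustration.

```python
from typing import List, Set, Tuple


def transitive_closure(n: int, edges: List[Tuple[int, int]]) -> List[Set[int]]:
    """reach[u] = every job that must run after job u (direct or indirect)."""
    adj: List[List[int]] = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    reach: List[Set[int]] = [set() for _ in range(n)]
    for s in range(n):
        stack = list(adj[s])
        while stack:
            v = stack.pop()
            if v not in reach[s]:
                reach[s].add(v)
                stack.extend(adj[v])
    return reach


def max_bipartite_matching(n: int, reach: List[Set[int]]) -> int:
    """Maximum matching between 'left' copies u and 'right' copies v,
    with an edge (u, v) whenever v is reachable from u (augmenting paths)."""
    match_right = [-1] * n  # match_right[v] = left vertex matched to v

    def try_augment(u: int, seen: Set[int]) -> bool:
        for v in reach[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_right[v] == -1 or try_augment(match_right[v], seen):
                match_right[v] = u
                return True
        return False

    return sum(1 for u in range(n) if try_augment(u, set()))


def dag_width(n: int, edges: List[Tuple[int, int]]) -> int:
    """Minimum number of chains covering all jobs, which by Dilworth's
    theorem equals the size of the largest antichain."""
    reach = transitive_closure(n, edges)
    return n - max_bipartite_matching(n, reach)


# Diamond-shaped workflow: job 0 feeds jobs 1 and 2, which both feed job 3.
diamond_width = dag_width(4, [(0, 1), (0, 2), (1, 3), (2, 3)])
```

For the diamond above, `diamond_width` is 2, since jobs 1 and 2 are the only pair with no ordering between them. The simple recursive augmenting-path search is adequate for small workflows; a larger graph would call for an iterative or Hopcroft-Karp implementation.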

Author information

Corresponding author

Correspondence to Yang Wang.

About this article

Cite this article

Wang, Y., Hu, M. & Kent, K.B. ACS: an effective admission control scheme with deadlock resolutions for workflow scheduling in clouds. Computing 97, 379–402 (2015). https://doi.org/10.1007/s00607-014-0409-6

  • DOI: https://doi.org/10.1007/s00607-014-0409-6