Abstract
For programming in distributed computing environments such as grids and clouds, the workflow has been adopted as an attractive paradigm because it can express a wide range of applications, including scientific computing, multi-tier Web, and big data processing applications. With the development of cloud technology and the extensive deployment of cloud platforms, workflow scheduling in the cloud has become an important research topic. The challenges of the problem lie in the NP-hard nature of task-resource mapping, diverse QoS requirements, on-demand resource provisioning, performance fluctuation and failure handling, hybrid resource scheduling, and data storage and transmission optimization. Consequently, a number of studies focusing on different aspects have emerged in the literature. In this paper, we first present a taxonomy and comparative review of workflow scheduling algorithms. We then survey workflow scheduling in cloud environments comprehensively, in a problem–solution manner. Based on this analysis, we also highlight some research directions for future investigation.




Notes
A task is also referred to as a node, subtask, activity, stage, job, or transformation in different works.
In this paper, cost refers to monetary cost, while the performance of an algorithm is described in terms of time, such as execution time.
The makespan is achieved by using the fastest resources for all tasks.
Yuan Y, Li X, Wang Q, Zhang Y (2008) Bottom level based heuristic for workflow scheduling in grids. Chin J Comput Chin 31(2):282
Yuan Y, Li X, Wang Q, Zhu X (2009) Deadline division-based heuristic for cost optimization in workflow scheduling. Inf Sci 179(15):2562–2575
Zaharia M, Chowdhury M, Franklin MJ, Shenker S, Stoica I (2010) Spark: cluster computing with working sets. In: Proceedings of the 2nd USENIX conference on hot topics in cloud computing, p 10
Zeng L, Veeravalli B, Li X (2012) Scalestar: Budget conscious scheduling precedence-constrained many-task workflow applications in cloud. In: Proceedings of IEEE 26th international conference on advanced information networking and applications (AINA), IEEE, pp 534–541
Zhang Y, Koelbel C, Cooper K (2009) Hybrid re-scheduling mechanisms for workflow applications on multi-cluster grid. In: 9th IEEE/ACM international symposium on cluster computing and the grid, IEEE, pp 116–123
Zhao H, Sakellariou R (2006) Scheduling multiple dags onto heterogeneous systems. In: Proceedings of 20th international parallel and distributed processing symposium, IEEE, p 14
Zheng W, Sakellariou R (2012) Budget-deadline constrained workflow planning for admission control in market-oriented environments. In: Proceedings of economics of grids, clouds, systems, and services, Springer, New York, pp 105–119
Zheng W, Sakellariou R (2013) Budget-deadline constrained workflow planning for admission control. J Grid Comput 11(4):633–651
Zheng W, Sakellariou R (2013) Stochastic dag scheduling using a monte carlo approach. J Parallel Distrib Comput 73(12):1673–1689
Zhou AC, He B (2014) Transformation-based monetary cost optimizations for workflows in the cloud. IEEE Trans Cloud Comput 2(1):85–98
Zhou AC, He B, Liu C (2013) Monetary cost optimizations for hosting workflow-as-a-service in iaas clouds. arXiv:1306.6410
Zhu D, Melhem R, Childers BR (2003) Scheduling with dynamic voltage/speed adjustment using slack reclamation in multiprocessor real-time systems. IEEE Trans Parallel Distrib Syst 14(7):686–700
Zhu Q, Zhu J, Agrawal G (2010) Power-aware consolidation of scientific workflows in virtualized environments. In: Proceedings of the 2010 ACM/IEEE international conference for high performance computing, networking, storage and analysis, IEEE Computer Society, pp 1–12
Zitzler E, Laumanns M, Thiele L (2001) Spea 2: improving the strength pareto evolutionary algorithm
Zomaya AY, Ward C, Macey B (1999) Genetic scheduling for parallel processor systems: comparative studies and performance issues. IEEE Trans Parallel Distrib Syst 10(8):795–812
Acknowledgments
This work is supported by the National 863 Program of China (Grant No. 2013AA01A212), the National Natural Science Foundation of China (Grant No. 61202121), and a science and technology project in Guangzhou, China (Grant No. 2013Y2-00043).
Cite this article
Wu, F., Wu, Q. & Tan, Y. Workflow scheduling in cloud: a survey. J Supercomput 71, 3373–3418 (2015). https://doi.org/10.1007/s11227-015-1438-4
DOI: https://doi.org/10.1007/s11227-015-1438-4