Abstract
Cloud computing, an important source of computing power for the scientific community, requires enhanced tools for efficient use of resources. Current solutions for workflow execution lack frameworks that analyze applications in depth and account for realistic execution times and computation costs. In this study, we propose cloud user–provider affiliation (CUPA) to guide workflow owners in identifying the tools required to run their applications. Additionally, we develop PSO-DS, a specialized scheduling algorithm based on particle swarm optimization. CUPA encompasses the interaction of cloud resources, the workflow manager system and the scheduling algorithm. Its featured scheduler, PSO-DS, converges on a strategic distribution of tasks among resources that optimizes makespan and monetary cost. We compared the performance of PSO-DS against four well-known scientific workflow schedulers. In a test bed based on VMware vSphere, the schedulers mapped five up-to-date benchmarks representing different scientific areas. PSO-DS reduced the makespan and monetary cost of the tested workflows by 75 and 78%, respectively, when compared with the other algorithms. CUPA, with the featured PSO-DS, opens the path to developing a full system in which scientific cloud users can run their computationally expensive experiments.














Acknowledgements
The authors would like to thank the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and the Consejo Nacional de Ciencia y Tecnología (Conacyt) for supporting this work.
Cite this article
Casas, I., Taheri, J., Ranjan, R. et al. PSO-DS: a scheduling engine for scientific workflow managers. J Supercomput 73, 3924–3947 (2017). https://doi.org/10.1007/s11227-017-1992-z
DOI: https://doi.org/10.1007/s11227-017-1992-z