PSO-DS: a scheduling engine for scientific workflow managers

The Journal of Supercomputing

Abstract

Cloud computing, an important source of computing power for the scientific community, requires better tools to use resources efficiently. Current solutions for workflow execution lack frameworks that analyze applications in depth and account for realistic execution times and computation costs. In this study, we propose the cloud user–provider affiliation (CUPA) to guide workflow owners in identifying the tools required to run their applications. Additionally, we develop PSO-DS, a specialized scheduling algorithm based on particle swarm optimization. CUPA encompasses the interaction of cloud resources, the workflow manager system and the scheduling algorithm. Its featured scheduler, PSO-DS, converges on a strategic distribution of tasks among resources that optimizes makespan and monetary cost. We compared the performance of PSO-DS against four well-known scientific workflow schedulers. In a test bed based on VMware vSphere, the schedulers mapped five up-to-date benchmarks representing different scientific areas. PSO-DS proved its efficiency by reducing the makespan and monetary cost of the tested workflows by 75% and 78%, respectively, when compared with the other algorithms. CUPA, with the featured PSO-DS, opens the path to developing a full system in which scientific cloud users can run their computationally expensive experiments.
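
To make the idea of a particle-swarm scheduler concrete, the following is a minimal sketch of a generic PSO that maps tasks to virtual machines while trading off makespan against monetary cost. It is not the PSO-DS algorithm described in the article. All names (evaluate, decode, pso_schedule), the weighting factor alpha, the PSO coefficients and the sample data are assumptions made for illustration, and the tasks are treated as independent, so workflow dependencies and data transfers are ignored.

```python
import random


def evaluate(mapping, runtimes, prices):
    """Return (makespan, cost) for a task -> VM mapping of independent tasks.

    runtimes[t][v] is the assumed run time of task t on VM v;
    prices[v] is the assumed price per time unit of VM v.
    """
    busy = [0.0] * len(prices)
    for task, vm in enumerate(mapping):
        busy[vm] += runtimes[task][vm]
    makespan = max(busy)
    cost = sum(b * p for b, p in zip(busy, prices))
    return makespan, cost


def decode(position, n_vms):
    """Turn a continuous particle position into one VM index per task."""
    return [min(n_vms - 1, max(0, int(x))) for x in position]


def pso_schedule(runtimes, prices, n_particles=20, iterations=100,
                 w=0.7, c1=1.5, c2=1.5, alpha=0.5, seed=0):
    """Generic inertia-weight PSO; alpha weights makespan against cost."""
    rng = random.Random(seed)
    n_tasks, n_vms = len(runtimes), len(prices)

    def fitness(position):
        makespan, cost = evaluate(decode(position, n_vms), runtimes, prices)
        # Illustrative weighted sum; mixing time and money units is an assumption.
        return alpha * makespan + (1 - alpha) * cost

    pos = [[rng.uniform(0, n_vms) for _ in range(n_tasks)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_tasks for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(n_tasks):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the position inside the valid VM index range.
                pos[i][d] = min(n_vms - 1e-9, max(0.0, pos[i][d] + vel[i][d]))
            fit = fitness(pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit

    mapping = decode(gbest, n_vms)
    return mapping, evaluate(mapping, runtimes, prices)


if __name__ == "__main__":
    # Hypothetical data: 3 tasks, 2 VMs.
    sample_runtimes = [[4, 2], [3, 6], [5, 5]]
    sample_prices = [1.0, 2.0]
    print(pso_schedule(sample_runtimes, sample_prices))
```

A weighted sum is only one way to combine the two objectives; a real scheduler would also have to respect task precedence and data-transfer times, which this sketch leaves out.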

Acknowledgements

The authors would like to thank the Commonwealth Scientific and Industrial Research Organisation (CSIRO) and the Consejo Nacional de Ciencia y Tecnología (Conacyt) for supporting this work.

Author information

Corresponding author

Correspondence to Israel Casas.

About this article

Cite this article

Casas, I., Taheri, J., Ranjan, R. et al. PSO-DS: a scheduling engine for scientific workflow managers. J Supercomput 73, 3924–3947 (2017). https://doi.org/10.1007/s11227-017-1992-z

  • DOI: https://doi.org/10.1007/s11227-017-1992-z