
Reducing energy footprint in cloud computing: a study on the impact of clustering techniques and scheduling algorithms for scientific workflows

  • Regular Paper
  • Journal: Computing

Abstract

Scientific workflows make it possible to link and control different tasks in order to carry out complex processing. Distributed scientific applications generate complicated workflows that may contain thousands of tasks, and such a large number of tasks demands substantial computing capacity from cloud datacenters. The rate at which tasks are submitted for execution may overload hosts, which in turn may increase both the energy consumption and the makespan of workflows. Efficient techniques are therefore needed to save energy and time. Task clustering is one such technique: it combines multiple tasks into a single unit, called a job, to reduce the resource allocation time for the workflow’s tasks and consequently the makespan. The scheduling of task execution on cloud hosts may also affect energy consumption and makespan, so scheduling algorithms must be integrated carefully into the computation of workflows. In this study, we analyze the effect of task clustering techniques and scheduling algorithms on energy consumption and makespan when scientific workflows are computed on cloud infrastructure. For this purpose, we used WorkflowSim, an open-source cloud simulator that provides workflow-level support, task scheduling, and clustering techniques. The simulation results show that clustering techniques affect energy consumption and makespan regardless of the scheduling scheme deployed. However, some combinations of scheduling and clustering techniques may reduce the makespan and consequently the energy consumption; their impact depends mainly on the nature of the scientific workflow running in the cloud. The main observation from the simulations is that vertical clustering and the MaxMin algorithm are more suitable for saving energy.
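The two techniques highlighted in the abstract can be illustrated with a minimal Python sketch. It is not WorkflowSim code (WorkflowSim is a Java toolkit) and not the authors' implementation. The function names, the six-task graph, the job lengths, and the VM speeds are all hypothetical. Vertical clustering is assumed here to mean merging a task with its parent when the task has exactly one parent and that parent has exactly one child. MaxMin is the textbook heuristic applied to independent jobs, ignoring dependencies.

```python
def vertical_clustering(parents):
    """Group single-parent/single-child chains of tasks into jobs.

    parents maps each task name to the list of its parent task names;
    every parent must also appear as a key.
    """
    children = {task: [] for task in parents}
    for task, task_parents in parents.items():
        for parent in task_parents:
            children[parent].append(task)

    def merges_with_parent(task):
        # Assumed merge rule: exactly one parent, and that parent has one child.
        task_parents = parents[task]
        return len(task_parents) == 1 and len(children[task_parents[0]]) == 1

    jobs = []
    for task in parents:
        if merges_with_parent(task):
            continue  # already absorbed by the chain that starts at an ancestor
        job = [task]
        # Follow the chain downwards while the merge rule keeps holding.
        while len(children[job[-1]]) == 1 and merges_with_parent(children[job[-1]][0]):
            job.append(children[job[-1]][0])
        jobs.append(job)
    return jobs


def max_min_schedule(lengths, vm_speeds):
    """Textbook MaxMin for independent jobs.

    At each step, compute every job's minimum completion time over all VMs,
    then schedule the job whose minimum completion time is the largest.
    Returns (job -> VM index, makespan).
    """
    ready = [0.0] * len(vm_speeds)  # time at which each VM becomes free
    unscheduled = dict(lengths)
    assignment = {}
    while unscheduled:
        best = {}
        for name, length in unscheduled.items():
            best[name] = min(
                (ready[vm] + length / speed, vm)
                for vm, speed in enumerate(vm_speeds)
            )
        chosen = max(best, key=lambda name: best[name][0])
        completion, vm = best[chosen]
        assignment[chosen] = vm
        ready[vm] = completion
        del unscheduled[chosen]
    return assignment, max(ready)


# Hypothetical six-task workflow: a fans out to two pipelines that join at f.
workflow = {"a": [], "b": ["a"], "c": ["a"], "d": ["b"], "e": ["c"], "f": ["d", "e"]}
jobs = vertical_clustering(workflow)

# Hypothetical job lengths (instructions) and VM speeds (instructions per second).
assignment, makespan = max_min_schedule({"j1": 10, "j2": 4, "j3": 2}, [1.0, 2.0])
```

In this toy graph the two pipelines b, d and c, e each collapse into one job, so four jobs are submitted instead of six tasks. Fewer submissions mean fewer resource allocation steps, which is the mechanism by which the abstract says clustering shortens the makespan.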





Acknowledgements

The authors thank the anonymous reviewers for their valuable comments, which helped us to considerably improve the content, quality and presentation of this paper.

Author information

Corresponding author

Correspondence to Said El Kafhali.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saadi, Y., Jounaidi, S., El Kafhali, S. et al. Reducing energy footprint in cloud computing: a study on the impact of clustering techniques and scheduling algorithms for scientific workflows. Computing 105, 2231–2261 (2023). https://doi.org/10.1007/s00607-023-01182-w

  • DOI: https://doi.org/10.1007/s00607-023-01182-w
