Abstract
Scientific workflows make it possible to link and control different tasks in order to carry out complex processing. Distributed scientific applications generate complicated workflows that may contain thousands of tasks. Such a large number of tasks requires substantial computing capacity in cloud datacenters. The rate at which tasks arrive for execution in the cloud may overload hosts, which can increase both the energy consumption and the makespan of workflows. Efficient techniques are therefore necessary to save energy and time. Task clustering is one such technique: it combines multiple tasks into a single unit, called a job, to reduce the resource allocation time for the workflow’s tasks and consequently reduce the makespan. In addition, the way task execution is scheduled on cloud hosts may affect energy consumption and makespan, so scheduling algorithms must be integrated carefully into the computation of workflows. In this study, we analyze the effect of task clustering techniques and scheduling algorithms on energy consumption and makespan when scientific workflows are computed on cloud infrastructure. For this purpose, we used WorkflowSim, an open-source cloud simulator that provides workflow-level support, task scheduling, and clustering techniques. The simulation results show that clustering techniques affect energy consumption and makespan regardless of the deployed scheduling scheme. However, some combinations of scheduling and clustering techniques may reduce the makespan and consequently the energy consumption; their impact depends mainly on the nature of the scientific workflow running in the cloud. The main observation from the simulation results is that Vertical clustering and the MaxMin algorithm are more suitable for saving energy.
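The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the WorkflowSim implementation and does not model energy; the function names, the toy DAG, the task lengths and the VM speeds are all hypothetical. Vertical clustering is assumed here to merge each single-parent, single-child chain of tasks into one job, and Max-Min is assumed to pick, at every step, the task whose best achievable completion time is largest.

```python
from typing import Dict, List, Tuple


def vertical_clustering(children: Dict[str, List[str]]) -> List[List[str]]:
    """Group tasks into jobs by merging single-parent/single-child chains.

    `children` maps every task to the list of its direct successors
    (tasks without successors map to an empty list).
    """
    parents: Dict[str, List[str]] = {t: [] for t in children}
    for parent, succs in children.items():
        for child in succs:
            parents[child].append(parent)

    jobs: List[List[str]] = []
    for task in children:
        # A task that merely continues its parent's chain is absorbed
        # when the chain's first task is processed, so skip it here.
        continues_chain = (
            len(parents[task]) == 1 and len(children[parents[task][0]]) == 1
        )
        if continues_chain:
            continue
        job = [task]
        current = task
        # Follow the chain while the link is one-to-one.
        while len(children[current]) == 1 and len(parents[children[current][0]]) == 1:
            current = children[current][0]
            job.append(current)
        jobs.append(job)
    return jobs


def max_min_schedule(
    lengths: Dict[str, float], vm_speeds: List[float]
) -> Tuple[Dict[str, int], float]:
    """Map independent tasks to VMs with the Max-Min heuristic.

    Returns the task-to-VM assignment and the resulting makespan.
    """
    ready = [0.0] * len(vm_speeds)  # time at which each VM becomes free
    pending = dict(lengths)
    assignment: Dict[str, int] = {}
    while pending:
        # Earliest completion time (and the VM achieving it) for each task.
        best = {
            t: min((ready[v] + length / vm_speeds[v], v) for v in range(len(vm_speeds)))
            for t, length in pending.items()
        }
        # Max-Min: schedule the task whose earliest completion time is largest.
        task = max(best, key=lambda t: best[t][0])
        finish, vm = best[task]
        assignment[task] = vm
        ready[vm] = finish
        del pending[task]
    return assignment, max(ready)


# Hypothetical toy workflow: a fans out to two pipelines that join at f.
dag = {"a": ["b", "d"], "b": ["c"], "c": ["f"], "d": ["e"], "e": ["f"], "f": []}
jobs = vertical_clustering(dag)

# Hypothetical task lengths (instructions) and VM speeds (instructions/second).
assignment, makespan = max_min_schedule({"t1": 100, "t2": 400, "t3": 200}, [1.0, 2.0])
```

In this toy run the two pipelines b-c and d-e each collapse into one job, so six tasks need only four resource allocations, and Max-Min places the longest task on the fastest VM first.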
Acknowledgements
The authors thank the anonymous reviewers for their valuable comments, which helped us to considerably improve the content, quality and presentation of this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Saadi, Y., Jounaidi, S., El Kafhali, S. et al. Reducing energy footprint in cloud computing: a study on the impact of clustering techniques and scheduling algorithms for scientific workflows. Computing 105, 2231–2261 (2023). https://doi.org/10.1007/s00607-023-01182-w
DOI: https://doi.org/10.1007/s00607-023-01182-w