Task scheduling, resource provisioning, and load balancing on scientific workflows using parallel SARSA reinforcement learning agents and genetic algorithm

Asghari, Ali; Sohrabi, Mohammad Karim; Yaghmaee, Farzin

doi:10.1007/s11227-020-03364-1

Published: 06 July 2020

Volume 77, pages 2800–2828 (2021)

The Journal of Supercomputing


Abstract

Cloud computing is one of the most popular distributed environments, in which multiple powerful and heterogeneous resources are used by different user applications. Task scheduling and resource provisioning are two important challenges of the cloud environment, together referred to as cloud resource management. Resource management is a major problem for scientific workflows in particular, because of their heavy computations and the dependencies between their operations. Several algorithms and methods have been developed to manage cloud resources. In this paper, a combination of state-action-reward-state-action (SARSA) learning and a genetic algorithm is used to manage cloud resources. In the first step, intelligent agents schedule the tasks during the learning process by exploring the workflow. Then, in the resource provisioning step, each resource is assigned to an agent, and that agent's learning process attempts to maximize the utilization of the resource by selecting the most appropriate set of tasks for it. A genetic algorithm is used to bring the agents of the proposed method to convergence and to achieve global optimization. The fitness function of this genetic algorithm seeks more efficient resource utilization and better load balancing while observing the deadlines of the tasks. The experimental results show that the proposed algorithm reduces makespan, enhances resource utilization, and improves load balancing compared with MOHEFT and MCP, two well-known workflow scheduling algorithms from the literature.
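For readers unfamiliar with SARSA, the standard tabular update rule can be sketched as follows. This is a minimal, generic illustration of the textbook on-policy update, not the authors' implementation: the abstract does not specify the state, action, or reward definitions, so the names `sarsa_update` and `epsilon_greedy`, the dictionary-based Q-table, and the default values of `alpha` and `gamma` are all assumptions made for this example.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one.

    `q` maps (state, action) pairs to estimated values (hypothetical layout).
    """
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """Apply one on-policy SARSA step to the Q-table in place.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)),
    where a' is the action actually chosen in s' by the current policy.
    """
    td_target = reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


if __name__ == "__main__":
    rng = random.Random(0)
    q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
    actions = ["vm_a", "vm_b"]  # placeholder actions, e.g. candidate resources
    a0 = epsilon_greedy(q, "task_0", actions, epsilon=0.1, rng=rng)
    a1 = epsilon_greedy(q, "task_1", actions, epsilon=0.1, rng=rng)
    sarsa_update(q, "task_0", a0, reward=1.0, s_next="task_1", a_next=a1)
    print(q[("task_0", a0)])
```

Unlike Q-learning, which bootstraps from the best action in the next state, SARSA bootstraps from the action the policy actually takes, which is why the update needs the full (state, action, reward, next state, next action) tuple.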

Author information

Authors and Affiliations

Department of Computer Engineering, Semnan Branch, Islamic Azad University, Semnan, Iran
Ali Asghari, Mohammad Karim Sohrabi & Farzin Yaghmaee
Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
Farzin Yaghmaee

Corresponding author

Correspondence to Mohammad Karim Sohrabi.

About this article

Cite this article

Asghari, A., Sohrabi, M.K. & Yaghmaee, F. Task scheduling, resource provisioning, and load balancing on scientific workflows using parallel SARSA reinforcement learning agents and genetic algorithm. J Supercomput 77, 2800–2828 (2021). https://doi.org/10.1007/s11227-020-03364-1

Issue Date: March 2021
