A novel multi-agent reinforcement learning approach for job scheduling in Grid computing

https://doi.org/10.1016/j.future.2010.10.009

Abstract

Grid computing utilizes distributed heterogeneous resources to support large-scale or complicated computing tasks, and an appropriate resource scheduling algorithm is fundamentally important for the success of Grid applications. Due to the complex and dynamic properties of Grid environments, traditional model-based methods may deliver poor scheduling performance in practice. Scalability and adaptability are among the key objectives of Grid job scheduling. In this paper, a novel multi-agent reinforcement learning method, called the ordinal sharing learning (OSL) method, is proposed for job scheduling problems, especially for realizing load balancing in Grids. The approach circumvents the scalability problem by using an ordinal distributed learning strategy, and realizes multi-agent coordination based on an information-sharing mechanism with limited communication. Simulation results show that the OSL method achieves the goal of load balancing effectively, and that its performance is comparable to that of a centralized scheduling algorithm in most cases. The convergence and adaptability of the proposed method are also illustrated.

Research highlights

  • We propose a novel multi-agent reinforcement learning method for job scheduling in Grid computing.
  • The proposed approach circumvents the scalability problem by using an ordinal distributed learning strategy.
  • We realize multi-agent coordination based on an information-sharing mechanism with limited communication.
  • Simulation results show that the OSL method can achieve the goal of load balancing effectively.

Introduction

Multi-agent resource allocation is the process of distributing a number of items amongst a number of agents, and it is a central concern in both computer science and economics [1]. It is relevant to a wide range of application domains, such as network routing [2], public transportation [3], and Grid computing [4], [5], [6], of which Grid computing is one of the most important applications of resource allocation and scheduling [7].

Grid computing enables the sharing, selection, and aggregation of geographically distributed heterogeneous resources, and it has become an important paradigm for supporting complicated computing problems. However, some technical challenges remain for Grids [5]. For a majority of Grid systems, the real and specific problem underlying Grid computing is coordinated resource scheduling and problem solving in dynamic, multi-institutional virtual organizations, where an effective and efficient scheduling algorithm is fundamentally important [8], [9]. Only with a feasible scheduling policy can a Grid speed up task processing and provide non-trivial services to users [10]. In the following, we study the job scheduling problem, whose key issue is to balance the load of the entire system while completing all jobs at hand as soon as possible (see Fig. 1).

In the past decade, there have been many advances in Grid job scheduling techniques. Various scheduling approaches, model-based or model-free, using either centralized or decentralized mechanisms, have been developed for Grids. On the one hand, many algorithms have been studied for job scheduling in traditional parallel and distributed systems, such as FPLTF (Fastest Processor to Largest Task First), WQR (Work Queue with Replication) and FCFS (First Come First Serve) [11]. On the other hand, extensive research has also been devoted to Grid scheduling problems. In traditional resource scheduling systems, such as Condor [12], PBS [13] and SGE [14], centralized schedulers work effectively because accurate, global information can be obtained. However, centralized or hierarchical resource allocation methods may suffer from a lack of scalability and fault tolerance, as well as a single point of failure [15]. To overcome the scalability problem, some decentralized scheduling algorithms have been proposed. However, most existing decentralized schedulers, for example in Condor-G [16] and AppLeS [17], perform individual scheduling policies regardless of the other schedulers’ decisions, which may lead to serious synchronization problems in resource management; herd behavior then arises because schedulers run without central oversight or communication [18], [19]. Conversely, if job scheduling is carried out under the assumption of coordination, as in Legion Federation [20] and Condor Flock P2P [21], the strong dependency on negotiation among schedulers and resources may lead to high communication overhead. Therefore, how to coordinate decentralized schedulers at a moderate communication cost is an important and open problem. Recent work addressing this problem appears in [22], where a collaborative model based on Random Early Detection (RED) strategies via gossiping achieves good scheduling performance.

Moreover, scheduling must adapt to the heterogeneity of resources, the variations in resource performance, and the diversity of applications, so an adaptive scheduling method is needed. Recently, a promising approach based on reinforcement learning (RL) has been studied for job scheduling and resource allocation in Grids [23]. As an important class of machine learning methods, RL solves uncertain decision-making problems by interacting with the environment, so that near-optimal or suboptimal policies can be obtained in a data-driven way [24]. RL therefore provides a model-free methodology and is very promising for overcoming the difficulties of Grid resource scheduling. According to their learning mechanisms, existing RL approaches to resource scheduling fall mainly into two types: one based on policy gradient learning algorithms [6], [25], [26] and the other on value-function-based learning algorithms [5], [23], [27]. However, the learning efficiency and scalability of existing RL methods for Grid resource allocation still need to be improved for large-scale Grid computing applications.
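As a rough illustration of the value-function idea behind the second class of methods, consider a scheduling agent that keeps one utility estimate per resource and updates it from observed job feedback. The Python sketch below uses a generic bandit-style running update and greedy selection; it is a simplification under stated assumptions, not the specific algorithm of [5], [23] or [27].

    # Minimal sketch of a value-function-based scheduling agent.
    # The reward signal and the running update are generic placeholders.
    class ValueSchedulingAgent:
        def __init__(self, num_resources, alpha=0.1):
            self.alpha = alpha                    # learning rate
            self.values = [0.0] * num_resources  # estimated utility per resource

        def select_resource(self):
            # Greedy selection: the resource with the highest estimated utility.
            return max(range(len(self.values)), key=self.values.__getitem__)

        def update(self, resource, reward):
            # Running update from observed feedback, e.g.
            # reward = -response_time of the job that just completed.
            self.values[resource] += self.alpha * (reward - self.values[resource])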

In this paper, to realize learning-based coordination and generalization in large-scale Grid environments, a novel multi-agent reinforcement learning method, called the ordinal sharing learning (OSL) method, is proposed to solve the job scheduling problem in Grid computing. In the OSL method, a fast distributed learning algorithm is designed based on an ordinal information-sharing mechanism. Compared with previous multi-agent RL (MARL) methods for job scheduling, the OSL method offers two innovations. First, it simplifies the modeling of optimal decision-making in job scheduling: only a utility table is learned online to estimate the efficiency of resources, instead of building a complex Grid Information System (GIS). Second, it circumvents the scalability and coordination problems through an efficient information-sharing mechanism with limited communication, in which an ordinal sharing strategy lets all agents share their utility tables and make decisions in turn. The proposed approach was evaluated in a simulated large-scale Grid computing environment, and the results show its validity and feasibility.
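To make the ordinal-sharing idea concrete, the following Python sketch shows one way such a mechanism could be organized: each scheduler receives the utility table from its predecessor, assigns its job to the apparently most efficient resource, charges the job's load against that entry, and passes the table on. All names here (Job, Scheduler, dispatch_round) are hypothetical; the actual OSL utility update is defined in Section 3 of the paper.

    from collections import namedtuple

    Job = namedtuple("Job", ["job_id", "load"])

    class Scheduler:
        def __init__(self, agent_id, num_resources):
            self.agent_id = agent_id
            self.utility = [0.0] * num_resources  # local utility table

        def schedule(self, job, shared_utility):
            # Adopt the predecessor's table, send the job to the resource
            # that currently looks most efficient, and charge the job's
            # load against that entry so the successor sees the assignment.
            self.utility = list(shared_utility)
            target = min(range(len(self.utility)), key=self.utility.__getitem__)
            self.utility[target] += job.load
            return target

    def dispatch_round(schedulers, jobs):
        # Agents act in a fixed (ordinal) order; the only communication is
        # the hand-off of one utility table from each agent to the next.
        shared = schedulers[0].utility
        assignments = []
        for scheduler, job in zip(schedulers, jobs):
            target = scheduler.schedule(job, shared)
            shared = scheduler.utility
            assignments.append((scheduler.agent_id, job.job_id, target))
        return assignments

Note that the communication cost per scheduling step is a single table hand-off, independent of the number of agents, which is what limits the overhead as the system scales.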

The remainder of this paper is organized as follows. Section 2 introduces a general model for job scheduling in Grid computing and discusses the performance measures. Section 3 discusses the basic ideas of multi-agent reinforcement learning and presents the OSL method for Grid job scheduling. Section 4 evaluates and compares different job scheduling methods in a simulated Grid computing environment; the results illustrate the effectiveness of the proposed method. Section 5 gives a further overview of related work. Finally, conclusions are drawn in Section 6.

Section snippets

A general job scheduling model in Grids

It is well known that the complexity of a general centralized scheduling problem is NP-Complete [28]. Due to this NP-Complete nature and the difficulty of proving the optimality of scheduling algorithms in Grid scenarios, current research generally seeks suboptimal solutions. Moreover, in this paper, to address the scalability problem, we consider a strategy in which decentralized schedulers, rather than a centralized scheduler, take charge of job scheduling simultaneously. To describe the
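Although the snippet above breaks off, a generic formulation consistent with such a model can be stated as follows; the notation is illustrative rather than the paper's own. Given jobs $J = \{j_1, \dots, j_m\}$ with computational lengths $l_i$ and resources $R = \{r_1, \dots, r_n\}$ with processing speeds $s_k$, an assignment $a : J \to R$ induces per-resource completion times and a makespan

$$T_k = \frac{1}{s_k} \sum_{i:\, a(j_i) = r_k} l_i, \qquad \mathrm{makespan}(a) = \max_{1 \le k \le n} T_k,$$

and load balancing amounts to choosing $a$ so that the $T_k$ are as nearly equal as possible, which also tends to minimize the makespan.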

The OSL method for adaptive job scheduling

As mentioned above, in practical large-scale Grid applications, even with the help of the GIS, the information about resources held by the schedulers is time-delayed and potentially inaccurate. It is therefore reasonable to develop a robust scheduling algorithm that does not depend on an accurate model. To satisfy the requirements of adaptive job scheduling, a coordinated multi-agent reinforcement learning method may be an appropriate solution. In the following, after an analysis of different MARL

Performance evaluation and discussions

In this section, the performance of the OSL-based Selection (OSLS) rule for job scheduling is evaluated and analyzed in simulations. In addition, the proposed OSLS method is compared with four other resource scheduling or selection rules: Decentralized Min–Min Selection (DMMS) [38], Random Selection (RS), Least Load Selection (LLS), and Simple Learning Selection (SLS) [5]. The Min–Min algorithm is a heuristic scheduling method that has become a benchmark scheduling algorithm for
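For orientation, the two simplest of these baselines can be sketched in a few lines of Python; both functions are illustrative reconstructions under the obvious assumptions, not code from the paper.

    import random

    def random_selection(resources):
        # RS: pick a resource uniformly at random, ignoring load information.
        return random.choice(resources)

    def least_load_selection(resources, load_estimate):
        # LLS: pick the resource with the smallest (possibly stale) load
        # estimate; load_estimate maps each resource to its reported load.
        return min(resources, key=lambda r: load_estimate[r])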

Related work

Several other works relate to the RL-based job scheduling problem in Grids. In [5], the SLS method was adopted for Grid job scheduling. However, the experimental results above show that the SLS method performs well only in some special cases, when the number of users is much larger than the number of resources. Moreover, its performance still needs to be improved.

In [6], the authors introduced a new gradient ascent learning algorithm named Weighted Policy Learner (WPL) for the

Conclusions

One of the key concerns of Grid computing is to develop autonomic computing systems that have the abilities of self-configuration and self-optimization in dynamic environments. In this paper, the OSL method based on multi-agent reinforcement learning is proposed to solve the job scheduling problem in Grids. This approach circumvents the scalability problem by using a distributed learning strategy, and achieves multi-agent coordination based on an ordinal information-sharing mechanism. Finally,

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grants 60774076 and 61075072, the Fok Ying Tung Youth Teacher Foundation under Grant 114005, and the Natural Science Foundation of Hunan Province under Grant 07JJ3122. We also thank the anonymous reviewers for their comments and recommendations, which have been crucial to improving the quality of this work.

References (40)

  • E. Cantillon et al., Auctioning bus routes: the London experience.
  • P. Gradwell et al., Markets vs. auctions: approaches to distributed combinatorial resource scheduling, Multiagent and Grid Systems (2005).
  • A. Galstyan et al., Resource allocation in the grid with learning agents, Journal of Grid Computing (2005).
  • S. Abdallah, V. Lesser, Learning the task allocation game, in: Proceedings of the Fifth AAMAS, Japan, May 8–12, 2006, ...
  • F.P. Dong, S.G. Akl, Scheduling algorithms for grid computing: state of the art and open problems, Technical Report No. ...
  • D. Thain et al., Distributed computing in practice: the Condor experience, Concurrency and Computation: Practice and Experience (2005).
  • Portable Batch System, 2009. ...
  • Sun Grid Engine, 2009. ...
  • K. Krauter et al., A taxonomy and survey of grid resource management systems for distributed computing, Software: Practice and Experience (2002).

Jun Wu received the B.Sc. and M.Sc. degrees in electrical engineering from the National University of Defense Technology, Changsha, China, in 2002 and 2005, respectively. He is currently working toward the Ph.D. degree at the Institute of Automation, National University of Defense Technology, China. His current research interests include reinforcement learning, autonomous agents and multi-agent systems, especially resource allocation and multi-robot control.

Xin Xu received the B.S. degree in control engineering from the Department of Automatic Control, National University of Defense Technology (NUDT), Changsha, PR China, in 1996, and the Ph.D. degree in electrical engineering from the College of Mechatronics and Automation (CMEA), NUDT. From 2003 to 2004, he was a Postdoctoral Fellow at the School of Computer, NUDT. In August 2006 and from September to October 2007, he was a visiting scholar for cooperative research at the Hong Kong Polytechnic University, Hong Kong, China, and the University of Strathclyde, UK, respectively. Currently, he is an Associate Professor at the Institute of Automation, CMEA, NUDT.

He has coauthored four books and published more than 50 papers in international journals and conferences, including IEEE Transactions on Neural Networks and the Journal of AI Research. His research interests include reinforcement learning, data mining, learning control, robotics, autonomic computing, and computer security.

Dr. Xu received the Excellent Ph.D. Dissertation Award from Hunan Province, PR China, in 2004 and the Fok Ying Tung Youth Teacher Fund of China in 2008. He has served as a PC member or session chair at many international conferences, and he is currently a reviewer for several journals, including several IEEE Transactions. He has been a grant reviewer for the National Natural Science Foundation of China since 2005.
