Fair scheduling of bag-of-tasks applications on large-scale platforms
Introduction
It is common for a distributed computing platform to be shared among several users, for instance, a cluster serving several researchers of an academic institution, or a commercial cloud infrastructure handling millions of requests from around the world. All of them would like to obtain a fair share of the platform. However, the most common scheduling policies are incompatible with the fairness objective: they unbalance the share of the platform among users to maximize the global throughput, minimize the makespan or satisfy the negotiated SLA terms. So, it is the scheduling policy itself that must enforce the fair sharing of the platform among its users. While several such policies have been proposed [1], [2], [3], [4], [5], [6], [7], [8], [9], they have serious scalability limitations: they are usually implemented with a centralized design that relies on full knowledge of the platform and the workload. This prevents them from managing the scheduling of tasks on systems of thousands, or even millions, of nodes, a scale that is becoming more common every day.
In this paper we present the Fair Share Policy (FSP), a scheduling policy that allocates bag-of-tasks (BoT) applications with fairness in mind. It is a policy for STaRS [10], a scheduling model that can be implemented as part of a distributed computing platform. STaRS provides scalability, fault-tolerance and the ability to support different scheduling policies. It is based on decentralized algorithms that eliminate the bottlenecks of a centralized design, and it is best suited for environments with millions of nodes. So, throughout the paper we assume that we deal with a very large platform and no centralized scheduler.
To measure the share of the platform, we consider the amount of computation that each user wants to get done. In this case, the most suited metric seems to be the maximum stretch, or slowdown [11], [2]. The stretch of an application is defined as the ratio of its response time under the concurrent scheduling of applications to its response time when it is the only application executed on the platform. It is the user's perception of how slow its applications run due to sharing the platform with other users. Let r be the release time of an application, e its end time and t* its response time on a platform dedicated to itself; its stretch is calculated as S = (e − r)/t* (Eq. (1)). A perfectly fair share of the platform is obtained when all the applications obtain the same stretch. However, this is only possible with offline scheduling and divisible load. With these conditions, the scheduler can adjust the end time of each application to reach the optimum objective. Instead, we consider a classic configuration in distributed computing, with online scheduling and atomic tasks. So, the best tradeoff is obtained by minimizing the maximum stretch among all applications.
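The stretch definition above can be made concrete with a minimal sketch (the function and variable names are ours, chosen to mirror the symbols r, e and t*):

```python
def stretch(release_time, end_time, dedicated_time):
    """Stretch S = (e - r) / t*: the response time observed under sharing,
    divided by the response time on a platform dedicated to the application."""
    return (end_time - release_time) / dedicated_time

# An application released at t=0 that ends at t=30, but would have taken
# 10 time units alone, experiences a stretch (slowdown) of 3.
print(stretch(0.0, 30.0, 10.0))  # -> 3.0
```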
We already presented a first design of this policy in [12]. In this paper, we substantially improve the estimation and representation of the stretch, obtaining much better results. In particular, the main problem we face is that computing the stretch usually requires full knowledge of the platform. This is impractical with a decentralized design, so we assume two reasonable hypotheses:
- Each application has far fewer tasks than there are nodes in the platform.
- The distribution of computing power among nodes changes very little.
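These two hypotheses make the dedicated response time tractable without full knowledge: with far fewer tasks than nodes, each task can run alone on one of the fastest nodes, and a slowly-changing power distribution can be summarized once and reused. A minimal sketch under these assumptions (names and interface are ours, not the paper's):

```python
def dedicated_response_time(task_length, num_tasks, node_speeds):
    """Approximate an application's response time on a dedicated platform.
    With far fewer tasks than nodes (first hypothesis), every task runs
    alone on one of the num_tasks fastest nodes, so the application ends
    when the task on the slowest of those nodes finishes.
    task_length is in millions of FLOPs, node_speeds in MFLOPS."""
    fastest = sorted(node_speeds, reverse=True)[:num_tasks]
    return task_length / fastest[-1]   # the slowest of the chosen nodes

speeds = [100.0, 80.0, 60.0, 40.0, 20.0]
print(dedicated_response_time(1000.0, 3, speeds))  # 1000/60, about 16.67
```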
The rest of the paper is organized as follows: Section 2 presents the related work on fair scheduling, both in centralized and decentralized environments. Then, Section 3 explains how we solve the problem of minimizing the maximum stretch without full knowledge of the platform. Section 4 gives a brief description of the architecture of STaRS, on which this paper is based. Sections 5 and 6 explain the details of the FSP policy, its local and global parts respectively. Finally, Section 7 presents the results of the experiments, and Section 8 gives our conclusions and a description of future work.
Related work
Several works have approached the fair scheduling of different kinds of applications by minimizing the stretch. Benoit et al. [1] study the minimization of the maximum stretch for concurrent BoT applications, as we do, but in a centralized setting. They show that interleaving tasks of several concurrent BoT applications performs better than scheduling each application after the other. Previously, Legrand et al. [2] focused on the scheduling of divisible load applications. In particular, they
Fairness in a decentralized BoT environment
There are several issues we have to deal with if we want to achieve fairness among BoT applications in a decentralized environment. We consider that a BoT application consists of tasks of equal length, measured in millions of FLOPs. This is a common model [1], [21], although BoT applications with variable-length tasks [33] will be tackled in the future. As stated by Eq. (1), to compute the stretch of an application we need to calculate its response time as if it were alone on the platform.
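Since a decentralized scheduler cannot enumerate every node, the second hypothesis lets it estimate this dedicated response time from a compact summary of the speed distribution. A hedged sketch (the histogram representation is our illustration, not necessarily the paper's exact encoding):

```python
def response_time_from_histogram(task_length, num_tasks, bins):
    """Estimate the dedicated response time from an aggregated histogram
    of node speeds instead of the full node list. bins is a list of
    (speed, node_count) pairs sorted by decreasing speed: walk down from
    the fastest bin until num_tasks nodes are covered; the speed of the
    last bin reached bounds the slowest task."""
    remaining = num_tasks
    for speed, count in bins:
        remaining -= count
        if remaining <= 0:
            return task_length / speed
    raise ValueError("fewer nodes than tasks")

bins = [(100.0, 2), (60.0, 3), (20.0, 5)]
print(response_time_from_histogram(1000.0, 4, bins))  # 1000/60
```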
The architecture of STaRS
As we said before, in this paper we present a scheduling policy for STaRS, with fairness as its objective. STaRS is an online distributed scheduling model. It provides a set of tools and concepts to allocate different applications to a heterogeneous set of nodes in a decentralized way. With its design, it simultaneously provides the scalability, fault-tolerance and versatility that most distributed computing platforms lack. Here, we summarize its characteristics, but we encourage the
FSP local policy
Each policy in STaRS has two parts: a local and a global one. The local part of the FSP policy specifies the order in which tasks should be run by an execution node, so that the maximum slowness is minimized. However, at each execution node, there is only information about the applications with one or more tasks in the queue. So, its objective is to minimize the maximum slowness among these applications. Moreover, in order to estimate the eventual slowness of an application, it must calculate
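The local objective can be illustrated with a brute-force sketch, not the paper's actual algorithm: given the applications queued at one node, pick the ordering that minimizes the eventual maximum slowness (all names and the queue representation are our assumptions):

```python
import itertools

def max_slowness(order, now, speed, apps):
    """Simulate running each application's queued tasks in 'order' on one
    node and return the maximum resulting slowness.
    apps maps name -> (release_time, queued_tasks, task_length, dedicated_time);
    speed is the node's power in task-length units per second."""
    t, worst = now, 0.0
    for name in order:
        release, tasks, length, dedicated = apps[name]
        t += tasks * length / speed              # finish this app's queued tasks
        worst = max(worst, (t - release) / dedicated)
    return worst

def best_order(now, speed, apps):
    # Brute force over orderings: acceptable for the handful of
    # applications queued at a single execution node.
    return min(itertools.permutations(apps),
               key=lambda order: max_slowness(order, now, speed, apps))

apps = {"a": (0.0, 2, 100.0, 10.0),   # tight app: short dedicated time
        "b": (0.0, 1, 100.0, 50.0)}   # relaxed app: long dedicated time
print(best_order(0.0, 10.0, apps))    # the tight app goes first: ('a', 'b')
```

Running the tight application first raises the relaxed application's slowness only marginally, while the reverse order triples the tight application's slowness.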
FSP global policy
The global part of the FSP policy must minimize the maximum slowness among all the applications currently allocated to any node. This can be achieved by minimizing the maximum slowness among all the execution nodes. It does not matter if the same application sets different maximums in different execution nodes, because we only care about the highest value. So, the global policy provides the mechanisms to route tasks towards the nodes where they will make the maximum slowness increase the least.
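The routing criterion reduces to a simple rule, sketched below with a hypothetical interface (node names and the estimate map are our illustration): each candidate reports the maximum slowness it would reach after accepting the task, and the task goes where that value is lowest.

```python
def route_task(estimates):
    """estimates maps each candidate node (or tree branch) to the maximum
    slowness it predicts it would reach after accepting the task.
    Routing picks the candidate that keeps the global maximum lowest."""
    return min(estimates, key=estimates.get)

print(route_task({"n1": 2.5, "n2": 1.8, "n3": 3.0}))  # -> n2
```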
Experimentation
As we did in [10] for other policies, we have measured the scalability and performance of the FSP policy through a set of tests and simulations. We first developed tests to evaluate the accuracy of the aggregation scheme. They are run by a specific evaluation program that aggregates the information of a set of nodes in the same way as would be done in the tree.
Then, we have also observed our model under more realistic conditions with an ad-hoc discrete event simulator (DES).
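Such an accuracy test can be pictured as merging per-node speed summaries bottom-up and comparing the result against the exact distribution. A hedged sketch of the merge step, where the bin-merging rule (collapse the two closest speeds into the slower one, so a node's power is never over-estimated) is our assumption, not necessarily the paper's exact scheme:

```python
from collections import Counter

def aggregate(child_summaries, max_bins=4):
    """Merge children's speed histograms into one bounded-size summary,
    as a node of the aggregation tree would. max_bins is a hypothetical
    resolution limit that keeps summaries small as they travel up."""
    merged = Counter()
    for summary in child_summaries:
        merged.update(summary)
    while len(merged) > max_bins:
        speeds = sorted(merged)
        # find the two adjacent bins with the smallest speed gap
        _, i = min((speeds[j + 1] - speeds[j], j) for j in range(len(speeds) - 1))
        merged[speeds[i]] += merged.pop(speeds[i + 1])  # keep the slower bin
    return merged

children = [Counter({100.0: 1, 90.0: 1}), Counter({60.0: 2, 20.0: 1})]
print(aggregate(children, max_bins=3))  # the 100 MFLOPS bin collapses into 90
```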
Conclusions and future work
In this paper, we propose a decentralized model for a scheduling policy whose objective is fairness. It tries to provide a similar share of the platform to every submitted application. To measure this share, we consider the amount of computation that each user wants to get done. In this case, the most suited metric seems to be the stretch. So, our policy tries to minimize the maximum stretch among all the applications.
Our contribution is triple. First, we propose a method to compute the stretch
Acknowledgment
The research work presented in this paper has been supported by the Spanish Ministry of Economy under the program “Programa de I+D+i Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad”, project id TIN2013-40809-R; and by COSMOS and GISED, research groups recognized by the Aragonese Government.
References (42)
- et al., A framework for providing hard delay guarantees and user fairness in grid computing, Future Gener. Comput. Syst. (2009)
- et al., A task routing approach to large-scale scheduling, Future Gener. Comput. Syst. (2013)
- et al., Decentralized scalable fairshare scheduling, Future Gener. Comput. Syst. (2013)
- et al., Fair scheduling of bag-of-tasks applications using distributed Lagrangian optimization, J. Parallel Distrib. Comput. (2014)
- et al., Scheduling concurrent bag-of-tasks applications on heterogeneous platforms, IEEE Trans. Comput. (2010)
- et al., Minimizing the stretch when scheduling flows of biological requests
- et al., Minimizing stretch and makespan of multiple parallel task graphs via malleable allocations
- J. Emeras, V. Pinheiro, K. Rzadca, D. Trystram, OStrich: fair scheduling for multiple submissions, in: Proceedings of...
- et al., Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling
- et al., Choosy: max–min fair sharing for datacenter jobs with constraints
- No justified complaints: on fair sharing of multiple resources
- Multiresource allocation: fairness-efficiency tradeoffs in a unifying framework, IEEE/ACM Trans. Netw.
- Online scheduling to minimize average stretch
- A fair decentralized scheduler for bag-of-tasks applications on desktop grids
- Stretch-optimal scheduling for on-demand data broadcasts
- MapReduce: simplified data processing on large clusters, Commun. ACM
- Dryad: distributed data-parallel programs from sequential building blocks
- Quincy: fair scheduling for distributed computing clusters
- A common substrate for cluster computing
- Dominant resource fairness: fair allocation of multiple resource types
Javier Celaya received his Ph.D. and M.Sc. degrees in computer science from the University of Zaragoza, Spain, in 2013 and 2005. His research interests include distributed systems and applications, grid and cloud computing, computer networks and discrete simulation.
Unai Arronategui received his Ph.D. and M.Sc. degrees in computer science from the Paul Sabatier University, Toulouse, France, in 1992 and 1988, respectively. Since 2000, he has been an associate professor at the University of Zaragoza, Spain. His research interests are in distributed systems and computer networks.