Fair scheduling of bag-of-tasks applications on large-scale platforms

https://doi.org/10.1016/j.future.2015.03.002

Highlights

  • We present a scheduling model for fair resource sharing on large-scale platforms.

  • It effectively aggregates information about application stretch.

  • Task allocation is performed so that the maximum stretch is minimized.

  • Our model performs similarly to a centralized implementation.

  • The management overhead is bounded.

Abstract

Users of distributed computing platforms want to obtain a fair share of the resources they use. With respect to the amount of computation, the most suitable measure of fairness is the stretch. It describes the slowdown that applications suffer from being executed on a shared platform, in contrast to being executed alone. In this paper, we present a decentralized scheduling policy that minimizes the maximum stretch among user-submitted applications. With two reasonable assumptions, which can be deduced from existing system traces, we are able to minimize the stretch using only local information. In this way, we avoid a centralized design and provide scalability and fault tolerance. As a result, our policy performs just 11% worse than a centralized implementation, and largely outperforms other common policies. Additionally, it easily scales to hundreds of thousands of nodes, and we expect it to scale to millions with minimal overhead. Finally, we also show that preemption is crucial to providing fairness in all cases.

Introduction

It is common for a distributed computing platform to be shared among several users, for instance, a cluster giving service to several researchers of an academic institution, or a commercial cloud infrastructure attending millions of requests from around the world. All of them would like to obtain a fair share of the platform. However, the most common scheduling policies are incompatible with the fairness objective. They unbalance the share of the platform among users to maximize the global throughput, minimize the makespan or satisfy the negotiated SLA terms. So, it is the scheduling policy itself that must enforce the fair sharing of the platform among its users. While several such policies have been proposed [1], [2], [3], [4], [5], [6], [7], [8], [9], they have serious scalability limitations. They are usually implemented with a centralized design that relies on full knowledge of the platform and the workload. This prevents them from managing the scheduling of tasks on systems of thousands, or even millions, of nodes, a scale that is becoming more common every day.

In this paper we present the Fair Share Policy (FSP), a scheduling policy that allocates bag-of-tasks (BoT) applications with fairness in mind. It is a policy for STaRS  [10], a scheduling model that can be implemented as part of a distributed computing platform. It provides scalability, fault-tolerance and the ability to support different scheduling policies. It is based on decentralized algorithms that eliminate the bottlenecks of a centralized design, and it is best suited for environments with millions of nodes. So, throughout the paper we assume that we deal with a very large platform and no centralized scheduler.

To measure the share of the platform, we consider the amount of computation that each user wants to get done. In this case, the most suitable metric seems to be the maximum stretch, or slowdown [11], [2]. The stretch of an application is defined as the ratio of its response time under the concurrent scheduling of applications to its response time when it is the only application executed on the platform. It is the user's perception of how slowly their applications run due to sharing the platform with other users. Let ri be the release time of an application Ai, ei its end time, and Θi its response time on a platform dedicated to itself; its stretch Si is then Si = (ei − ri) / Θi. A perfectly fair share of the platform is obtained when all the applications obtain the same stretch. However, this is only possible with offline scheduling and divisible load. Under these conditions, the scheduler can adjust the end time of each application to reach the optimal objective. Instead, we consider a classic configuration in distributed computing, with online scheduling and atomic tasks. In this setting, the best tradeoff is obtained by minimizing the maximum stretch among all applications.
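To make the metric concrete, the stretch of an application and the max-stretch objective can be sketched in a few lines of Python (the function names are hypothetical; times and Θi are assumed to be in the same units):

```python
def stretch(release_time, end_time, dedicated_response_time):
    """Stretch Si = (ei - ri) / Theta_i: the slowdown an application
    suffers on the shared platform relative to running alone."""
    return (end_time - release_time) / dedicated_response_time

# An application released at t=0 that finishes at t=30 on the shared
# platform, but would take 10 time units alone, has stretch 3.0.
assert stretch(0, 30, 10) == 3.0

def max_stretch(apps):
    """The online objective: the maximum stretch over all applications,
    which the scheduler tries to minimize. Each application is given as
    a (release, end, dedicated_response) tuple."""
    return max(stretch(r, e, theta) for r, e, theta in apps)
```

Since end times depend on the schedule, an online policy can only influence this value through the order and placement of tasks, which is what the FSP policy does.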

We already presented a first design of this policy in [12]. In this paper, we substantially improve the estimation and representation of the stretch, to obtain much better results. In particular, the main problem we face is that computing Θi usually requires full knowledge of the platform. This is impractical with a decentralized design, so we make two reasonable hypotheses:

  • Each application has far fewer tasks than there are nodes in the platform.

  • The distribution of computing power among nodes changes very little.

With these premises, we are able to minimize the maximum stretch without full knowledge of the platform.

The rest of the paper is organized as follows: Section 2 presents the related work on fair scheduling, both in centralized and decentralized environments. Then, Section 3 explains how we solve the problem of minimizing the maximum stretch without full knowledge of the platform. Section 4 gives a brief description of the architecture of STaRS, on which this paper is based. Sections 5 and 6 explain the details of the FSP policy, its local and global parts. Finally, Section 7 presents the results of the experiments, and Section 8 gives our conclusions and a description of future work.


Related work

Several works have approached the fair scheduling of different kinds of applications by minimizing the stretch. Benoit et al. [1] study the minimization of the maximum stretch for concurrent BoT applications, as we do, but in a centralized setting. They show that interleaving the tasks of several concurrent BoT applications performs better than scheduling each application after the other. Previously, Legrand et al. [2] focused on the scheduling of divisible load applications. In particular, they

Fairness in a decentralized BoT environment

There are several issues we have to deal with if we want to achieve fairness among BoT applications in a decentralized environment. We consider that a BoT application Ai consists of ni tasks of equal length ai, measured in millions of FLOPs. This is a common model [1], [21], although BoT applications with variable-length tasks [33] will be tackled in the future. As stated by Eq. (1), to compute the stretch of an application Ai we need to calculate its response time if it were alone in the platform, Θi.
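Under the first hypothesis (far fewer tasks than nodes), Θi can be estimated without full platform knowledge: if every task of Ai runs alone on one of the ni fastest available nodes, the dedicated response time is set by the slowest of those nodes. The Python sketch below illustrates this reading; it is an assumption for illustration, not the paper's exact estimator, and all names are hypothetical:

```python
def dedicated_response_time(task_length_mflops, num_tasks, node_speeds_mflops):
    """Estimate Theta_i assuming num_tasks <= number of nodes, so each
    task runs alone on one of the num_tasks fastest nodes. The slowest
    of those nodes then determines when the application finishes.
    Illustrative sketch only; speeds are in MFLOPS, lengths in MFLOPs."""
    fastest = sorted(node_speeds_mflops, reverse=True)[:num_tasks]
    return task_length_mflops / min(fastest)

# 4 equal tasks of 1000 MFLOPs on nodes of 500, 400, 250, 200 and 100
# MFLOPS: the 4 fastest nodes are used; the 200-MFLOPS node finishes last.
assert dedicated_response_time(1000, 4, [500, 400, 250, 200, 100]) == 5.0
```

The second hypothesis (the distribution of computing power changes very little) is what makes such an estimate usable with only an aggregated summary of node speeds instead of the full list.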

The architecture of STaRS

As we said before, in this paper we present a scheduling policy for STaRS, with fairness as its objective. STaRS is an online distributed scheduling model. It provides a set of tools and concepts to allocate different applications to a heterogeneous set of nodes in a decentralized way. With its design, it simultaneously provides the scalability, fault-tolerance and versatility that most distributed computing platforms lack. Here, we summarize its characteristics, but we encourage the

FSP local policy

Each policy in STaRS has two parts: a local and a global one. The local part of the FSP policy specifies the order in which tasks should be run by an execution node, so that the maximum slowness is minimized. However, at each execution node, there is only information about the applications with one or more tasks in the queue. So, its objective is to minimize the maximum slowness among these applications. Moreover, in order to estimate the eventual slowness of an application, it must calculate
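As an illustration of the local objective, the toy Python sketch below simulates a node's queue under a given task ordering and brute-forces the ordering with the minimum maximum slowness. The actual local policy uses an efficient rule rather than enumeration; all names here are hypothetical:

```python
from itertools import permutations

def max_slowness(order, now, speed):
    """Simulate running the queued tasks in 'order' on one node and
    return the maximum slowness (stretch) reached by their applications.
    Each task is a (release_time, length, theta) tuple for its
    application; 'speed' is the node's computing power."""
    t, worst = now, 0.0
    for release, length, theta in order:
        t += length / speed
        worst = max(worst, (t - release) / theta)
    return worst

def best_order(tasks, now=0.0, speed=1.0):
    """Brute-force the ordering that minimizes the maximum slowness.
    Only feasible for tiny queues; it illustrates the objective, not
    the policy's actual ordering rule."""
    return min(permutations(tasks), key=lambda o: max_slowness(o, now, speed))
```

For example, with two queued tasks, one from a long application released at t=0 and one from a short application released at t=5, evaluating both orders at now=6 shows that running the short task first yields the lower maximum slowness.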

FSP global policy

The global part of the FSP policy must minimize the maximum slowness among all the applications currently allocated to any node. This can be achieved by minimizing the maximum slowness among all the execution nodes. It does not matter if the same application sets different maximums in different execution nodes, because we only care about the highest value. So, the global policy provides the mechanisms to route tasks towards the nodes where they will make the maximum slowness increase the least.
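The routing criterion can be illustrated with a small Python sketch that scans candidate nodes and picks the one where appending the task raises the maximum slowness the least. In the real model this information is aggregated through a tree rather than scanned centrally, and all names are hypothetical:

```python
def route_task(task, nodes, now=0.0):
    """Pick the index of the execution node where appending 'task'
    yields the smallest resulting maximum slowness. Each node is a
    (speed, queue) pair, where queue is a list of (release, length,
    theta) tasks executed in order. Illustrative sketch only."""
    def resulting_max(node):
        speed, queue = node
        t, worst = now, 0.0
        for release, length, theta in queue + [task]:
            t += length / speed
            worst = max(worst, (t - release) / theta)
        return worst
    return min(range(len(nodes)), key=lambda i: resulting_max(nodes[i]))

# A busy node and an idle node of equal speed: the idle node yields
# the lower resulting maximum slowness, so the task is routed there.
assert route_task((0, 5, 5), [(1.0, [(0, 10, 10)]), (1.0, [])]) == 1
```

Note that the criterion compares the resulting maxima, not per-node load: a fast, busy node may still win over a slow, idle one if it raises the maximum slowness less.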

Experimentation

As we did in [10] for other policies, we have measured the scalability and performance of the FSP policy through a set of tests and simulations. We first developed tests to evaluate the accuracy of the aggregation scheme. They are run by a specific evaluation program that aggregates the information of a set of nodes in the same way that would be done in the tree.

Then, we have also observed our model under more realistic conditions with an ad-hoc discrete event simulator (DES).

Conclusions and future work

In this paper, we propose a decentralized model for a scheduling policy whose objective is fairness. It tries to provide a similar share of the platform to every submitted application. To measure this share, we consider the amount of computation that each user wants to get done. In this case, the most suited metric seems to be the stretch. So, our policy tries to minimize the maximum stretch among all the applications.

Our contribution is triple. First, we propose a method to compute the stretch

Acknowledgment

The research work presented in this paper has been supported by the Spanish Ministry of Economy under the program “Programa de I+D+i Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad”, project id TIN2013-40809-R; and by COSMOS and GISED, research groups recognized by the Aragonese Government.

Javier Celaya received his Ph.D. and M.Sc. degrees in computer science from the University of Zaragoza, Spain, in 2013 and 2005. His research interests include distributed systems and applications, grid and cloud computing, computer networks and discrete simulation.

References (42)

  • D. Dolev et al.

    No justified complaints: on fair sharing of multiple resources

  • C. Joe-Wong et al.

    Multiresource allocation: fairness-efficiency tradeoffs in a unifying framework

    IEEE/ACM Trans. Netw.

    (2013)
  • S. Muthukrishnan et al.

    Online scheduling to minimize average stretch

  • J. Celaya et al.

    A fair decentralized scheduler for bag-of-tasks applications on desktop grids

  • Y. Wu et al.

    Stretch-optimal scheduling for on-demand data broadcasts

  • J. Dean et al.

    MapReduce: simplified data processing on large clusters

    Commun. ACM

    (2008)
  • H.P.M. Committee, Apache Hadoop website, March 2014.
  • M. Isard et al.

    Dryad: distributed data-parallel programs from sequential building blocks

  • M. Isard et al.

    Quincy: fair scheduling for distributed computing clusters

  • B. Hindman et al.

    A common substrate for cluster computing

  • A. Ghodsi et al.

    Dominant resource fairness: fair allocation of multiple resource types



    Unai Arronategui received his Ph.D. and M.Sc. degrees in computer science from the Paul Sabatier University, Toulouse, France, in 1992 and 1988, respectively. Since 2000, he has been an associate professor at the University of Zaragoza, Spain. His research interests are in distributed systems and computer networks.
