Fair scheduling of bag-of-tasks applications on large-scale platforms

https://doi.org/10.1016/j.future.2015.03.002

Highlights

  • We present a scheduling model for fair resource sharing on large-scale platforms.

  • It effectively aggregates information about application stretch.

  • Task allocation is performed so that the maximum stretch is minimized.

  • Our model performs similarly to a centralized implementation.

  • The management overhead is bounded.

Abstract

Users of distributed computing platforms want to obtain a fair share of the resources they use. With respect to the amount of computation, the most suitable measure of fairness is the stretch. It describes the slowdown that applications suffer from being executed on a shared platform, in contrast to being executed alone. In this paper, we present a decentralized scheduling policy that minimizes the maximum stretch among user-submitted applications. With two reasonable assumptions, which can be deduced from existing system traces, we are able to minimize the stretch using only local information. In this way, we avoid a centralized design and provide scalability and fault tolerance. As a result, our policy performs just 11% worse than a centralized implementation, and largely outperforms other common policies. Additionally, it easily scales to hundreds of thousands of nodes, and we expect it to scale to millions with minimal overhead. Finally, we also show that preemption is crucial to providing fairness in all cases.

Introduction

It is common for a distributed computing platform to be shared among several users, for instance, a cluster giving service to several researchers of an academic institution, or a commercial cloud infrastructure attending millions of requests from around the world. All of them would like to obtain a fair share of the platform. However, the most common scheduling policies are incompatible with the fairness objective. They unbalance the share of the platform among users to maximize the global throughput, minimize the makespan or satisfy the negotiated SLA terms. So, it is the scheduling policy itself that must enforce the fair sharing of the platform among its users. While several such policies have been proposed [1], [2], [3], [4], [5], [6], [7], [8], [9], they have serious scalability limitations. They are usually implemented with a centralized design that relies on full knowledge of the platform and the workload. This prevents them from managing the scheduling of tasks on systems of thousands, or even millions, of nodes, a scale that is becoming more common every day.

In this paper we present the Fair Share Policy (FSP), a scheduling policy that allocates bag-of-tasks (BoT) applications with fairness in mind. It is a policy for STaRS  [10], a scheduling model that can be implemented as part of a distributed computing platform. It provides scalability, fault-tolerance and the ability to support different scheduling policies. It is based on decentralized algorithms that eliminate the bottlenecks of a centralized design, and it is best suited for environments with millions of nodes. So, throughout the paper we assume that we deal with a very large platform and no centralized scheduler.

To measure the share of the platform, we consider the amount of computation that each user wants to get done. In this case, the most suitable metric seems to be the maximum stretch, or slowdown [11], [2]. The stretch of an application is defined as the ratio of its response time under the concurrent scheduling of applications to its response time when it is the only application executed on the platform. It is the user's perception of how slowly their applications run due to sharing the platform with other users. Let ri be the release time of an application Ai, ei its end time, and Θi its response time on a platform dedicated to itself; its stretch Si is then Si = (ei − ri) / Θi. A perfectly fair share of the platform is obtained when all the applications obtain the same stretch. However, this is only possible with offline scheduling and divisible load. Under these conditions, the scheduler can adjust the end time of each application to reach the optimal objective. Instead, we consider a classic configuration in distributed computing, with online scheduling and atomic tasks. In this setting, the best tradeoff is obtained by minimizing the maximum stretch among all applications.
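To make the metric concrete, the stretch of an application and the max-stretch objective can be sketched in a few lines of Python (the function names are hypothetical; times and Θi are assumed to be in the same units):

```python
def stretch(release_time, end_time, dedicated_response_time):
    """Stretch Si = (ei - ri) / Theta_i: the slowdown an application
    suffers on the shared platform relative to running alone."""
    return (end_time - release_time) / dedicated_response_time

# An application released at t=0 that finishes at t=30 on the shared
# platform, but would take 10 time units alone, has stretch 3.0.
assert stretch(0, 30, 10) == 3.0

def max_stretch(apps):
    """The online objective: the maximum stretch over all applications,
    which the scheduler tries to minimize. Each application is given as
    a (release, end, dedicated_response) tuple."""
    return max(stretch(r, e, theta) for r, e, theta in apps)
```

Since end times depend on the schedule, an online policy can only influence this value through the order and placement of tasks, which is what the FSP policy does.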

We already presented a first design of this policy in [12]. In this paper, we substantially improve the estimation and representation of the stretch, to obtain much better results. In particular, the main problem we face is that computing Θi usually requires full knowledge of the platform. This is impractical with a decentralized design, so we make two reasonable hypotheses:

  • Each application has far fewer tasks than there are nodes in the platform.

  • The distribution of computing power among nodes changes very little.

With these premises, we are able to minimize the maximum stretch without full knowledge of the platform.

The rest of the paper is organized as follows: Section 2 presents the related work on fair scheduling, both in centralized and decentralized environments. Then, Section 3 explains how we solve the problem of minimizing the maximum stretch without full knowledge of the platform. Section 4 gives a brief description of the architecture of STaRS, on which this paper is based. Sections 5 and 6 explain the details of the FSP policy, its local and global parts. Finally, Section 7 presents the results of the experiments, and Section 8 gives our conclusions and a description of future work.


Related work

Several works have approached the fair scheduling of different kinds of applications by minimizing the stretch. Benoit et al. [1] study the minimization of the maximum stretch for concurrent BoT applications, as we do, but in a centralized setting. They show that interleaving the tasks of several concurrent BoT applications performs better than scheduling each application after the other. Previously, Legrand et al. [2] focused on the scheduling of divisible load applications. In particular, they

Fairness in a decentralized BoT environment

There are several issues we have to deal with if we want to achieve fairness among BoT applications in a decentralized environment. We consider that a BoT application Ai consists of ni tasks of equal length ai, measured in millions of FLOPs. This is a common model [1], [21], although BoT applications with variable-length tasks [33] will be tackled in the future. As stated by Eq. (1), to compute the stretch of an application Ai we need to calculate its response time if it were alone in the platform, Θi.
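Under the first hypothesis (far fewer tasks than nodes), Θi can be estimated without full platform knowledge: if every task of Ai runs alone on one of the ni fastest available nodes, the dedicated response time is set by the slowest of those nodes. The Python sketch below illustrates this reading; it is an assumption for illustration, not the paper's exact estimator, and all names are hypothetical:

```python
def dedicated_response_time(task_length_mflops, num_tasks, node_speeds_mflops):
    """Estimate Theta_i assuming num_tasks <= number of nodes, so each
    task runs alone on one of the num_tasks fastest nodes. The slowest
    of those nodes then determines when the application finishes.
    Illustrative sketch only; speeds are in MFLOPS, lengths in MFLOPs."""
    fastest = sorted(node_speeds_mflops, reverse=True)[:num_tasks]
    return task_length_mflops / min(fastest)

# 4 equal tasks of 1000 MFLOPs on nodes of 500, 400, 250, 200 and 100
# MFLOPS: the 4 fastest nodes are used; the 200-MFLOPS node finishes last.
assert dedicated_response_time(1000, 4, [500, 400, 250, 200, 100]) == 5.0
```

The second hypothesis (the distribution of computing power changes very little) is what makes such an estimate usable with only an aggregated summary of node speeds instead of the full list.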

The architecture of STaRS

As we said before, in this paper we present a scheduling policy for STaRS, with fairness as its objective. STaRS is an online distributed scheduling model. It provides a set of tools and concepts to allocate different applications to a heterogeneous set of nodes in a decentralized way. With its design, it simultaneously provides the scalability, fault-tolerance and versatility that most distributed computing platforms lack. Here, we summarize its characteristics, but we encourage the

FSP local policy

Each policy in STaRS has two parts: a local and a global one. The local part of the FSP policy specifies the order in which tasks should be run by an execution node, so that the maximum slowness is minimized. However, at each execution node, there is only information about the applications with one or more tasks in the queue. So, its objective is to minimize the maximum slowness among these applications. Moreover, in order to estimate the eventual slowness of an application, it must calculate
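As an illustration of the local objective, the toy Python sketch below simulates a node's queue under a given task ordering and brute-forces the ordering with the minimum maximum slowness. The actual local policy uses an efficient rule rather than enumeration; all names here are hypothetical:

```python
from itertools import permutations

def max_slowness(order, now, speed):
    """Simulate running the queued tasks in 'order' on one node and
    return the maximum slowness (stretch) reached by their applications.
    Each task is a (release_time, length, theta) tuple for its
    application; 'speed' is the node's computing power."""
    t, worst = now, 0.0
    for release, length, theta in order:
        t += length / speed
        worst = max(worst, (t - release) / theta)
    return worst

def best_order(tasks, now=0.0, speed=1.0):
    """Brute-force the ordering that minimizes the maximum slowness.
    Only feasible for tiny queues; it illustrates the objective, not
    the policy's actual ordering rule."""
    return min(permutations(tasks), key=lambda o: max_slowness(o, now, speed))
```

For example, with two queued tasks, one from a long application released at t=0 and one from a short application released at t=5, evaluating both orders at now=6 shows that running the short task first yields the lower maximum slowness.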

FSP global policy

The global part of the FSP policy must minimize the maximum slowness among all the applications currently allocated to any node. This can be achieved by minimizing the maximum slowness among all the execution nodes. It does not matter if the same application sets different maximums in different execution nodes, because we only care about the highest value. So, the global policy provides the mechanisms to route tasks towards the nodes where they will make the maximum slowness increase the least.
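The routing criterion can be illustrated with a small Python sketch that scans candidate nodes and picks the one where appending the task raises the maximum slowness the least. In the real model this information is aggregated through a tree rather than scanned centrally, and all names are hypothetical:

```python
def route_task(task, nodes, now=0.0):
    """Pick the index of the execution node where appending 'task'
    yields the smallest resulting maximum slowness. Each node is a
    (speed, queue) pair, where queue is a list of (release, length,
    theta) tasks executed in order. Illustrative sketch only."""
    def resulting_max(node):
        speed, queue = node
        t, worst = now, 0.0
        for release, length, theta in queue + [task]:
            t += length / speed
            worst = max(worst, (t - release) / theta)
        return worst
    return min(range(len(nodes)), key=lambda i: resulting_max(nodes[i]))

# A busy node and an idle node of equal speed: the idle node yields
# the lower resulting maximum slowness, so the task is routed there.
assert route_task((0, 5, 5), [(1.0, [(0, 10, 10)]), (1.0, [])]) == 1
```

Note that the criterion compares the resulting maxima, not per-node load: a fast, busy node may still win over a slow, idle one if it raises the maximum slowness less.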

Experimentation

As we did in [10] for other policies, we have measured the scalability and performance of the FSP policy through a set of tests and simulations. We first developed tests to evaluate the accuracy of the aggregation scheme. They are run by a specific evaluation program that aggregates the information of a set of nodes in the same way that would be done in the tree.

Then, we have also observed our model under more realistic conditions with an ad-hoc discrete event simulator (DES).

Conclusions and future work

In this paper, we propose a decentralized model for a scheduling policy whose objective is fairness. It tries to provide a similar share of the platform to every submitted application. To measure this share, we consider the amount of computation that each user wants to get done. In this case, the most suited metric seems to be the stretch. So, our policy tries to minimize the maximum stretch among all the applications.

Our contribution is triple. First, we propose a method to compute the stretch

Acknowledgment

The research work presented in this paper has been supported by the Spanish Ministry of Economy under the program “Programa de I+D+i Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad”, project id TIN2013-40809-R; and by COSMOS and GISED, research groups recognized by the Aragonese Government.

Javier Celaya received his Ph.D. and M.Sc. degrees in computer science from the University of Zaragoza, Spain, in 2013 and 2005. His research interests include distributed systems and applications, grid and cloud computing, computer networks and discrete simulation.

References (42)

  • D. Dolev et al.

    No justified complaints: on fair sharing of multiple resources

  • C. Joe-Wong et al.

    Multiresource allocation: fairness-efficiency tradeoffs in a unifying framework

    IEEE/ACM Trans. Netw.

    (2013)
  • S. Muthukrishnan et al.

    Online scheduling to minimize average stretch

  • J. Celaya et al.

    A fair decentralized scheduler for bag-of-tasks applications on desktop grids

  • Y. Wu et al.

    Stretch-optimal scheduling for on-demand data broadcasts

  • J. Dean et al.

    MapReduce: simplified data processing on large clusters

    Commun. ACM

    (2008)
  • H.P.M. Committee, Apache Hadoop website, March 2014.
  • M. Isard et al.

    Dryad: distributed data-parallel programs from sequential building blocks

  • M. Isard et al.

    Quincy: fair scheduling for distributed computing clusters

  • B. Hindman et al.

    A common substrate for cluster computing

  • A. Ghodsi et al.

    Dominant resource fairness: fair allocation of multiple resource types



    Unai Arronategui received his Ph.D. and M.Sc. degrees in computer science from the Paul Sabatier University, Toulouse, France, in 1992 and 1988, respectively. Since 2000, he has been an associate professor at the University of Zaragoza, Spain. His research interests are in distributed systems and computer networks.
