BoPF: Mitigating the Burstiness-Fairness Tradeoff in Multi-Resource Clusters

Authors:
Tan N. Le, Stony Brook University, Stony Brook, NY, USA
Xiao Sun, Stony Brook University, Stony Brook, NY, USA
Mosharaf Chowdhury, University of Michigan, Ann Arbor, MI, USA
Zhenhua Liu, Stony Brook University, Stony Brook, NY, USA

ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 2, September 2018, pp. 77–78. https://doi.org/10.1145/3305218.3305246

Published: 17 January 2019

Abstract

Even though batch, interactive, and streaming applications all care about performance, their notions of performance differ. For instance, while the average completion time can sufficiently capture the performance of a throughput-sensitive batch-job queue (TQ) [5], interactive sessions and streaming applications form latency-sensitive queues (LQ): each LQ is a sequence of small jobs following an ON-OFF pattern. For these jobs [7], individual completion times or latencies are far more important than the average completion time or the throughput of the LQ.

Indeed, existing "fair" schedulers are inherently unfair to LQ jobs: when LQ jobs are present (ON state), they must share the resources equally with TQ jobs, but when they are absent (OFF state), batch jobs get all the resources. In the long run, TQs receive more resources than their fair shares because today's schedulers, such as Dominant Resource Fairness [4], make instantaneous decisions.
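The long-run imbalance described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the abstract: a single resource with unit capacity, one TQ that is always backlogged, one LQ that uses its full instantaneous share whenever it is ON, and discrete time slots. The function name and parameters are hypothetical.

```python
from typing import Tuple


def long_run_shares(period: int, on_slots: int, horizon: int) -> Tuple[float, float]:
    """Return the (TQ, LQ) fractions of total capacity received over `horizon` slots.

    Assumptions (illustrative only): one resource with capacity 1 per slot,
    one always-backlogged TQ, and one LQ that is ON for the first `on_slots`
    slots of every `period` slots.
    """
    tq_total = 0.0
    lq_total = 0.0
    for t in range(horizon):
        lq_is_on = (t % period) < on_slots
        if lq_is_on:
            # Instantaneous fairness: both active queues split the slot equally.
            tq_total += 0.5
            lq_total += 0.5
        else:
            # The LQ is absent, so the batch queue takes the whole slot.
            tq_total += 1.0
    return tq_total / horizon, lq_total / horizon


tq_share, lq_share = long_run_shares(period=10, on_slots=2, horizon=1000)
print(tq_share, lq_share)  # prints 0.9 0.1
```

Under these assumptions the TQ ends up with 90% of the capacity over the horizon, well above an equal long-term split, even though every individual slot was divided "fairly".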

Clearly, it is impossible to achieve the best response time for LQ jobs under instantaneous fairness. In other words, there is a hard tradeoff between providing instantaneous fairness for TQs and minimizing the response time of LQs. However, instantaneous fairness is not necessary for TQs because average completion time over a relatively long time horizon is their most important metric. This sheds light on the following question: how well can we simultaneously accommodate multiple classes of workloads with performance guarantees, in particular, isolation protection for TQs in terms of long-term fairness and low response times for LQs?

This work serves as our first step in answering the question by designing BoPF: the first multi-resource scheduler that achieves both isolation protection for TQs and response time guarantees for LQs in a strategyproof way. The key idea is "bounded" priority for LQs: as long as an LQ's burst is not so large that it hurts the long-term fair share of TQs and other LQs, the LQ is given higher priority so that its jobs can be completed as quickly as possible.
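One way to picture the "bounded" priority idea is as an admission check on each burst. The sketch below is not the BoPF algorithm itself, whose actual conditions are given in the paper and technical report [1]. It assumes, purely for illustration, that a burst is "not too large" when its demand for every resource stays within an equal long-term share of that resource over the LQ's ON-OFF period. The `Burst` type, its fields, and `gets_priority` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Burst:
    # Resource-time demanded by one burst, one entry per resource type
    # (hypothetical units, e.g. CPU-seconds and GB-seconds).
    demand: Sequence[float]
    # Time between the starts of consecutive bursts of this LQ.
    period: float


def gets_priority(burst: Burst, capacity: Sequence[float], num_queues: int) -> bool:
    """Return True if the burst fits within an equal long-term share.

    Assumed rule (illustrative only): for every resource, the burst may
    consume at most capacity * period / num_queues resource-time, so that
    serving it first does not cut into what the other queues would receive
    over the same period.
    """
    return all(
        d <= c * burst.period / num_queues
        for d, c in zip(burst.demand, capacity)
    )


cluster = (10.0, 10.0)  # per-unit-time capacity of two resources
small = Burst(demand=(20.0, 10.0), period=10.0)
large = Burst(demand=(30.0, 10.0), period=10.0)
print(gets_priority(small, cluster, num_queues=4))  # prints True
print(gets_priority(large, cluster, num_queues=4))  # prints False
```

With four queues, each queue's budget over a 10-unit period is 25 resource-time units per resource, so the first burst would be prioritized and the second would fall back to ordinary sharing under this assumed rule.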

References

[1] BoPF: Mitigating the Burstiness-Fairness Tradeoff in Multi-Resource Clusters -- Technical Report. https://bit.ly/2rBZDjc.
[2] Big-Data-Benchmark-for-Big-Bench. https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench, 2016.
[3] N. Bansal and M. Harchol-Balter. Analysis of SRPT scheduling: Investigating unfairness, volume 29. ACM, 2001.
[4] A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant resource fairness: Fair allocation of multiple resource types. In NSDI, 2011.
[5] R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella. Multi-resource packing for cluster schedulers. In SIGCOMM, 2014.
[6] L. Kleinrock and R. Gail. Queueing systems: Problems and Solutions. Wiley, 1996.
[7] M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica. Discretized streams: Fault-tolerant stream computation at scale. In SOSP, 2013.

Published in
ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 2, September 2018, 95 pages
ISSN: 0163-5999
DOI: 10.1145/3305218
Editor: Nidhi Hegde, Borealis AI
Copyright © 2019 Authors
Publisher: Association for Computing Machinery, New York, NY, United States