research-article

BoPF: Mitigating the Burstiness-Fairness Tradeoff in Multi-Resource Clusters

Published: 17 January 2019

Abstract

Even though batch, interactive, and streaming applications all care about performance, their notions of performance differ. For instance, while the average completion time can sufficiently capture the performance of a throughput-sensitive batch-job queue (TQ) [5], interactive sessions and streaming applications form latency-sensitive queues (LQs): each LQ is a sequence of small jobs following an ON-OFF pattern. For these jobs [7], individual completion times or latencies are far more important than the average completion time or the throughput of the LQ.

Indeed, existing "fair" schedulers are inherently unfair to LQ jobs: when LQ jobs are present (ON state), they must share the resources equally with TQ jobs, but when they are absent (OFF state), TQ jobs get all the resources. In the long run, TQs receive more resources than their fair shares because today's schedulers, such as Dominant Resource Fairness [4], make instantaneous decisions.
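The long-run effect described above can be illustrated with a small single-resource simulation. This is only a sketch under assumed parameters, not the paper's model or evaluation: one TQ and one LQ share a unit-capacity resource, the LQ is ON for 2 of every 10 time steps, and whenever it is ON it can use at least half the capacity. The function name and all parameter values are hypothetical.

```python
def instantaneous_fair_totals(period=10, on_steps=2, horizon=100, capacity=1.0):
    """Total resources received by (TQ, LQ) under equal instantaneous sharing.

    Assumption: the LQ is ON during the first `on_steps` steps of every
    `period`, and it splits the capacity equally with the TQ while ON.
    """
    tq_total = 0.0
    lq_total = 0.0
    for t in range(horizon):
        lq_on = (t % period) < on_steps
        if lq_on:
            # Both queues are active: each gets half of the capacity.
            tq_total += capacity / 2
            lq_total += capacity / 2
        else:
            # LQ is OFF: the TQ takes the whole capacity.
            tq_total += capacity
    return tq_total, lq_total


tq, lq = instantaneous_fair_totals()
total = tq + lq
print(f"TQ fraction: {tq / total:.2f}, LQ fraction: {lq / total:.2f}")
```

With these assumed numbers, the TQ ends up with far more than half of the resources over the horizon even though every individual decision splits the capacity equally.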

Clearly, it is impossible to achieve the best response time for LQ jobs under instantaneous fairness. In other words, there is a hard tradeoff between providing instantaneous fairness for TQs and minimizing the response time of LQs. However, instantaneous fairness is not necessary for TQs because average completion time over a relatively long time horizon is their most important metric. This raises the following question: how well can we simultaneously accommodate multiple classes of workloads with performance guarantees, in particular, isolation protection for TQs in terms of long-term fairness and low response times for LQs?

This work serves as our first step in answering the question by designing BoPF: the first multi-resource scheduler that achieves both isolation protection for TQs and response time guarantees for LQs in a strategyproof way. The key idea is "bounded" priority for LQs: as long as a burst is not large enough to hurt the long-term fair share of TQs and other LQs, the LQ's jobs are given higher priority so that they can be completed as quickly as possible.
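The idea of bounded priority can be sketched in a few lines. The code below is a simplified single-resource illustration, not BoPF's actual algorithm: the admission rule (a burst gets priority only if its total work fits within an equal long-term share over one ON-OFF period) and the function names `admit_with_priority` and `burst_completion_steps` are assumptions made for this example.

```python
import math


def admit_with_priority(burst_work, period, capacity, num_queues):
    """Hypothetical admission test for bounded priority.

    A burst is given priority only if its total work fits within the
    queue's equal long-term share over one ON-OFF period.
    """
    long_term_share = capacity * period / num_queues
    return burst_work <= long_term_share


def burst_completion_steps(burst_work, capacity, prioritized):
    """Time steps needed to finish a burst that competes with one TQ.

    With priority the burst uses the full capacity; without it, the
    burst gets half of the capacity (instantaneous equal sharing).
    """
    rate = capacity if prioritized else capacity / 2
    return math.ceil(burst_work / rate)


# Assumed setting: unit capacity, period of 10 steps, one TQ and one LQ.
burst = 1.5
ok = admit_with_priority(burst, period=10, capacity=1.0, num_queues=2)
print("prioritized:", ok)
print("steps with priority:", burst_completion_steps(burst, 1.0, prioritized=True))
print("steps without priority:", burst_completion_steps(burst, 1.0, prioritized=False))
```

In this assumed setting the small burst finishes sooner with priority, while the TQ still receives more than its equal share over the period; a burst larger than the long-term share would fail the admission test and fall back to ordinary sharing.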

References

  1. BoPF: Mitigating the Burstiness-Fairness Tradeoff in Multi-Resource Clusters -- Technical Report. https://bit.ly/2rBZDjc.
  2. Big-Data-Benchmark-for-Big-Bench. https://github.com/intel-hadoop/Big-Data-Benchmark-for-Big-Bench, 2016.
  3. N. Bansal and M. Harchol-Balter. Analysis of SRPT scheduling: Investigating unfairness, volume 29. ACM, 2001.
  4. A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Dominant resource fairness: Fair allocation of multiple resource types. In NSDI, 2011.
  5. R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella. Multi-resource packing for cluster schedulers. In SIGCOMM, 2014.
  6. L. Kleinrock and R. Gail. Queueing systems: Problems and Solutions. Wiley, 1996.
  7. M. Zaharia, T. Das, H. Li, S. Shenker, and I. Stoica. Discretized streams: Fault-tolerant stream computation at scale. In SOSP, 2013.


Published in

ACM SIGMETRICS Performance Evaluation Review, Volume 46, Issue 2
September 2018
95 pages
ISSN: 0163-5999
DOI: 10.1145/3305218

Copyright © 2019 Authors

Publisher

Association for Computing Machinery
New York, NY, United States

