DOI: 10.1145/2486159.2486187
research-article

Efficient online scheduling for deadline-sensitive jobs: extended abstract

Published: 23 July 2013

Abstract

We consider mechanisms for online deadline-aware scheduling in large computing clusters. Batch jobs that run on such clusters often require guarantees on their completion time (i.e., deadlines). However, most existing scheduling systems implement fair-share resource allocation between users, an approach that ignores heterogeneity in job requirements and may cause deadlines to be missed.
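To make the contrast between fair sharing and deadline-aware allocation concrete, here is a minimal sketch. It is an illustration only, not the scheduler from the paper: it assumes a single resource of unit capacity and jobs that all arrive at time 0, and the helper names `fair_share_finish_times` and `edf_finish_times` are hypothetical.

```python
def fair_share_finish_times(demands, capacity=1.0):
    """Finish times when all unfinished jobs split the capacity equally."""
    remaining = dict(enumerate(demands))
    finish = {}
    now = 0.0
    while remaining:
        rate = capacity / len(remaining)      # equal share per active job
        smallest = min(remaining.values())    # work left on the next job to finish
        now += smallest / rate
        for job_id in list(remaining):
            remaining[job_id] -= smallest     # each job received `smallest` units
            if remaining[job_id] <= 1e-12:
                finish[job_id] = now
                del remaining[job_id]
    return [finish[i] for i in range(len(demands))]


def edf_finish_times(demands, deadlines, capacity=1.0):
    """Finish times when jobs run one at a time, earliest deadline first."""
    finish = [0.0] * len(demands)
    now = 0.0
    for job_id in sorted(range(len(demands)), key=lambda i: deadlines[i]):
        now += demands[job_id] / capacity
        finish[job_id] = now
    return finish


# Two jobs of equal size (assumed numbers): one urgent, one not.
demands, deadlines = [4.0, 4.0], [5.0, 20.0]
fair = fair_share_finish_times(demands)
edf = edf_finish_times(demands, deadlines)
```

With these two jobs, equal sharing finishes both at time 8, so the job due at time 5 misses its deadline. The deadline-ordered run finishes them at times 4 and 8 and meets both deadlines.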
In our framework, jobs arrive dynamically and are characterized by their value and total resource demand (or estimation thereof), along with their reported deadlines. The scheduler's objective is to maximize the aggregate value of jobs completed by their deadlines. We circumvent known lower bounds for this problem by assuming that the input has slack, meaning that any job could be delayed and still finish by its deadline. Under the slackness assumption, we design a preemptive scheduler with a constant-factor worst-case performance guarantee. Along the way, we pay close attention to practical aspects, such as runtime efficiency, data locality and demand uncertainty. We evaluate the algorithm via simulations over real job traces taken from a large production cluster, and show that its actual performance is significantly better than other heuristics used in practice.
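The job model and objective described above can be sketched as follows. This is a simplified reading under stated assumptions: the slack parameter `s`, the single pooled `capacity`, and the names `Job`, `has_slack`, and `aggregate_value` are illustrative and do not come from the abstract, which only says that any job could be delayed and still finish by its deadline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    arrival: float   # time the job is submitted
    deadline: float  # reported deadline
    demand: float    # total resource demand, in resource-time units
    value: float     # value gained if the job completes by its deadline


def has_slack(job, s, capacity=1.0):
    """True if the job's window is at least s times its shortest possible run.

    Assumption: a job may use the full capacity, so its shortest run is
    demand / capacity. With s > 1 the job can be delayed and still finish.
    """
    shortest_run = job.demand / capacity
    return job.deadline - job.arrival >= s * shortest_run


def aggregate_value(jobs, finish_times):
    """Scheduler objective: total value of jobs completed by their deadlines.

    A finish time of None means the job never completed.
    """
    return sum(
        job.value
        for job, done in zip(jobs, finish_times)
        if done is not None and done <= job.deadline
    )


jobs = [Job(arrival=0, deadline=10, demand=4, value=3.0),
        Job(arrival=0, deadline=5, demand=4, value=2.0)]
slack_flags = [has_slack(job, s=2) for job in jobs]
total = aggregate_value(jobs, [8.0, 8.0])
```

In this toy input only the first job satisfies the assumed slack condition with `s = 2`, and a schedule that finishes both jobs at time 8 earns only the first job's value.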
We then extend our framework to handle provider commitments: the requirement that jobs admitted to service must be executed until completion. We prove that no algorithm can obtain worst-case guarantees when the commitment decision must be made at job arrival time. Nevertheless, we design efficient heuristics that commit on job admission, in the spirit of our basic algorithm. We show empirically that these heuristics perform just as well as (or better than) the original algorithm. Finally, we discuss how our scheduling framework can be used to design truthful scheduling mechanisms, motivated by applications to commercial public cloud offerings.
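As a rough illustration of what committing on admission involves, the sketch below admits a new job only if every already-admitted job and the new one can still meet their deadlines. It is a generic earliest-deadline-first feasibility test on a single pooled resource, not the heuristics from the paper; the function name `can_commit` and the tuple layout are hypothetical.

```python
def can_commit(admitted, new_job, now, capacity=1.0):
    """Return True if the new job can be admitted without breaking a commitment.

    admitted: list of (deadline, remaining_demand) for jobs already committed to.
    new_job:  a (deadline, remaining_demand) tuple for the arriving job.
    Assumption: one pooled resource, so the set is feasible exactly when, in
    deadline order, the cumulative work fits before each deadline.
    """
    work = 0.0
    for deadline, remaining in sorted(admitted + [new_job]):
        work += remaining
        if work > (deadline - now) * capacity:
            return False
    return True


admitted = [(10.0, 4.0)]                            # one committed job, due at t=10
accept = can_commit(admitted, (6.0, 3.0), now=0.0)  # fits ahead of the committed job
reject = can_commit(admitted, (5.0, 7.0), now=0.0)  # needs 7 units before t=5
```

Here the first arrival is accepted because 3 units fit before time 6 and 7 units fit before time 10, while the second is rejected because it cannot finish by its own deadline.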

Published In

SPAA '13: Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures
July 2013
348 pages
ISBN:9781450315722
DOI:10.1145/2486159

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. online scheduling
  2. resource allocation
  3. scheduling algorithms
  4. truthful mechanisms

Qualifiers

  • Research-article

Conference

SPAA '13

Acceptance Rates

SPAA '13 Paper Acceptance Rate 31 of 130 submissions, 24%;
Overall Acceptance Rate 447 of 1,461 submissions, 31%

Cited By

  • (2025) Maximizing Throughput for Parallel Jobs with Speed-Up Curves. Approximation and Online Algorithms, pp. 135-150. DOI: 10.1007/978-3-031-81396-2_10. Online publication date: 12-Feb-2025
  • (2024) EdgeOPT: A Competitive Algorithm for Online Parallel Task Scheduling With Latency Guarantee in Mobile Edge Computing. IEEE Transactions on Communications, 72:11, pp. 7077-7092. DOI: 10.1109/TCOMM.2024.3412741. Online publication date: Nov-2024
  • (2024) Online Data Driven Scheduling for Deadline-Sensitive Tasks of Mobile Edge Computing Enabled Consumer Electronics. IEEE Transactions on Consumer Electronics, 70:1, pp. 4142-4154. DOI: 10.1109/TCE.2024.3362350. Online publication date: Feb-2024
  • (2024) Edge-LLM: A Collaborative Framework for Large Language Model Serving in Edge Computing. 2024 IEEE International Conference on Web Services (ICWS), pp. 799-809. DOI: 10.1109/ICWS62655.2024.00099. Online publication date: 7-Jul-2024
  • (2024) A competitive algorithm for throughput maximization on identical machines. Mathematical Programming, 206:1-2, pp. 497-514. DOI: 10.1007/s10107-023-02045-0. Online publication date: 10-Jan-2024
  • (2020) Matching IoT Devices to the Fog Service Providers: A Mechanism Design Perspective. Sensors, 20:23, article 6761. DOI: 10.3390/s20236761. Online publication date: 26-Nov-2020
  • (2019) A General Framework for Handling Commitment in Online Throughput Maximization. Integer Programming and Combinatorial Optimization, pp. 141-154. DOI: 10.1007/978-3-030-17953-3_11. Online publication date: 13-Apr-2019
  • (2018) Scheduling Parallelizable Jobs Online to Maximize Throughput. LATIN 2018: Theoretical Informatics, pp. 755-776. DOI: 10.1007/978-3-319-77404-6_55. Online publication date: 13-Mar-2018
  • (2017) Simple Pricing Schemes for the Cloud. Web and Internet Economics, pp. 311-324. DOI: 10.1007/978-3-319-71924-5_22. Online publication date: 25-Nov-2017
