Pigeon: an Effective Distributed, Hierarchical Datacenter Job Scheduler

Authors:
Zhijun Wang, The University of Texas at Arlington
Huiyang Li, The University of Texas at Arlington
Zhongwei Li, The University of Texas at Arlington
Xiaocui Sun, Guangdong Pharmaceutical University
Jia Rao, The University of Texas at Arlington
Hao Che, The University of Texas at Arlington
Hong Jiang, The University of Texas at Arlington

SoCC '19: Proceedings of the ACM Symposium on Cloud Computing, November 2019, Pages 246–258
https://doi.org/10.1145/3357223.3362728

Published: 20 November 2019

ABSTRACT

In today's datacenters, job heterogeneity makes it difficult for schedulers to simultaneously meet latency requirements and maintain high resource utilization. State-of-the-art datacenter schedulers, whether centralized, distributed, or hybrid, fail to ensure low latency for short jobs in large-scale and highly loaded systems. The key issues are limited scalability in centralized schedulers, and ineffective and inefficient probing and resource sharing in both distributed and hybrid schedulers.

In this paper, we propose Pigeon, a distributed, hierarchical job scheduler based on a two-layer design. Pigeon divides workers into groups, each managed by a separate master. Upon a job arrival, a distributed scheduler directly distributes the job's tasks evenly among masters with minimum job processing overhead, hence preserving the highest possible scalability. Meanwhile, each master manages and distributes all the received tasks centrally, oblivious of the job context, allowing for full sharing of the worker pool at the group level to maximize multiplexing gain. To minimize the chance of head-of-line blocking for short jobs and avoid starvation for long jobs, each master employs two weighted fair queues that hold tasks from short and long jobs separately, and a small portion of the workers is reserved for short jobs. Evaluation via theoretical analysis, trace-driven simulations, and a prototype implementation shows that Pigeon significantly outperforms Sparrow, a representative distributed scheduler, and Eagle, a hybrid scheduler.
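
The two-layer design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class and function names (`Master`, `distribute`, `worker_freed`), the default weight of 3, the rule that an idle general worker is tried before a reserved one, and the use of weighted round-robin as a stand-in for weighted fair queueing are all assumptions made for illustration.

```python
from collections import deque


class Master:
    """Hypothetical group master: two queues plus a reserved worker pool."""

    def __init__(self, general_workers, reserved_workers, short_weight=3):
        self.idle_general = general_workers    # idle workers that may run any task
        self.idle_reserved = reserved_workers  # idle workers kept for short-job tasks
        self.short_q = deque()                 # queued tasks from short jobs
        self.long_q = deque()                  # queued tasks from long jobs
        self.short_weight = short_weight       # assumed: short tasks served per long task
        self._served_short = 0                 # short tasks served since the last long task
        self.dispatched = []                   # (task, worker_kind) log, for inspection only

    def submit(self, task, is_short):
        """Start the task on an idle worker if one is eligible, else queue it."""
        if self.idle_general > 0:
            self.idle_general -= 1
            self.dispatched.append((task, "general"))
        elif is_short and self.idle_reserved > 0:
            self.idle_reserved -= 1
            self.dispatched.append((task, "reserved"))
        else:
            (self.short_q if is_short else self.long_q).append(task)

    def worker_freed(self, kind):
        """A worker of the given kind finished: give it the next task or mark it idle."""
        task = self._next_task(kind)
        if task is not None:
            self.dispatched.append((task, kind))
        elif kind == "general":
            self.idle_general += 1
        else:
            self.idle_reserved += 1
        return task

    def _next_task(self, kind):
        if kind == "reserved":
            # Reserved workers never take long-job tasks.
            return self.short_q.popleft() if self.short_q else None
        # General worker: weighted round-robin between the two queues, so long
        # tasks are not starved and short tasks are not stuck behind long ones.
        take_long = bool(self.long_q) and (
            not self.short_q or self._served_short >= self.short_weight
        )
        if take_long:
            self._served_short = 0
            return self.long_q.popleft()
        if self.short_q:
            self._served_short += 1
            return self.short_q.popleft()
        return None


def distribute(tasks, masters, is_short, start=0):
    """Distributor: spread one job's tasks evenly (round-robin) over the masters."""
    for i, task in enumerate(tasks):
        masters[(start + i) % len(masters)].submit(task, is_short)
```

In this sketch the distributor keeps no per-worker state and sends no probes; it only needs the list of masters. Each master decides centrally which of its workers runs a task, which is what lets all tasks in a group share the same worker pool.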

Index Terms

1. Computer systems organization
  1. Architectures
    1. Distributed architectures
      1. Cloud computing

Published in

SoCC '19: Proceedings of the ACM Symposium on Cloud Computing
November 2019
503 pages
ISBN:9781450369732
DOI:10.1145/3357223

Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Job scheduling
datacenter
resource management
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
SoCC '19 Paper Acceptance Rate: 39 of 157 submissions, 25%
Overall Acceptance Rate: 169 of 722 submissions, 23%
Article Metrics
- Total Citations: 23
- Total Downloads: 826
- Downloads (Last 12 months): 170
- Downloads (Last 6 weeks): 14