ABSTRACT
We present Falcon, a novel scheduler design for large-scale data analytics workloads. To improve the quality of scheduling decisions, Falcon uses a single central scheduler. To scale that central scheduler to large clusters, Falcon offloads the scheduling operation to a programmable switch. The core of the Falcon design is a novel pipeline-based scheduling logic that can schedule tasks at line rate. Our prototype evaluation on a cluster with a Barefoot Tofino switch shows that the proposed approach can reduce scheduling overhead by a factor of 26 and increase scheduling throughput by a factor of 25 compared to state-of-the-art centralized and decentralized schedulers.
Index Terms
- Falcon: Low Latency, Network-Accelerated Scheduling