DOI: 10.1145/3426744.3431322
research-article

Falcon: Low Latency, Network-Accelerated Scheduling

Published: 01 December 2020

ABSTRACT

We present Falcon, a novel scheduler design for large-scale data analytics workloads. To improve the quality of scheduling decisions, Falcon uses a single central scheduler. To scale that scheduler to large clusters, Falcon offloads the scheduling operation to a programmable switch. The core of the Falcon design is a novel pipeline-based scheduling logic that can schedule tasks at line rate. Our prototype evaluation on a cluster with a Barefoot Tofino switch shows that the proposed approach can reduce scheduling overhead by 26 times and increase scheduling throughput by 25 times compared to state-of-the-art centralized and decentralized schedulers.

Supplemental Material

3426744.3431322.mp4 (MP4, 33.6 MB)


Published in

EuroP4'20: Proceedings of the 3rd P4 Workshop in Europe
December 2020
71 pages
ISBN: 9781450381819
DOI: 10.1145/3426744

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery
New York, NY, United States


Qualifiers

• research-article
• Research
• Refereed limited
