research-article

Hopper: Decentralized Speculation-aware Cluster Scheduling at Scale

Published: 17 August 2015

Abstract

As clusters continue to grow in size and complexity, providing scalable and predictable performance is an increasingly important challenge. A crucial roadblock to achieving predictable performance is stragglers, i.e., tasks that take significantly longer than expected to run. At this point, speculative execution has been widely adopted to mitigate the impact of stragglers. However, speculation mechanisms are designed and operated independently of job scheduling when, in fact, scheduling a speculative copy of a task has a direct impact on the resources available for other jobs. In this work, we present Hopper, a job scheduler that is speculation-aware, i.e., that integrates the tradeoffs associated with speculation into job scheduling decisions. We implement both centralized and decentralized prototypes of the Hopper scheduler and show that 50% (66%) improvements over state-of-the-art centralized (decentralized) schedulers and speculation strategies can be achieved through the coordination of scheduling and speculation.

Supplemental Material

p379-ren.webm (webm, 151.7 MB)

        Published in

        ACM SIGCOMM Computer Communication Review, Volume 45, Issue 4 (SIGCOMM'15)
        October 2015
        659 pages
        ISSN: 0146-4833
        DOI: 10.1145/2829988

        SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication
        August 2015
        684 pages
        ISBN: 9781450335423
        DOI: 10.1145/2785956

        Copyright © 2015 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States
