DOI: 10.1145/2312005.2312051
research-article

Near-optimal scheduling mechanisms for deadline-sensitive jobs in large computing clusters

Published: 25 June 2012

ABSTRACT

We consider a market-based resource allocation model for batch jobs in cloud computing clusters. In our model, we incorporate the importance of the due date of a job rather than the number of servers allocated to it at any given time. Each batch job is characterized by the work volume of total computing units (e.g., CPU hours) along with a bound on maximum degree of parallelism. Users specify, along with these job characteristics, their desired due date and a value for finishing the job by its deadline. Given this specification, the primary goal is to determine the scheduling of cloud computing instances under capacity constraints in order to maximize the social welfare (i.e., sum of values gained by allocated users). Our main result is a new (C/(C-k) ⋅ s/(s-1))-approximation algorithm for this objective, where C denotes cloud capacity, k is the maximal bound on parallelized execution (in practical settings, k ≪ C) and s is the slackness on the job completion time, i.e., the minimal ratio between a specified deadline and the earliest finish time of a job. Our algorithm is based on utilizing dual fitting arguments over a strengthened linear program to the problem.
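To get a feel for the stated bound, the following minimal sketch evaluates the factor C/(C-k) ⋅ s/(s-1) for a few parameter choices. The function name and the sample values are illustrative assumptions and do not come from the paper.

```python
def approximation_factor(capacity: float, max_parallelism: float, slackness: float) -> float:
    """Evaluate C/(C-k) * s/(s-1) from the abstract.

    capacity        -> C, the cloud capacity
    max_parallelism -> k, the maximal bound on parallelized execution
    slackness       -> s, the minimal deadline / earliest-finish-time ratio
    """
    if not 0 <= max_parallelism < capacity:
        raise ValueError("the bound needs 0 <= k < C")
    if slackness <= 1:
        raise ValueError("the bound needs s > 1")
    return (capacity / (capacity - max_parallelism)) * (slackness / (slackness - 1))


if __name__ == "__main__":
    # Hypothetical settings with k much smaller than C.
    for c, k, s in [(1000, 10, 2), (1000, 10, 10), (10000, 10, 100)]:
        print(f"C={c}, k={k}, s={s}: factor = {approximation_factor(c, k, s):.4f}")
```

When k is small relative to C, the first term is close to 1 and the slackness term dominates, so the factor approaches 1 as s grows.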

Based on the new approximation algorithm, we construct truthful allocation and pricing mechanisms, in which reporting the job true value and properties (deadline, work volume and the parallelism bound) is a dominant strategy for all users. To that end, we provide a general framework for transforming allocation algorithms into truthful mechanisms in domains of single-value and multi-properties. We then show that the basic mechanism can be extended under proper Bayesian assumptions to the objective of maximizing revenues, which is important for public clouds. We empirically evaluate the benefits of our approach through simulations on data-center job traces, and show that the revenues obtained under our mechanism are comparable with an ideal fixed-price mechanism, which sets an on-demand price using oracle knowledge of users' valuations. Finally, we discuss how our model can be extended to accommodate uncertainties in job work volumes, which is a practical challenge in cloud settings.
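One standard way to price a monotone allocation rule for agents who can misreport only their value is to charge each winner its critical value, the smallest value at which it would still be allocated. The sketch below illustrates that idea on a deliberately simplified setting (a single shared capacity, no deadlines or parallelism bounds). It is not the paper's mechanism, which also handles misreported deadlines, work volumes, and parallelism bounds; `greedy_winners`, `critical_value`, and the sample jobs are hypothetical names and data chosen for illustration.

```python
from typing import List, Set, Tuple

Job = Tuple[float, float]  # (reported value, resource demand); demand must be > 0


def greedy_winners(jobs: List[Job], capacity: float) -> Set[int]:
    """Accept jobs in decreasing value/demand order while they fit.

    Ties are broken by job index because Python's sort is stable.
    """
    order = sorted(range(len(jobs)), key=lambda i: -jobs[i][0] / jobs[i][1])
    used = 0.0
    winners: Set[int] = set()
    for i in order:
        if used + jobs[i][1] <= capacity:
            winners.add(i)
            used += jobs[i][1]
    return winners


def critical_value(jobs: List[Job], capacity: float, i: int, tol: float = 1e-6) -> float:
    """Approximate the smallest value at which job i still wins (bisection).

    Relies on monotonicity: raising a winner's value never makes it lose.
    Losing jobs are charged nothing.
    """
    def wins_with(value: float) -> bool:
        trial = list(jobs)
        trial[i] = (value, jobs[i][1])
        return i in greedy_winners(trial, capacity)

    if not wins_with(jobs[i][0]) or wins_with(0.0):
        return 0.0
    lo, hi = 0.0, jobs[i][0]  # loses at lo, wins at hi
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if wins_with(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    jobs = [(10.0, 5.0), (8.0, 5.0), (6.0, 5.0)]
    for i in range(len(jobs)):
        print(i, round(critical_value(jobs, capacity=10.0, i=i), 3))
```

Because a winner's payment depends only on the other reports, no job in this toy setting can lower its price by shading its value.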



            Reviews

            Amrinder Arora

            The nearly simultaneous emergence of cloud computing and big data analytics has brought on new sets of challenges. Organizations have started to replace their own infrastructure with large computing clusters hosted by cloud providers, such as Amazon and Google, and they buy these services using complex pricing mechanisms. The cloud providers currently provide computing power, while the application hosts are interested in completion of their jobs, irrespective of the computing power needed to complete those jobs. The missing link between the demand (job completion) and the supply (computing power) is a key challenge in the new model. This paper tries to address exactly this kind of challenge.

The authors assume that the jobs are provided along with their completion values and resource requirements, and that the cloud providers can choose the sequence and resources with which to schedule the jobs. They propose an algorithm called GreedyRTL, which sorts the jobs in the order of their marginal values (value-to-resource ratio), similar to the greedy knapsack algorithm, and then schedules a job if it can be completed within its deadline. It can also reallocate previous resources within some constraints. The authors prove that GreedyRTL is a (C/(C-k)) ⋅ (s/(s-1))-approximation algorithm, where C is the capacity of the cloud, k is the bound on parallelization, and s is the slackness guarantee on the job completion time. "The objective of [... this] algorithm is to maximize the social welfare, which is the sum of [the] values of jobs that are completed before their deadline." Like many other interesting algorithms, GreedyRTL itself is easy to describe. The analysis is significantly more involved: it uses the formulation of the problem as a linear program and applies the technique of dual fitting to prove the approximation factor.

In coming years, it is quite conceivable that the model discussed in this work may evolve and prove to be a key contribution of this paper, even more so than the algorithm and the analysis. That is understandable for works in new and rapidly evolving fields, especially considering that our understanding of what exactly should be measured is likely to evolve as well.

Online Computing Reviews Service
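The greedy idea described in the review (sort by value-to-resource ratio, then admit a job only if it can finish by its deadline) can be sketched as follows. This is a simplified illustration and not the authors' GreedyRTL: it assumes discrete time slots, fills slots backward from each deadline as an arbitrary placement choice, and omits the reallocation step the review mentions. The `Job` fields, `greedy_schedule`, and the sample jobs are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Job:
    name: str
    value: float        # value gained if the job finishes by its deadline
    demand: int         # total work volume, in resource-units x time slots
    max_parallel: int   # k: most units the job can use in a single slot
    deadline: int       # job may only use slots 0 .. deadline-1


def greedy_schedule(jobs: List[Job], capacity: int, horizon: int) -> Tuple[Dict[str, Dict[int, int]], float]:
    """Admit jobs by decreasing value/demand if they fit before their deadline."""
    free = [capacity] * horizon                      # free units per time slot
    schedule: Dict[str, Dict[int, int]] = {}
    for job in sorted(jobs, key=lambda j: j.value / j.demand, reverse=True):
        remaining = job.demand
        plan: Dict[int, int] = {}
        # Tentatively place work from the deadline backward.
        for t in range(min(job.deadline, horizon) - 1, -1, -1):
            if remaining == 0:
                break
            units = min(job.max_parallel, free[t], remaining)
            if units > 0:
                plan[t] = units
                remaining -= units
        # Commit only if the whole job fits; otherwise leave capacity untouched.
        if remaining == 0:
            for t, units in plan.items():
                free[t] -= units
            schedule[job.name] = plan
    welfare = sum(j.value for j in jobs if j.name in schedule)
    return schedule, welfare


if __name__ == "__main__":
    jobs = [
        Job("a", value=9, demand=3, max_parallel=1, deadline=3),
        Job("b", value=4, demand=2, max_parallel=2, deadline=1),
        Job("c", value=3, demand=3, max_parallel=2, deadline=2),
    ]
    print(greedy_schedule(jobs, capacity=3, horizon=3))
```

In this toy instance jobs "a" and "b" are admitted and "c" is rejected because only two free units remain before its deadline. No approximation guarantee is claimed for this simplified version.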

            • Published in

              SPAA '12: Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
              June 2012
              348 pages
              ISBN:9781450312134
              DOI:10.1145/2312005

              Copyright © 2012 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States


              Acceptance Rates

              Overall Acceptance Rate: 447 of 1,461 submissions, 31%
