DOI: 10.1145/1998582.1998637
research-article

ARIA: automatic resource inference and allocation for mapreduce environments

Published: 14 June 2011

ABSTRACT

MapReduce and Hadoop represent an economically compelling alternative for efficient large-scale data processing and advanced analytics in the enterprise. A key challenge in shared MapReduce clusters is the ability to automatically tailor and control resource allocations to different applications so that they achieve their performance goals. Currently, there is no job scheduler for MapReduce environments that, given a job completion deadline, can allocate the appropriate amount of resources to the job so that it meets the required Service Level Objective (SLO). In this work, we propose a framework, called ARIA, to address this problem. It comprises three inter-related components. First, for a production job that is routinely executed on a new dataset, we build a job profile that compactly summarizes critical performance characteristics of the underlying application during the map and reduce stages. Second, we design a MapReduce performance model that, for a given job (with a known profile) and its SLO (soft deadline), estimates the amount of resources required for job completion within the deadline. Finally, we implement a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines.

We validate our approach using a set of realistic applications. The new scheduler effectively meets the jobs' SLOs until the job demands exceed the cluster resources. The results of the extensive simulation study are validated through detailed experiments on a 66-node Hadoop cluster.
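
The abstract does not spell out the performance model, so the following is only a minimal sketch of how a job profile and a deadline could be turned into a resource estimate. It assumes the classic makespan bounds for greedy assignment of n tasks to k slots (at least n·avg/k, at most (n-1)·avg/k + max) and treats the map and reduce stages as strictly sequential. The names `StageProfile`, `stage_time_bounds`, `job_time_upper_bound`, and `min_slots_for_deadline` are hypothetical and are not taken from the paper or from Hadoop.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class StageProfile:
    """Hypothetical per-stage profile: task count plus average and
    maximum task duration (seconds) observed in earlier runs."""
    num_tasks: int
    avg_duration: float
    max_duration: float


def stage_time_bounds(stage: StageProfile, slots: int) -> Tuple[float, float]:
    """Lower and upper bounds on stage completion time when tasks are
    greedily assigned to `slots` identical slots (assumed model)."""
    if slots < 1:
        raise ValueError("slots must be at least 1")
    lower = stage.num_tasks * stage.avg_duration / slots
    upper = (stage.num_tasks - 1) * stage.avg_duration / slots + stage.max_duration
    return lower, upper


def job_time_upper_bound(map_stage: StageProfile, reduce_stage: StageProfile,
                         map_slots: int, reduce_slots: int) -> float:
    """Pessimistic job completion time: the two stages run back to back
    (a simplifying assumption that ignores any overlap between them)."""
    return (stage_time_bounds(map_stage, map_slots)[1]
            + stage_time_bounds(reduce_stage, reduce_slots)[1])


def min_slots_for_deadline(map_stage: StageProfile, reduce_stage: StageProfile,
                           deadline: float, map_capacity: int,
                           reduce_capacity: int) -> Optional[Tuple[int, int]]:
    """Smallest (map_slots, reduce_slots) pair, by total slot count, whose
    pessimistic completion time fits the deadline; None if nothing fits."""
    best = None
    for m in range(1, min(map_capacity, map_stage.num_tasks) + 1):
        for r in range(1, min(reduce_capacity, reduce_stage.num_tasks) + 1):
            if job_time_upper_bound(map_stage, reduce_stage, m, r) <= deadline:
                if best is None or m + r < sum(best):
                    best = (m, r)
                break  # more reduce slots only raise the total for this m
    return best
```

A scheduler built on such an estimate could order jobs by deadline and grant each job the returned slot counts; when the function returns None, the deadline cannot be met under this pessimistic bound, which corresponds to the case where job demands exceed the cluster resources.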

Published in

ICAC '11: Proceedings of the 8th ACM international conference on Autonomic computing
June 2011
278 pages
ISBN: 9781450306072
DOI: 10.1145/1998582

Copyright © 2011 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
