DOI: 10.1145/2684464.2684484
short-paper

Improving Energy Efficiency of IO-Intensive MapReduce Jobs

Published: 04 January 2015

Abstract

MapReduce is a popular data-parallel programming model for analyzing huge volumes of data. While multicore, many-CPU HPC infrastructure can be used to increase the parallelism of MapReduce tasks, IO-bandwidth limitations may make it ineffective. IO-intensive activities are essential in any MapReduce cluster. On HPC nodes, IO-intensive jobs queue at the IO resources while the CPUs remain underutilized, resulting in poor performance, high power consumption and, thus, energy inefficiency. In this paper, we perform a thorough empirical study to investigate which power management setting can improve the energy efficiency of IO-intensive MapReduce jobs. Our analysis indicates that a constant CPU frequency can reduce the energy consumption of an IO-intensive job while improving its performance. Consequently, we build a set of regression models to predict the energy consumption of IO-intensive jobs at a given CPU frequency for a given input data volume. We obtained the same set of models, with different coefficients, for two different types of IO-intensive jobs, which substantiates the suitability of the identified models. These models predict their respective outcomes with 80% accuracy for 80% of the new test cases.
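To make the modelling step concrete, the following is a minimal sketch of fitting a regression model of energy against input data volume at one fixed CPU frequency and then scoring it on held-out cases. The abstract does not give the model form, the coefficients, or the measurements, so everything here is an assumption for illustration: a simple linear fit, made-up data points, the hypothetical helper names `fit_line` and `pred_within`, and a reading of "80% accuracy" as a relative error of at most 20%.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def pred_within(actual, predicted, tolerance=0.20):
    """Fraction of cases whose relative error is at most `tolerance`."""
    hits = sum(abs(p - a) / abs(a) <= tolerance
               for a, p in zip(actual, predicted))
    return hits / len(actual)


# Made-up training data: input volume (GB) and energy (kJ) at one
# fixed CPU frequency. These are NOT measurements from the paper.
train_gb = [10, 20, 30, 40, 50]
train_kj = [52, 101, 149, 203, 250]

a, b = fit_line(train_gb, train_kj)

# Made-up held-out test cases, again purely illustrative.
test_gb = [15, 25, 35, 45, 60]
test_kj = [80, 124, 170, 228, 390]

predicted_kj = [a + b * x for x in test_gb]
share = pred_within(test_kj, predicted_kj)
print(f"energy ~ {a:.1f} + {b:.2f} * GB; "
      f"{share:.0%} of test cases within 20% relative error")
```

Under this reading, one model would be fitted per CPU frequency and per job type, and the share of test cases falling within the tolerance would be compared against the 80% figure quoted above.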




      Published In

      ICDCN '15: Proceedings of the 16th International Conference on Distributed Computing and Networking
      January 2015
      360 pages
      ISBN:9781450329286
      DOI:10.1145/2684464

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. DVFS
      2. Energy characterization
      3. Energy efficiency
      4. MapReduce
      5. Power aware computing
      6. Predictive energy models

      Qualifiers

      • Short-paper
      • Research
      • Refereed limited

      Conference

      ICDCN '15


      Cited By

      • (2020) A heuristic method towards deadline-aware energy-efficient mapreduce scheduling problem in Hadoop YARN. Cluster Computing 24:2 (683-699). DOI: 10.1007/s10586-020-03146-7. Online publication date: 5-Jul-2020
      • (2019) An Energy Efficiency Optimization and Control Model for Hadoop Clusters. IEEE Access 7 (40534-40549). DOI: 10.1109/ACCESS.2019.2907018. Online publication date: 2019
      • (2018) Optimizing MapReduce for energy efficiency. Software: Practice and Experience 48:9 (1660-1687). DOI: 10.1002/spe.2599. Online publication date: 26-Jun-2018
      • (2016) A review of big data environment and its related technologies. 2016 International Conference on Information Communication and Embedded Systems (ICICES) (1-5). DOI: 10.1109/ICICES.2016.7518904. Online publication date: Feb-2016
