Abstract
MapReduce has become a popular distributed programming framework for data-intensive applications. However, a large-scale MapReduce cluster inevitably suffers machine/node failures and consumes considerable energy. To address these problems, MapReduce employs several policies for replicating input data, storing/replicating intermediate data, and re-executing failed tasks. In this study, we concentrate on two typical policies for storing/replicating intermediate data and derive the job completion reliability (JCR) and job energy consumption (JEC) of a MapReduce cluster when each policy is employed individually. The two policies are further analyzed and compared under various scenarios in which jobs with different input data sizes, numbers of reduce tasks, and other parameters run in a MapReduce cluster with two extreme parallel execution capabilities. The analytical results help MapReduce managers understand how the two policies influence the JCR and JEC of a MapReduce cluster.
Acknowledgments
The work was partially supported by the GREENs project of TungHai University and by the Ministry of Science and Technology, Taiwan, under Grants NSC 101-2221-E-009-003-MY3, NSC 101-2628-E-009-024-MY3, and NSC 102-2911-I-100-524. The authors are grateful to the National Center for High-performance Computing for providing facilities.
Cite this article
Lin, JC., Leu, FY. & Chen, YP. Analyzing job completion reliability and job energy consumption for a heterogeneous MapReduce cluster under different intermediate-data replication policies. J Supercomput 71, 1657–1677 (2015). https://doi.org/10.1007/s11227-014-1286-7