Abstract
MapReduce has been shown to be a promising way to simplify parallel programming while delivering high performance on a single multicore machine. Unlike its cluster counterpart, MapReduce on a single multicore machine is not bottlenecked by disk and network I/O, which makes it more sensitive to workload characteristics. A single, fixed execution flow can therefore be inefficient for many classes of workloads. For example, for workloads that inherently emit only one value per key, the fixed execution flow of the MapReduce program structure imposes significant overhead, caused mainly by an unnecessary reduce phase. In this paper, we refine the workload characterization from Phoenix++ according to the attributes of key-value pairs, and demonstrate that the refined workload characterization model covers all classes of MapReduce workloads. Based on this model, we propose a new MapReduce system with a workload-customizable execution flow. The system, named Peacock, is implemented on top of Phoenix++. Experiments with four different classes of benchmarks on a 16-core Intel-based server show that Peacock outperforms Phoenix++ on workloads that inherently emit only one value per key (with a speedup of up to \(3.6\times\)) while matching its performance on the other classes of workloads.
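The idea of skipping the reduce phase for workloads with one emitted value per key can be illustrated with a minimal, sequential sketch. This is not Peacock's or Phoenix++'s implementation or API: the function `run_mapreduce`, its `reduce_fn` parameter, and the two example map functions are hypothetical names chosen for illustration, and the sketch ignores parallelism entirely.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn=None):
    """Toy single-process MapReduce with an optional reduce phase.

    Hypothetical illustration only. Passing reduce_fn=None declares that
    every key is emitted at most once, so grouping and reducing are skipped.
    """
    if reduce_fn is None:
        # Customized flow: store each emitted value directly under its key.
        result = {}
        for record in records:
            for key, value in map_fn(record):
                if key in result:
                    raise ValueError(f"key {key!r} emitted more than once")
                result[key] = value
        return result

    # Fixed flow: collect all values per key, then reduce each group.
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def word_count_map(line):
    # Many values per key: the same word can be emitted repeatedly.
    for word in line.split():
        yield word, 1


def row_sum_map(row):
    # One value per key: each row index is emitted exactly once.
    index, values = row
    yield index, sum(values)


word_counts = run_mapreduce(["a b a", "b a"], word_count_map,
                            lambda key, values: sum(values))
row_sums = run_mapreduce([(0, [1, 2]), (1, [3, 4])], row_sum_map)
```

In this sketch the word-count workload needs the grouping and reduce steps, whereas the row-sum workload takes the shorter path and never builds per-key value lists, which is the kind of overhead the abstract attributes to an unnecessary reduce phase.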








Acknowledgments
This research is supported by the National Science Foundation of China under Grant No. 61232008, the National 863 Hi-Tech Research and Development Program under Grant No. 2013AA01A213, the Guangzhou Science and Technology Program under Grant No. 2012Y2-00040, the Chinese Universities Scientific Fund under Grant No. 2013TS094, and the Research Fund for the Doctoral Program of MOE under Grant No. 20110142130005.
Additional information
Note that Phoenix++ is the best available implementation of MapReduce on shared-memory multicore platforms.
Cite this article
Wu, S., Peng, Y., Jin, H. et al. Peacock: a customizable MapReduce for multicore platform. J Supercomput 70, 1496–1513 (2014). https://doi.org/10.1007/s11227-014-1238-2
DOI: https://doi.org/10.1007/s11227-014-1238-2