DOI: 10.1145/2600212.2600706
short-paper

Glasswing: accelerating MapReduce on multi-core and many-core clusters

Published: 23 June 2014

Abstract

The impact and significance of parallel computing techniques continue to grow as new processor designs incorporate ever more cores. However, many Big Data systems fail to exploit the full computational power of multi-core CPUs and GPUs. We present Glasswing, a scalable MapReduce framework that employs a configurable mixture of coarse- and fine-grained parallelism to achieve high performance on multi-core CPUs and GPUs. We experimentally evaluated five MapReduce applications and show that Glasswing outperforms Hadoop by a factor of 1.8 to 4 on a 64-node multi-core CPU cluster and by a factor of 20 to 30 on a 16-node GPU cluster.
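
As a rough illustration of what mixing coarse- and fine-grained parallelism can mean in a MapReduce word count, the sketch below divides the input into splits that run in parallel (coarse-grained) and cuts each split into small chunks that are mapped independently (fine-grained). It is not Glasswing's API, which this page does not describe. The names `map_chunk`, `process_split`, `word_count`, `num_splits` and `chunk_size` are hypothetical, and the Python thread pool merely stands in for cluster nodes and compute devices.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Fine-grained unit of work: count the words in a few lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def process_split(split, chunk_size):
    """Coarse-grained unit of work: one input split, cut into small chunks."""
    chunks = [split[i:i + chunk_size] for i in range(0, len(split), chunk_size)]
    partial = Counter()
    # Assumption: this sequential loop stands in for a data-parallel device kernel.
    for counts in map(map_chunk, chunks):
        partial.update(counts)
    return partial


def word_count(lines, num_splits=2, chunk_size=1):
    """Map over the splits in parallel, then reduce by summing partial counts."""
    split_len = max(1, -(-len(lines) // num_splits))  # ceiling division
    splits = [lines[i:i + split_len] for i in range(0, len(lines), split_len)]
    with ThreadPoolExecutor(max_workers=num_splits) as pool:
        partials = list(pool.map(lambda s: process_split(s, chunk_size), splits))
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


result = word_count(["a b a", "b c", "a"], num_splits=2, chunk_size=1)
```

In this sketch `num_splits` and `chunk_size` are the two knobs: the first sets how many coarse tasks run side by side, the second how finely each task is subdivided.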

Published In

HPDC '14: Proceedings of the 23rd international symposium on High-performance parallel and distributed computing
June 2014
334 pages
ISBN:9781450327497
DOI:10.1145/2600212

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. heterogeneous
  2. mapreduce
  3. opencl
  4. scalability

Qualifiers

  • Short-paper

Conference

HPDC'14

Acceptance Rates

HPDC '14 Paper Acceptance Rate 21 of 130 submissions, 16%;
Overall Acceptance Rate 166 of 966 submissions, 17%

Cited By

  • (2024) Topo: Towards a Fine-grained Topological Data Processing Framework on Tianhe-3 Supercomputer. Journal of Parallel and Distributed Computing (104926). DOI: 10.1016/j.jpdc.2024.104926. Online publication date: May-2024
  • (2021) Big Data Resource Management & Networks: Taxonomy, Survey, and Future Directions. IEEE Communications Surveys & Tutorials 23:4 (2098-2130). DOI: 10.1109/COMST.2021.3094993. Online publication date: Dec-2022
  • (2020) A Smart Water Metering Deployment Based on the Fog Computing Paradigm. Applied Sciences 10:6 (1965). DOI: 10.3390/app10061965. Online publication date: 13-Mar-2020
  • (2020) Resource-Aware MapReduce Runtime for Multi/Many-core Architectures. 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) (897-902). DOI: 10.23919/DATE48585.2020.9116281. Online publication date: Mar-2020
  • (2020) Efficient Compilation and Execution of JVM-Based Data Processing Frameworks on Heterogeneous Co-Processors. 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) (175-179). DOI: 10.23919/DATE48585.2020.9116246. Online publication date: Mar-2020
  • (2019) A Proposed Architecture for Parallel HPC-based Resource Management System for Big Data Applications. Advances in Science, Technology and Engineering Systems Journal 4:1. DOI: 10.25046/aj040105. Online publication date: 2019
  • (2018) Challenges and Proposals for Enabling Dynamic Heterogeneous Execution of Big Data Frameworks. 2018 IEEE International Conference on Cloud Computing Technology and Science (CloudCom) (335-341). DOI: 10.1109/CloudCom2018.2018.00070. Online publication date: Dec-2018
  • (2018) Optimizing MapReduce for energy efficiency. Software: Practice and Experience 48:9 (1660-1687). DOI: 10.1002/spe.2599. Online publication date: 26-Jun-2018
  • (2016) Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale. Proceedings of the Seventh ACM Symposium on Cloud Computing (456-469). DOI: 10.1145/2987550.2987569. Online publication date: 5-Oct-2016
  • (2016) SWAT. Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing (81-92). DOI: 10.1145/2907294.2907307. Online publication date: 31-May-2016
