Evaluating SPLASH-2 Applications Using MapReduce

Zhu, Shengkai; Xiao, Zhiwei; Chen, Haibo; Chen, Rong; Zhang, Weihua; Zang, Binyu

doi:10.1007/978-3-642-03644-6_35

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5737)

Included in the following conference series:

International Workshop on Advanced Parallel Processing Technologies

Abstract

MapReduce has become a prevalent model for running data-parallel applications. By hiding non-functional concerns such as parallelization, fault tolerance, and load balancing from programmers, MapReduce significantly simplifies programming for large clusters. Because of these features, researchers have also explored the use of MapReduce in other application domains, such as machine learning, text retrieval, and statistical machine translation, among others.

In this paper, we study the feasibility of running typical supercomputing applications on the MapReduce framework. We port two applications, Water Spatial and Radix Sort, from the Stanford SPLASH-2 suite to MapReduce. By thoroughly evaluating them on Hadoop, an open-source MapReduce framework for clusters, we analyze their major performance bottlenecks under MapReduce. Based on this analysis, we also offer several suggestions for enhancing the MapReduce framework to suit these applications.

Author information

Authors and Affiliations

Parallel Processing Institute, Fudan University, China
Shengkai Zhu, Zhiwei Xiao, Haibo Chen, Rong Chen, Weihua Zhang & Binyu Zang

Editor information

Editors and Affiliations

National University of Defense Technology, Department of Computer Science, 410073, Changsha, P.R. China
Yong Dou
Ecole Polytechnique Fédérale de Lausanne (EPFL), Dépt. Physique, 1015, Lausanne, Switzerland
Ralf Gruber
HSR - Hochschule für Technik Rapperswil, Oberseestr. 10, 8640, Rapperswil, Switzerland
Josef M. Joller

About this paper

Cite this paper

Zhu, S., Xiao, Z., Chen, H., Chen, R., Zhang, W., Zang, B. (2009). Evaluating SPLASH-2 Applications Using MapReduce. In: Dou, Y., Gruber, R., Joller, J.M. (eds) Advanced Parallel Processing Technologies. APPT 2009. Lecture Notes in Computer Science, vol 5737. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03644-6_35

DOI: https://doi.org/10.1007/978-3-642-03644-6_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03643-9
Online ISBN: 978-3-642-03644-6
eBook Packages: Computer Science, Computer Science (R0)
