Abstract
Research on big data has been actively pursued in recent years because scientific applications in fields such as biology and astronomy generate large volumes of data. Distributed data processing techniques have therefore been studied to manage big data across a large number of servers. Meanwhile, some scientific applications, such as genome data analysis, require loop control when analyzing big data with a MapReduce framework. In this paper, we first describe existing MapReduce-based distributed systems that support iterative data processing. We then analyze the performance of these systems in terms of execution time for various scientific applications that require iterative data processing. Finally, based on this analysis, we discuss requirements for a new MapReduce-based distributed system that supports iterative data processing efficiently.
References
Dean, J., Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters. Operating System Design and Implementation, 10 (2004)
Apache Software Foundation, Apache Hadoop, http://hadoop.apache.org/
Apache Software Foundation, Hadoop MapReduce, http://hadoop.apache.org/mapreduce
Bu, Y., Howe, B., Balazinska, M., Ernst, M.D.: HaLoop: Efficient Iterative Data Processing on Large Clusters. In: VLDB (2010)
Ekanayake, J., Li, H., Zhang, B., Gunarathne, T., Bae, S.-H., Qiu, J., Fox, G.C.: Twister: A Runtime for Iterative MapReduce. In: The ACM International Symposium on High Performance Distributed Computing, HPDC (2010)
Page, L., Brin, S., Motwani, R., Winograd, T.: The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab (1999)
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yoon, M., Kim, Hi., Choi, D.H., Jo, H., Chang, Jw. (2014). Performance Analysis of MapReduce-Based Distributed Systems for Iterative Data Processing Applications. In: Park, J., Adeli, H., Park, N., Woungang, I. (eds) Mobile, Ubiquitous, and Intelligent Computing. Lecture Notes in Electrical Engineering, vol 274. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40675-1_45
DOI: https://doi.org/10.1007/978-3-642-40675-1_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40674-4
Online ISBN: 978-3-642-40675-1
eBook Packages: Engineering (R0)