Join Operations to Enhance Performance in Hadoop MapReduce Environment

Pagadala, Pavan Kumar; Vikram, M.; Eswarawaka, Rajesh; Reddy, P Srinivasa

doi:10.1007/978-981-10-3156-4_51

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 516)

Abstract

Analyzing large datasets is gaining importance because of its wide variety of applications in parallel and distributed environments. The Hadoop environment gives programmers more flexibility in parallel computing. One of the advantages of Hadoop is query evaluation over large datasets, and join operations play a major role in that evaluation. This paper examines the earlier solutions, extends them, and recommends a new approach for implementing joins in Hadoop.

Author information

Authors and Affiliations

Bharat Institute of Engineering & Technology, JNTUH, Hyderabad, India
Pavan Kumar Pagadala & P Srinivasa Reddy
Sri Venkateswara College of Engineering, JNTUA, Anantapur, India
M. Vikram
Dayananda Sagar University, Bangalore, India
Rajesh Eswarawaka

Corresponding author

Correspondence to Pavan Kumar Pagadala.

Editor information

Editors and Affiliations

Dept. of Computer Sci. & Engg., Anil Neerukonda Inst. of Tech. & Sci., Vishakapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy
Shri Ramswaroop Memorial Group of Professional Colleges (SRMGPC), Lucknow, Uttar Pradesh, India
Vikrant Bhateja
SCIS, University of Hyderabad, Hyderabad, India
Siba K. Udgata
School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Prasant Kumar Pattnaik

About this paper

Cite this paper

Pagadala, P.K., Vikram, M., Eswarawaka, R., Reddy, P.S. (2017). Join Operations to Enhance Performance in Hadoop MapReduce Environment. In: Satapathy, S., Bhateja, V., Udgata, S., Pattnaik, P. (eds) Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications. Advances in Intelligent Systems and Computing, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-10-3156-4_51

DOI: https://doi.org/10.1007/978-981-10-3156-4_51
Published: 03 March 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3155-7
Online ISBN: 978-981-10-3156-4
eBook Packages: Engineering, Engineering (R0)
