Improved Time Complexity and Load Balance for DFS in Multiple NameNode

Nurul Islam, Mohammad; Nasim Akhtar, Md.

doi:10.1007/978-981-13-7564-4_55

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Apache Hadoop is a software framework delivered by the open-source community. It supports storing and processing large-scale data sets on clusters of commodity hardware. HDFS (Hadoop Distributed File System) is the primary distributed storage used by Hadoop applications. An HDFS cluster consists mainly of a NameNode and DataNodes. The NameNode manages the file system metadata, and the DataNodes store the actual data. Hadoop is scalable, fault tolerant, and simple to expand. The NameNode frequently becomes a bottleneck, particularly when handling a huge number of small files. To maximize efficiency, the NameNode keeps the complete HDFS metadata in main memory. With too many small files, the NameNode can run out of memory. In this paper, we present a solution that uses multiple NameNodes. Our solution offers advantages over the existing one: we implement a load-balancing scheme, address the NameNode bottleneck, and reduce the average time required for reads and writes.
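The memory pressure described in the abstract can be illustrated with a rough back-of-the-envelope sketch. This is not the authors' method. It only shows why the same volume of data costs far more NameNode memory when stored as small files. The per-object cost of 150 bytes and the 128 MiB block size are assumptions (a commonly quoted rule of thumb and a typical default), and the function name `namenode_heap_bytes` is hypothetical.

```python
import math

# Assumption: rule-of-thumb metadata cost per file or block object.
# This figure is illustrative and does not come from the paper.
BYTES_PER_OBJECT = 150
# Assumption: HDFS block size of 128 MiB.
BLOCK_SIZE = 128 * 1024 * 1024


def namenode_heap_bytes(num_files: int, file_size: int) -> int:
    """Estimate NameNode memory for num_files files of file_size bytes each."""
    # Every file occupies at least one block, however small it is.
    blocks_per_file = max(1, math.ceil(file_size / BLOCK_SIZE))
    # One metadata object for the file itself plus one per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


total = 1024 ** 4               # 1 TiB of data in both scenarios
small_file = 100 * 1024         # stored as 100 KiB files
small = namenode_heap_bytes(total // small_file, small_file)
large = namenode_heap_bytes(total // BLOCK_SIZE, BLOCK_SIZE)

print(f"small files: {small / 1024**2:.1f} MiB of metadata")
print(f"block-sized files: {large / 1024**2:.1f} MiB of metadata")
```

Under these assumptions the metadata footprint grows with the number of files rather than with the amount of data, which is why spreading metadata over several NameNodes is one way to relieve a single node.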

Author information

Authors and Affiliations

Dhaka University of Engineering and Technology, Gazipur, Dhaka, Bangladesh
Mohammad Nurul Islam & Md. Nasim Akhtar

Corresponding author

Correspondence to Mohammad Nurul Islam.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
Mohammad Shorif Uddin
Department of Mathematics, South Asian University, New Delhi, Delhi, India
Jagdish Chand Bansal

About this paper

Cite this paper

Nurul Islam, M., Nasim Akhtar, M. (2020). Improved Time Complexity and Load Balance for DFS in Multiple NameNode. In: Uddin, M., Bansal, J. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-13-7564-4_55

DOI: https://doi.org/10.1007/978-981-13-7564-4_55
Published: 04 July 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7563-7
Online ISBN: 978-981-13-7564-4
eBook Packages: Intelligent Technologies and Robotics (R0)