Improved Time Complexity and Load Balance for DFS in Multiple NameNode

  • Conference paper
Proceedings of International Joint Conference on Computational Intelligence

Part of the book series: Algorithms for Intelligent Systems ((AIS))

Abstract

Apache Hadoop is a software framework developed by the open-source community. It supports storing and processing large-scale data sets on clusters of commodity hardware. HDFS (Hadoop Distributed File System) is the principal distributed storage used by Hadoop applications. An HDFS cluster consists mainly of a NameNode and DataNodes. The NameNode manages the file system metadata, and the DataNodes store the actual data. Hadoop is scalable, fault tolerant, and easy to expand. The NameNode frequently becomes a bottleneck, particularly when handling a huge number of small files. To maximize efficiency, the NameNode keeps the complete HDFS metadata in main memory. With too many small files, the NameNode can run out of memory. In this paper, we present a solution that uses multiple NameNodes. Our solution has advantages over the existing one: it implements a load-balancing scheme, addresses the NameNode bottleneck problem, and reduces the average time required for reads and writes.
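The abstract does not describe how the metadata is divided among NameNodes, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical hash-based placement (`pick_namenode`) and the commonly quoted rule of thumb of roughly 150 bytes of NameNode heap per namespace object (file, directory, or block); both the function names and that constant are assumptions made for illustration.

```python
import hashlib
from collections import Counter

# Rule-of-thumb heap cost per namespace object (file, directory, or block).
# This is an assumed approximation, not a figure taken from the paper.
BYTES_PER_OBJECT = 150


def pick_namenode(path: str, num_namenodes: int) -> int:
    """Map a file path to a NameNode index using a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_namenodes


def metadata_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Estimate the heap used by the file objects and block objects of num_files files."""
    return num_files * (1 + blocks_per_file) * BYTES_PER_OBJECT


def distribute(paths, num_namenodes: int):
    """Return how many of the given paths are assigned to each NameNode."""
    counts = Counter(pick_namenode(p, num_namenodes) for p in paths)
    return [counts.get(i, 0) for i in range(num_namenodes)]


if __name__ == "__main__":
    # Many small files, each assumed to occupy a single block.
    paths = [f"/data/small/file_{i}.txt" for i in range(100_000)]
    for node, n in enumerate(distribute(paths, 4)):
        mb = metadata_bytes(n) / 1e6
        print(f"NameNode {node}: {n} files, ~{mb:.1f} MB of metadata")
```

Under these assumptions, 10 million single-block files cost about 3 GB of heap on a single NameNode, and about 750 MB each when spread evenly over four. A plain hash spreads the file count evenly but ignores directory locality and request rates, which a real load balancer would also have to consider.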

Corresponding author

Correspondence to Mohammad Nurul Islam.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Nurul Islam, M., Nasim Akhtar, M. (2020). Improved Time Complexity and Load Balance for DFS in Multiple NameNode. In: Uddin, M., Bansal, J. (eds) Proceedings of International Joint Conference on Computational Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-13-7564-4_55
