Optimization of Hadoop Using Software-Internet Wide Area Remote Direct Memory Access Protocol and Unstructured Data Accelerator

  • Conference paper
Software Engineering in Intelligent Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 349)

Abstract

Over the last few years, data volumes have grown tremendously, and data analytics is therefore geared towards low-latency processing. Processing Big Data with traditional methodologies is neither cost-effective nor fast enough to meet these requirements. The existing socket-based communication (TCP/IP) used in Hadoop becomes a performance bottleneck when large amounts of data are transferred over a multi-gigabit network fabric. To meet the emerging demands, the underlying design should be modified to make use of the data centre's powerful hardware. The proposed project integrates Hadoop with remote direct memory access (RDMA). For data-intensive applications, network performance becomes a key component as the amount of data being stored and replicated to HDFS increases. RDMA is implemented on commodity hardware in software, namely Soft-iWARP (Software-Internet Wide Area RDMA Protocol). Hadoop employs a Java-based network transport stack on top of the JVM. The JVM adds significant overhead relative to the data processing capability of the native interfaces, which constrains the use of RDMA. A plug-in library for the data shuffling and merging part of Hadoop can take advantage of RDMA. An optimization of Hadoop's data shuffling stage can thus be implemented.

Author information

Corresponding author

Correspondence to V. Vejesh.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Vejesh, V., Nayar, G.R., Sathyadevan, S. (2015). Optimization of Hadoop Using Software-Internet Wide Area Remote Direct Memory Access Protocol and Unstructured Data Accelerator. In: Silhavy, R., Senkerik, R., Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Software Engineering in Intelligent Systems. Advances in Intelligent Systems and Computing, vol 349. Springer, Cham. https://doi.org/10.1007/978-3-319-18473-9_26

  • DOI: https://doi.org/10.1007/978-3-319-18473-9_26

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18472-2

  • Online ISBN: 978-3-319-18473-9

  • eBook Packages: Engineering, Engineering (R0)
