A Load Balancing Method Based on Node Features in a Heterogeneous Hadoop Cluster

  • Conference paper
Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom 2017)

Abstract

Load balancing is a pressing problem in heterogeneous clusters. This paper proposes a load balancing method based on node features. The method first analyses the main indexes that determine node performance and then defines a formula that describes node performance in terms of the contributions of those indexes. Node performance is combined with node busy status to calculate a relative load value. From the relative load value of each node and the cluster storage utilization rate, a recommended storage utilization rate is calculated for each node. Finally, the balancer threshold is generated dynamically based on the cluster’s current disk load. Experimental results show that the proposed method provides a more reasonable equilibrium for heterogeneous clusters, improves efficiency and substantially reduces execution time.
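
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the index names (`cpu`, `memory`, `disk_io`), the weights, the linear form of `performance`, the ratio used for `relative_load`, the scaling in `recommended_utilization`, and the bounds in `dynamic_threshold` are all hypothetical and should not be read as the authors' definitions.

```python
from dataclasses import dataclass

# All weights, index names and formulas below are illustrative assumptions,
# not the definitions used in the paper.
INDEX_WEIGHTS = {"cpu": 0.4, "memory": 0.3, "disk_io": 0.3}


@dataclass
class Node:
    name: str
    cpu: float      # normalized to (0, 1], higher is faster
    memory: float   # normalized to (0, 1]
    disk_io: float  # normalized to (0, 1]
    busy: float     # fraction of time busy, in [0, 1]


def performance(node: Node) -> float:
    """Weighted sum of the performance indexes (assumed linear form)."""
    return (INDEX_WEIGHTS["cpu"] * node.cpu
            + INDEX_WEIGHTS["memory"] * node.memory
            + INDEX_WEIGHTS["disk_io"] * node.disk_io)


def relative_load(node: Node) -> float:
    """Busy status scaled by performance: a slow, busy node scores highest."""
    return node.busy / performance(node)


def recommended_utilization(nodes, cluster_utilization):
    """Per-node target storage utilization, lower for heavily loaded nodes.

    Targets are scaled so that their mean equals the cluster utilization,
    then capped at 1.0.
    """
    share = {n.name: 1.0 / (1.0 + relative_load(n)) for n in nodes}
    mean_share = sum(share.values()) / len(share)
    return {name: min(1.0, cluster_utilization * s / mean_share)
            for name, s in share.items()}


def dynamic_threshold(cluster_utilization, low=5.0, high=20.0):
    """Balancer threshold in percentage points, tighter on a fuller cluster."""
    return high - (high - low) * cluster_utilization


if __name__ == "__main__":
    nodes = [Node("fast", 1.0, 1.0, 1.0, 0.2),
             Node("slow", 0.5, 0.5, 0.5, 0.2)]
    print(recommended_utilization(nodes, 0.6))
    print(dynamic_threshold(0.6))
```

In this sketch, two equally busy nodes receive different storage targets because the slower one has a higher relative load. A value such as the one returned by `dynamic_threshold` could in principle be passed to the stock HDFS balancer through its `-threshold` option, which is expressed as a percentage; whether the paper applies its threshold this way is not stated in the abstract.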

Acknowledgements

This paper is supported by the National Natural Science Foundation of China under Grant No. 61502294, the Natural Science Foundation of Shanghai under Grant No. 15ZR1415200, the CERNET Innovation Project under Grant Nos. NGII20160210, NGII20160614 and NGII20160325, the Special Development Foundation of Key Project of Shanghai Zhangjiang National Innovation Demonstration Zone under Grant No. 201411-ZB-B204-012, and the Development Foundation for Cultural and Creative Industries of Shanghai under Grant No. 201610162.

Corresponding author

Correspondence to Huahu Xu.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Yang, P., Gao, H., Xu, H., Bian, M., Chu, D. (2018). A Load Balancing Method Based on Node Features in a Heterogeneous Hadoop Cluster. In: Romdhani, I., Shu, L., Takahiro, H., Zhou, Z., Gordon, T., Zeng, D. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 252. Springer, Cham. https://doi.org/10.1007/978-3-030-00916-8_32

  • DOI: https://doi.org/10.1007/978-3-030-00916-8_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00915-1

  • Online ISBN: 978-3-030-00916-8

  • eBook Packages: Computer Science, Computer Science (R0)
