Abstract
Load balancing in heterogeneous clusters is a pressing problem. This paper proposes a load balancing method based on node features. The method first analyses the main indexes that determine node performance, then defines a formula that describes node performance in terms of the contribution of each index. Node performance is combined with the node's busy status to calculate a relative load value. From the relative load value of each node and the storage utilization rate of the cluster, a recommended storage utilization rate is calculated for each node. Finally, the balancer threshold is generated dynamically from the current disk load of the cluster. Experimental results show that the proposed method yields a more reasonable balance for heterogeneous clusters, improves efficiency, and substantially reduces execution time.
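The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the index names and weights in `WEIGHTS`, the weighted-sum form of `performance`, the ratio used in `relative_load`, the inverse-load scaling in `recommended_utilization`, and the linear rule in `dynamic_threshold` are all hypothetical and should not be read as the authors' method.

```python
from dataclasses import dataclass

# Hypothetical index weights; the abstract names neither the indexes nor their contributions.
WEIGHTS = {"cpu": 0.4, "memory": 0.3, "disk_io": 0.3}
EPS = 1e-9  # keeps the inverse of the relative load finite for an idle node


@dataclass
class Node:
    name: str
    indexes: dict       # normalised index scores in (0, 1], higher is better
    busy: float         # busy status in [0, 1]
    capacity_gb: float
    used_gb: float


def performance(node):
    """Assumed form of the performance formula: weighted sum of index scores."""
    return sum(weight * node.indexes[key] for key, weight in WEIGHTS.items())


def relative_load(node):
    """Assumed form: busy status divided by node performance."""
    return node.busy / performance(node)


def recommended_utilization(nodes):
    """Assumed rule: give each node a share inversely proportional to its
    relative load, scaled so the capacity-weighted mean equals the cluster
    storage utilization rate (before clamping to 1.0)."""
    total_capacity = sum(n.capacity_gb for n in nodes)
    cluster_util = sum(n.used_gb for n in nodes) / total_capacity
    inverse = {n.name: 1.0 / (relative_load(n) + EPS) for n in nodes}
    scale = cluster_util * total_capacity / sum(
        inverse[n.name] * n.capacity_gb for n in nodes
    )
    return {n.name: min(1.0, scale * inverse[n.name]) for n in nodes}


def dynamic_threshold(cluster_util, base=10.0, floor=1.0):
    """Assumed rule: tighten the balancer threshold (in percent) as the
    cluster fills up, never dropping below a floor."""
    return max(floor, base * (1.0 - cluster_util))


nodes = [
    Node("fast", {"cpu": 1.0, "memory": 1.0, "disk_io": 1.0}, 0.2, 1000.0, 500.0),
    Node("slow", {"cpu": 0.5, "memory": 0.5, "disk_io": 0.5}, 0.4, 1000.0, 500.0),
]
recommended = recommended_utilization(nodes)
threshold = dynamic_threshold(0.5)
print(recommended, threshold)
```

With these illustrative inputs, the faster and less busy node is recommended a storage utilization of about 0.8 and the slower node about 0.2, while the capacity-weighted mean stays at the cluster rate of 0.5. The `base` value of 10.0 was chosen to match the 10 percent default of the HDFS balancer's `-threshold` option; the paper's own rule for deriving the threshold may differ.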
Acknowledgements
This paper is supported by the National Natural Science Foundation of China under Grant No. 61502294, the Natural Science Foundation of Shanghai under Grant No. 15ZR1415200, the CERNET Innovation Project under Grant Nos. NGII20160210, NGII20160614, and NGII20160325, the Special Development Foundation of Key Project of Shanghai Zhangjiang National Innovation Demonstration Zone under Grant No. 201411-ZB-B204-012, and the Development Foundation for Cultural and Creative Industries of Shanghai under Grant No. 201610162.
Copyright information
© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Yang, P., Gao, H., Xu, H., Bian, M., Chu, D. (2018). A Load Balancing Method Based on Node Features in a Heterogeneous Hadoop Cluster. In: Romdhani, I., Shu, L., Takahiro, H., Zhou, Z., Gordon, T., Zeng, D. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2017. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 252. Springer, Cham. https://doi.org/10.1007/978-3-030-00916-8_32
DOI: https://doi.org/10.1007/978-3-030-00916-8_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00915-1
Online ISBN: 978-3-030-00916-8
eBook Packages: Computer Science (R0)