Abstract
According to the exascale computing roadmap, the dynamic nature of new-generation scientific problems calls for a review of the static management of computing resources. It is therefore necessary to provide a dynamic load balancing model that manages the system load efficiently. Distributed exascale systems are currently a promising solution for supporting scientific programs that make dynamic requests for resources. In this work, we propose a dynamic load balancing mechanism for distributed control of the load on the computing nodes. The presented method addresses the challenges posed by the dynamic behavior of next-generation problems. The proposed model considers many practical parameters, including the load transition and communication delay. We also propose a compensating factor to minimize the idle time of computing nodes, together with an optimized method for calculating it. We estimate the status of the nodes and calculate the exact portion of the load that should be transferred to achieve optimized load balancing. The evaluation results show significant performance improvements for the proposed load balancing compared with some earlier distributed load balancing mechanisms.
Cite this article
Mirtaheri, S.L., Grandinetti, L. Dynamic load balancing in distributed exascale computing systems. Cluster Comput 20, 3677–3689 (2017). https://doi.org/10.1007/s10586-017-0902-8
DOI: https://doi.org/10.1007/s10586-017-0902-8