Abstract
Efficient use of distributed computing depends on load balancing, which ensures that all available nodes and their shared resources are equally exploited. In large-scale systems such as volunteer computing platforms and desktop grids, centralized solutions may introduce performance bottlenecks and single points of failure. Accordingly, fully distributed alternatives have been considered because of their inherent robustness and reliability. In extremely dynamic contexts, scheduling middleware should adapt its job scheduling policies to the actual resource availability and cope with the volatility and heterogeneity typical of the underlying nodes. To deal with the dynamicity of a large pool of resources, self-organizing and adaptive solutions represent a promising research direction. Solutions based on bio-inspired methodologies are particularly suitable, as they inherently provide the desired features. In this paper, we present a fully distributed load balancing mechanism, called ozmos, which aims at increasing the efficiency of distributed computing systems through peer-to-peer interaction between nodes. The proposed algorithm is based on a Chord overlay and employs ant-like agents to spread information about the current load on each node, to reschedule tasks from overloaded nodes to underloaded ones, and to relocate incompatible tasks to suitable resources in heterogeneous grids. By means of several evaluation scenarios, we demonstrate the effectiveness of the proposed solution in achieving system-wide load balancing with both homogeneous and heterogeneous resources. In particular, we evaluate the load balancing performance of our approach, its scalability, and its communication efficiency.
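The general idea of agents that wander an overlay and move work from busy nodes to idle ones can be illustrated with a toy simulation. The sketch below is not the ozmos algorithm: the abstract does not specify its rules, so everything here is an assumption made for illustration. Nodes are plain ring positions rather than hashed Chord identifiers, each node's neighbours are the positions at distances 1, 2, 4, ... (a Chord-like finger pattern), and each hypothetical ant takes a fixed number of random hops, remembers the most and least loaded nodes it has seen, and moves a single task between them when their loads differ by at least two.

```python
import random


def fingers(i, n):
    """Chord-like neighbours of ring position i among n >= 2 nodes: i + 2^k (mod n)."""
    out = []
    step = 1
    while step < n:
        out.append((i + step) % n)
        step *= 2
    return out


def ant_walk(loads, start, hops, rng):
    """One hypothetical ant: follow `hops` random finger links, then move one task.

    The ant tracks the least and most loaded nodes it has visited. A task is
    moved only if the load difference is at least 2, so a move can never
    widen the gap between the global maximum and minimum.
    """
    n = len(loads)
    pos = start
    lo = hi = start
    for _ in range(hops):
        pos = rng.choice(fingers(pos, n))
        if loads[pos] < loads[lo]:
            lo = pos
        if loads[pos] > loads[hi]:
            hi = pos
    if loads[hi] - loads[lo] >= 2:
        loads[hi] -= 1
        loads[lo] += 1


def balance(loads, ants=2000, hops=8, seed=0):
    """Release `ants` ants from random start nodes; return the resulting loads."""
    rng = random.Random(seed)
    loads = list(loads)  # work on a copy, leave the caller's list untouched
    for _ in range(ants):
        ant_walk(loads, rng.randrange(len(loads)), hops, rng)
    return loads


def spread(loads):
    """Load imbalance measured as the gap between the busiest and idlest node."""
    return max(loads) - min(loads)


# Illustrative starting point: 16 nodes, all tasks concentrated on two of them.
initial = [40, 0, 0, 0, 0, 0, 0, 0, 12, 0, 0, 0, 0, 0, 0, 0]
final = balance(initial)
print("imbalance before:", spread(initial), "after:", spread(final))
```

Every decision in this sketch uses only what a single ant observed along its own path, with no global view of the system, which is the property that makes such schemes attractive when a central scheduler would be a bottleneck. The parameters `ants`, `hops`, and `seed` are arbitrary illustrative values.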
Acknowledgments
This research was carried out with the financial support of the Swiss National Science Foundation, scholarship no. 134285.
Cite this article
Brocco, A. The grid, the load and the gradient. Nat Comput 12, 69–85 (2013). https://doi.org/10.1007/s11047-012-9323-z
DOI: https://doi.org/10.1007/s11047-012-9323-z