Distributed meta-scheduling in lambda grids by means of Ant Colony Optimization
Introduction
Meta-scheduling is the process of scheduling applications across different sites by orchestrating the pool of resources managed by each local scheduler [1]. This local scheduler is commonly referred to as the Local Resource Management System (LRMS), since it manages the local resources. Those resources are made transparently available to users through network services, which are often supported by the commodity Internet and therefore receive only a best-effort transport service.
However, some data-intensive grid applications, such as large-scale scientific experiments, require a dedicated transport infrastructure with large bandwidth, strict levels of Quality of Service (QoS) and predictable times, which can be provided by wavelength-routed optical networks [1], [2]. When the computing resources of a grid are interconnected by an optical network that allows its applications or its meta-scheduler to request lightpaths on demand, the grid is commonly referred to as a lambda grid [1], [2].
To fulfill the task requests issued by grid applications, the grid meta-scheduler has to ensure that both computing and networking resources are available at the appropriate times by reserving those resources. Since a computing resource can be used only after the setup of a lightpath connecting it to the grid application is guaranteed, computing and networking resources have to be co-allocated and then reserved by the meta-scheduler [2].
Reservations of resources in grids typically fall into two categories: immediate reservations (IR) and advance reservations (AR) [2]. With an immediate reservation, the use of resources starts as soon as the request is admitted; with an advance reservation, the use of resources is deferred to a future time. Note that allowing for advance reservation in a grid environment improves the performance of the scheduling process [3]. However, advance reservations make the meta-scheduling process significantly more complex [4].
Ant Colony Optimization (ACO) [5] algorithms are a promising candidate for meta-scheduling in lambda grids. Inspired by the foraging behavior of natural ants, they are especially suited to hard combinatorial problems and to situations where distributed control is needed. In a lambda grid environment, an application requests lightpath connectivity to a computing resource that has yet to be discovered in the grid. In ACO-based algorithms, the discovery of computing resources is a by-product of the discovery of good routes by the artificial ants. Thus, the ants can gather both resource availability and routing state information on their trips through the lambda grid system. In other words, the ants allow grid and networking resources to be integrated at the control plane of the network, and the information they collect can be used for meta-scheduling and co-allocation of the lambda grid resources. The effectiveness of ACO-based algorithms has already been demonstrated for immediate reservations [6], but their support for advance reservations has remained an open research issue.
In this context, the contributions of this paper are threefold. We present an ACO-based framework for distributed meta-scheduling in lambda grids with support for distributed advance reservation and co-allocation of both computing and optical networking resources. We also present an aggregation mechanism for the information collected by the ants, which keeps their overhead in the lambda grid system low. In addition, we detail the use of an extended RSVP-TE signaling protocol [7], which has already been used for distributed reservation of resources in optical networks, to also reserve other grid resources and to support advance reservations.
Simulations are carried out to evaluate the performance of the ACO algorithm under different local and meta-scheduling policies, and different resource co-allocation algorithms. Moreover, a comparison with the immediate reservation case is provided to show the importance of supporting advance reservations in order to improve the performance of the scheduling process.
The remainder of the paper is organized as follows. First, in Section 2, we briefly introduce the motivation of this paper and discuss related work on: (i) ant algorithms for grid meta-scheduling, and (ii) advance reservation in optical networks and co-allocation of processing and networking resources for grid environments. In Section 3, we discuss the advance reservation model and, in Section 4, the meta-scheduling architecture used throughout this work. Then, in Section 5, we present our ACO framework for distributed meta-scheduling in advance with co-allocation of computing and optical networking resources. In Section 6, we detail the simulations carried out to evaluate our proposed approach for meta-scheduling in lambda grids. The results obtained through simulations are shown and discussed in Section 7. Finally, in Section 8, conclusions are drawn.
Section snippets
Motivation and related work
To the best of our knowledge, there is no other work in the literature with explicit advance reservation and resource co-allocation using ACO-based algorithms for grid meta-scheduling. Besides, all proposed mechanisms in this work are distributed: meta-scheduling, advance reservation and co-allocation of resources.
A complete solution for an advance reservation and resource co-allocation mechanism will need to address the challenges related to the control plane protocols, albeit they are often
Advance reservation model
Advance reservations fall into different types according to their specifications [2], [31], [32]. For instance, if the reservation specifies a starting time and a duration, it is called STSD (Specified starting Time, Specified Duration) [31] or fixed [32]. A variation of this type is STSD with flexible window [31] or first-fit/deadline [32], where a range of starting times is defined instead of a single starting time.
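The difference between a fixed STSD request and an STSD request with flexible window can be illustrated with a minimal sketch. It assumes time is divided into integer timeslots and that a resource keeps a list of already reserved half-open intervals; the names `StsdRequest`, `overlaps` and `earliest_start` are hypothetical and are not taken from the paper, and the linear scan is only one possible admission rule.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Half-open interval [start, end), measured in timeslots (an assumption of this sketch).
Interval = Tuple[int, int]


@dataclass
class StsdRequest:
    window_start: int  # earliest admissible starting time
    window_end: int    # latest admissible starting time (inclusive)
    duration: int      # specified duration, in timeslots


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when two half-open intervals share at least one timeslot."""
    return a[0] < b[1] and b[0] < a[1]


def earliest_start(request: StsdRequest, busy: List[Interval]) -> Optional[int]:
    """Return the first starting time in the window that avoids all busy intervals.

    A fixed STSD request is the special case window_start == window_end.
    Returns None when no starting time in the window is feasible.
    """
    for start in range(request.window_start, request.window_end + 1):
        candidate = (start, start + request.duration)
        if not any(overlaps(candidate, slot) for slot in busy):
            return start
    return None
```

With reserved intervals `[(0, 3), (5, 8)]`, a request with window 0 to 4 and duration 2 is admitted at timeslot 3, whereas the corresponding fixed request starting at timeslot 0 is rejected. The flexible window gives the scheduler room to fit the request into a gap.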
In this work, we consider STSD requests with flexible window, which
Meta-scheduling architecture
Meta-schedulers can be classified into three different models [3]. In the centralized model, the meta-scheduler is a central instance that has complete knowledge of the usage of the grid resources. In this case, the meta-scheduler has full control over each local scheduler of the grid. The hierarchical model is a variation of the centralized scheme, where the meta-scheduler is a central instance that communicates with other schedulers of its hierarchy. In distributed
Ant Colony Optimization (ACO)
ACO algorithms are based on artificial stigmergy [35], where artificial pheromone levels receive positive or negative feedback according to the solution quality seen by the ants. Since those levels contain information from previous solutions of the problem, they can be exploited collectively by the ants to improve the solution. Although the artificial ant is a simple, lightweight mobile agent [36], [37], stigmergy allows the ant colony to exhibit an emergent, self-organizing behavior [38],
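The positive and negative feedback on pheromone levels can be sketched as follows. This is an illustration only: it uses the AntNet-style reinforcement rule commonly found in ACO routing, which is an assumption here and not necessarily the update rule of the paper. The function name `reinforce`, the neighbor labels and the `reward` parameter are hypothetical.

```python
from typing import Dict


def reinforce(pheromone: Dict[str, float], chosen: str, reward: float) -> Dict[str, float]:
    """Apply one AntNet-style pheromone update at a node (illustrative assumption).

    pheromone maps each neighbor to a level, and the levels are assumed to sum to 1.
    The neighbor used by the ant receives positive feedback, while every other
    neighbor receives negative feedback, so the levels still sum to 1 afterwards.
    reward in [0, 1] stands for the solution quality seen by the ant.
    """
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must lie in [0, 1]")
    if chosen not in pheromone:
        raise KeyError(chosen)
    updated = {}
    for neighbor, level in pheromone.items():
        if neighbor == chosen:
            updated[neighbor] = level + reward * (1.0 - level)  # positive feedback
        else:
            updated[neighbor] = level - reward * level          # negative feedback
    return updated
```

Because each update builds on the levels left by earlier ants, the table accumulates information from previous solutions, which is the indirect communication that stigmergy refers to.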
Simulation
To evaluate the proposed algorithms, we used the NSFNet backbone network shown in Fig. 5, where the latencies between neighboring nodes are depicted at each link. It is a 14-node network with 21 bidirectional links and it is well-balanced [20], with an average shortest-path length between all pairs of nodes equal to 2.2 hops and a diameter equal to 3. The NSFNet network is a very common benchmark for assessing routing performance [1], [6], [20], [27], [28], [30], being a conservative
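The two topology metrics quoted above, average shortest-path length in hops and diameter, can be computed with a breadth-first search from every node. The sketch below is generic and assumes a connected, undirected graph given as an adjacency dictionary; the function names are hypothetical, and the NSFNet link list is not reproduced here because it is only given in Fig. 5.

```python
from collections import deque
from typing import Dict, List, Tuple


def hop_distances(adjacency: Dict[str, List[str]], source: str) -> Dict[str, int]:
    """Breadth-first search returning the hop count from source to each reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def path_metrics(adjacency: Dict[str, List[str]]) -> Tuple[float, int]:
    """Return (average shortest-path length, diameter) over all unordered node pairs.

    Assumes the graph is connected and has at least two nodes.
    """
    nodes = sorted(adjacency)
    lengths = []
    for index, source in enumerate(nodes):
        distance = hop_distances(adjacency, source)
        # Count each unordered pair once by only looking at later nodes.
        lengths.extend(distance[target] for target in nodes[index + 1:])
    return sum(lengths) / len(lengths), max(lengths)
```

As a sanity check on a toy topology rather than NSFNet, a six-node ring yields an average of 1.8 hops and a diameter of 3.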
Numerical results
We considered the following notation on the next figures, which takes the form: meta-scheduling policy (CA, LL or BADR)/local scheduling approach for the processing resources (EST or FF)—local scheduling approach for the networking resources (EST or FF). As already explained for the SF algorithm, since the EST and FF local scheduling approaches are equivalent for the networking resources, it is omitted for sake of clarity.
First of all, we evaluate the influence of the fixed timeslot duration (
Conclusions
In this work, we presented an ACO-based distributed meta-scheduler with distributed resource co-allocation and advance reservation support in lambda grids. We proposed three different resource co-allocation algorithms: Server First, Server First-Relaxed and Network First. We demonstrated that the best strategy is to allocate the networking resources first and then the processing resources, i.e., the Network First algorithm. However, the RSVP-TE signaling protocol has to be extended to
Acknowledgments
This work was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) (grant no. 2008/57857-2), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) (grant no. 574017/2008-9), and Instituto Nacional de Ciência e Tecnologia Fotônica para Comunicações Ópticas (FOTONICOM).
References (44)
Network-aware scheduling for real-time execution support in data-intensive optical grids. Future Gener. Comput. Syst. (2009)
Co-scheduling in lambda grid systems by means of ant colony optimization. Future Gener. Comput. Syst. (2009)
An ant algorithm for balanced job scheduling in grids. Future Gener. Comput. Syst. (2009)
MAX-MIN ant system. Future Gener. Comput. Syst. (2000)
Distributed job scheduling based on swarm intelligence: A survey. Comput. Electr. Eng. (2014)
Advance reservation, co-allocation and pricing of network and computational resources in grids. Future Gener. Comput. Syst. (2014)
Network-aware meta-scheduling in advance with autonomous self-tuning system. Future Gener. Comput. Syst. (2011)
From volunteer to trustable computing: Providing QoS-aware scheduling mechanisms for multi-grid computing environments. Future Gener. Comput. Syst. (2014)
Scheduling efficiency of resource information aggregation in grid networks. Future Gener. Comput. Syst. (2012)
A survey of advance reservation routing and wavelength assignment in wavelength-routed WDM networks. IEEE Commun. Surv. Tutor. (2012)
Ant Colony Optimization
An ant colony optimization approach to a grid workflow scheduling problem with various QoS requirements. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev.
Gustavo Sousa Pavani graduated from University of Campinas (UNICAMP) in 2001 with a degree in Computer Engineering. He received his M.Sc. degree and his Ph.D. degree in Electrical Engineering from UNICAMP, in 2003 and 2006, respectively. Currently, he is an associate professor at Universidade Federal do ABC (UFABC), Brazil. His research interests include routing algorithms for packet-switched and circuit-switched optical networks by means of ant colony optimization (ACO), the GMPLS control plane, and optical network support for grid and cloud architectures.
Rodrigo Izidoro Tinini graduated from Universidade Municipal de São Caetano do Sul (USCS) in 2011 with a degree in Computer Science. He received his M.Sc. degree in Computer Science from Federal University of ABC (UFABC) in 2014. Currently, he is a Ph.D. student at University of São Paulo (USP) with a scholarship awarded by Hewlett-Packard. His research interests include grid computing, optical networking and artificial intelligence.