A hybrid genetic algorithm for the design of water distribution networks

doi:10.1016/j.engappai.2004.10.001

Engineering Applications of Artificial Intelligence

Volume 18, Issue 4, June 2005, Pages 461-472

https://doi.org/10.1016/j.engappai.2004.10.001

Abstract

Genetic algorithms are currently one of the state-of-the-art techniques for the optimisation of engineering systems including water network design and rehabilitation. They are capable of finding near optimal cost solutions to these problems given certain cost and hydraulic parameters. However, many forms of genetic algorithms rely on random starting points that are often poor solutions and the problem of how to efficiently provide good initial estimates of solution sets automatically is still an ongoing research topic. This paper proposes a novel method, known as CANDA-GA, which uses a heuristic-based, local representative cellular automata approach to provide a good initial population for genetic algorithm runs. CANDA-GA is applied to three networks, one taken from the literature and two taken from industry. The results show that the proposed method consistently outperforms the conventional non-heuristic-based GA approach in terms of producing more economically designed water distribution networks.

Introduction

The problem of designing a water distribution network (WDN) to optimally meet performance and cost criteria is known to be NP-hard, and a large variety of computational algorithms have been devised for this task. In recent years, the genetic algorithm (GA) has proved to be one of the most popular algorithms in a variety of domains, including engineering optimisation problems and the design of WDNs. The application of GAs to WDN optimisation can be traced back to the mid-1990s (Dandy et al., 1996; Savic and Walters, 1997) and, whilst there are now many more variants of the algorithm than there were then, it remains a vital tool for WDN optimisation. It can therefore be said with some confidence that GAs represent a state-of-the-art approach to WDN optimisation. Conventional GAs usually begin the optimisation process by randomly generating a solution set, evaluating each solution's performance on the problem and then selecting the best for entry into the next generation. Selecting random solutions is an intuitive way of generating unbiased solutions when the algorithm has no prior information on the search space.

More recently, a number of researchers (Neppalli et al., 1996; Harik and Goldberg, 2000; Liaw, 2000; Hopper and Turton, 2001; Yang et al., 2002) have found that if prior knowledge exists or can be generated at a low computational cost, seeding GAs with good initial estimates may generate better solutions with faster convergence. The seeding of a GA with good solutions is not a new idea: Grefenstette (1987) discussed methods and demonstrated the value of incorporating problem-specific knowledge into the GA mechanism, including seeding the population. Louis (1997) found that seeding the GA population with known good solutions from case-based reasoning was a feasible approach. The scheme was implemented for the open-shop re-scheduling problem, where the seeded GA consistently outperformed a randomly seeded GA. Oman and Cunningham (2001) experimented with seeding for the travelling salesman problem (TSP) and the job-shop scheduling problem (JSSP), two benchmark tasks for evolutionary algorithms. They placed known good solutions in the initial population of the GA and found that the results were significantly improved on the TSP but not on the JSSP. Interestingly, they varied the percentage of seeded individuals from 25% to 75% and the results for each were remarkably similar, although the authors do point out that a 100% seed was not very successful on either problem. The authors also found that the best-quoted results on the TSP were discovered when the seeded solutions incorporated some heuristic element, as well as information from the overall problem definition. It follows that a heuristic-based approach to seeding a GA should yield performance enhancements on difficult problems, and this forms the basis of the proposed approach.

In recent years, a class of algorithms called cellular automata (CAs) has emerged and been widely applied to distributed computing and spatially distributed problems, such as the simulation of physical systems, traffic flows (Emmerich and Rank, 1995, 1997) and a variety of other applications (Bandani et al., 2001; Toffoli and Margolus, 1987). In a recent publication (Keedwell and Khu, 2004), we described the use of a CA-inspired approach (Cellular Automaton for Network Design Algorithm, CANDA) for the design of water distribution systems, the results of which showed that CANDA can provide good solutions while requiring only a very small number of network simulations. However, the CANDA approach alone is not necessarily amenable to being used in the traditional optimisation domain, as it does not use typical performance metrics throughout the algorithm run. Without these metrics, it is difficult to direct the search procedure beyond subtly manipulating the rule set of the algorithm. In addition, WDN designers typically face a limit on the time that can be spent on design. A pragmatic design usually involves running the network simulator a limited number of times, especially for large networks involving thousands of pipes as decision variables. GAs, whilst efficient, usually require many thousands of network simulations in an optimisation. Hence, the difficulty of the WDN design problem is to balance the number of network simulations against the quality of the design solutions. In light of these facts, we propose a combined CA and GA approach (herein known as CANDA-GA) to address this problem.

To help readers understand the problem faced by WDN designers, a brief description of WDN design is outlined in the next section, followed by short descriptions of the GA and CA. The concept of seeding is then discussed, followed by the proposed CANDA-GA algorithm. The results of applying CANDA-GA to the two-loop network (Alperovits and Shamir, 1977) and to two real-world design problems show the advantages of this approach over solely using a GA. This is followed by a general discussion of the results and conclusions.

A water distribution system typically consists of an array of pipes, pumps, valves and other appurtenances. The flows through a water distribution system are governed by complex, non-linear, non-convex and discontinuous hydraulic equations. Water distribution systems can be modelled and simulated through the combined use of the conservation of flow and energy equations. The conservation of energy equations applied to each independent loop of the water distribution system thus constitute a system of non-linear equations.

Assuming water is incompressible, the general expression for the conservation of flow at each node in the network is (Mays and Tung, 1992) $\sum Q_{in} - \sum Q_{out} = Q_{external},$ where $Q_{in}$ and $Q_{out}$ are the pipe flows into and out of the node, respectively, and $Q_{external}$ is the external demand or supply at the node.

The conservation of energy equation is required for each loop in the network as given by $\sum h_{L} - \sum H_{pump} = 0,$ where $h_{L}$ is the head loss in a pipe of the loop and $H_{pump}$ the head added by a pump in the loop. Head loss can be related to flow using the expression $h_{L} = K Q^{n},$ where $K$ is the head loss coefficient, $Q$ the flow, and $n$ the exponent.

The Darcy–Weisbach equation, the Hazen–Williams and Manning empirical equations may be used for computing the friction head losses in pressure pipes which normally represent the most significant element in the determination of distribution of flow in pipe networks. Computational methods such as those of Hardy Cross and Newton–Raphson, and linear theory methods may be used for analysing flow in pipe networks.
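As an illustration of how the loop energy equation and the head loss relation $h_{L} = K Q^{n}$ combine in practice, the following is a minimal sketch of the Hardy Cross correction for a single loop with no pump. The function name, the sign convention (clockwise flow positive), the coefficients and the initial flows are illustrative assumptions rather than values from the paper; the exponent 1.852 is the usual Hazen–Williams value.

```python
def hardy_cross_single_loop(k, q, n=1.852, tol=1e-10, max_iter=100):
    """Balance one loop by the Hardy Cross method.

    k: head loss coefficients K of the pipes in the loop (assumed values).
    q: signed initial flows (clockwise positive) that already satisfy
       conservation of flow at every node.
    Returns the corrected signed flows.
    """
    q = list(q)
    for _ in range(max_iter):
        # Signed head loss around the loop: h_L = K * Q * |Q|^(n-1)
        loop_loss = sum(ki * qi * abs(qi) ** (n - 1) for ki, qi in zip(k, q))
        # Derivative of the loop head loss with respect to a common flow shift
        slope = n * sum(ki * abs(qi) ** (n - 1) for ki, qi in zip(k, q))
        if slope == 0.0:
            break
        dq = -loop_loss / slope
        # The same correction is applied to every pipe in the loop,
        # so nodal conservation of flow is preserved.
        q = [qi + dq for qi in q]
        if abs(dq) < tol:
            break
    return q


# Two identical parallel pipes carrying a total of 1.0 flow unit between
# two nodes; pipe 2 runs counter-clockwise in the loop, hence the minus sign.
flows = hardy_cross_single_loop(k=[2.0, 2.0], q=[0.7, -0.3])
```

Because both pipes have the same coefficient, the balanced solution splits the flow equally, which gives a simple sanity check on the iteration.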

As mentioned previously, a variety of computational algorithms exist for the optimum design and rehabilitation of WDNs. In an optimum design problem, the objective is to design a completely new network given a set of costs, demands and other requirements of the network. In the case of a rehabilitation problem, the objective is to propose alternatives or alterations to the existing network in order to meet new criteria that have arisen through its lifetime. In both cases, the algorithm decision variables are the sizes of pipes at a variety of locations in the network. The field of optimisation has primarily focussed on using new algorithms to improve a system by reducing the monetary outlay required to achieve the requisite properties of the network. The pipe layout, the node connectivity, demands of the system and minimum pressure head requirements are typically assumed to be known. Readers are referred to Rossman (1999) or Walski et al. (2001) for more information on WDN modelling, and Alperovits and Shamir (1977), Quindry et al. (1981) or Goulter (1992) on optimal design of WDNs.

The optimal design of WDNs is a formidable problem: it belongs to the class of NP-hard problems, for which a full enumeration of the search space by any rigorous algorithm is not practical. For instance, a network with 12 pipes and 8 potential pipe diameters has 8¹² possible pipe diameter combinations, which constitute the search space of the problem. Even for this very modest network, an exhaustive search algorithm would need a considerable amount of time to navigate the entire search space of 68,719,476,736 potential solutions. For this reason, there are many examples of algorithms passing from artificial intelligence to the optimisation domain. It is clear that more intelligent methods are required to solve these problems, and these have recently taken the form of GAs.
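The size of the search space quoted above can be checked with a short calculation. The sketch below also estimates how long a full enumeration would take; the rate of 1,000 network simulations per second is a purely hypothetical figure chosen for illustration, and the function names are not from the paper.

```python
def search_space_size(n_pipes: int, n_diameters: int) -> int:
    """Number of designs when each pipe independently takes one diameter."""
    return n_diameters ** n_pipes


def enumeration_time_days(n_designs: int, sims_per_second: float) -> float:
    """Days needed to simulate every design at an assumed simulation rate."""
    return n_designs / sims_per_second / 86_400


size = search_space_size(n_pipes=12, n_diameters=8)
days = enumeration_time_days(size, sims_per_second=1_000.0)  # hypothetical rate

print(f"{size:,} designs, about {days:.0f} days to enumerate")
```

Under that assumed rate, even the 12-pipe example would take roughly 795 days to enumerate, and every additional pipe multiplies the figure by the number of available diameters.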

The history of the GA can be traced back to the mid-1970s and the work of Holland (1975), where GAs were introduced as models of evolution. Their popularity has steadily grown since then and there are now a large number of applications, even within engineering alone. There have been numerous advances in these algorithms which have benefited the field of optimisation, progressing from the early single-objective algorithm to multiple-objective algorithms (Fonseca and Fleming, 1995) that allow network designers a variety of options when designing a WDN. The GA approach uses a population of individual solutions that evolves from one generation to the next as the search progresses. The performance of each of the solutions is evaluated by an “objective function” which relates the solution variables to the problem at hand. Typically, in WDN optimisation problems, the solution decision variables make changes to the network, which is then simulated by a network simulator (such as EPANET; Rossman, 1999). To proceed from one generation to the next, the algorithm uses crossover and mutation operators to generate new solutions and a selection operator to choose which individuals survive into the next generation. By using these mechanisms, the GA is able to quickly traverse the search space whilst avoiding local minima and proceeding to a near-optimal answer. Despite their success, GAs are generally criticised in two main areas of their operation:

1. They find different answers to problems depending on their starting position in the search space. This is a problem with all stochastic algorithms as they use random starting points and variables during the optimisation, and therefore two optimisation runs with two random seeds are never the same.
2. They are population based and therefore require a large number of objective function evaluations to solve a problem. A typical GA run will use a population size of 100 and run for 1000 generations. Depending on the algorithm used, this can require up to 100,000 or more objective function evaluations (network simulations). This scale of the required computational effort may be large and impractical in many cases.

The proposed CANDA-GA approach goes some way to addressing these concerns by reducing the need for large numbers of generations and also reducing the variability of the GA with different random seeds.
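The generational loop and its evaluation cost can be made concrete with a minimal sketch of a GA using roulette wheel selection, one-point crossover and per-gene mutation. This is not the authors' implementation: the toy objective (the sum of the diameter indices, standing in for a network simulation), the inverse-cost weighting and the default rates are assumptions made so that the example is self-contained. The objective must return non-negative values for the selection weights to be valid.

```python
import random


def run_ga(objective, n_genes, n_options, pop_size=100, generations=1000,
           crossover_rate=0.9, mutation_rate=0.05, seed=1):
    """Minimise `objective` over lists of integers in [0, n_options)."""
    rng = random.Random(seed)
    evaluations = 0

    def evaluate(individual):
        nonlocal evaluations
        evaluations += 1  # in a WDN setting, one network simulation
        return objective(individual)

    # Generation 1: a purely random population
    population = [[rng.randrange(n_options) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    costs = [evaluate(ind) for ind in population]
    best_cost, best = min(zip(costs, population))

    for _ in range(generations - 1):
        # Roulette wheel: lower cost gives a larger selection weight
        weights = [1.0 / (1.0 + c) for c in costs]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            if n_genes > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Per-gene mutation: replace the gene with a random option
            child = [rng.randrange(n_options) if rng.random() < mutation_rate
                     else gene for gene in child]
            children.append(child)
        population = children
        costs = [evaluate(ind) for ind in population]
        gen_cost, gen_best = min(zip(costs, population))
        if gen_cost < best_cost:
            best_cost, best = gen_cost, gen_best
    return best, best_cost, evaluations


# Small run on a toy objective: 12 "pipes", 8 "diameters"
best, cost, evals = run_ga(sum, n_genes=12, n_options=8,
                           pop_size=20, generations=30)
```

The evaluation count is always `pop_size * generations`, so the typical setting of 100 individuals and 1000 generations corresponds to the 100,000 network simulations mentioned above. Changing `seed` changes the initial population and, in general, the final answer, which is the first criticism in the list.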

CANDA-GA makes use of a computational method known as CA (Von Neumann, 1966). A CA consists of an interconnected set of nodes (often in regular formation) that use a number of rules to update the state of every node according to the states of neighbouring nodes. These rules and states are normally dependent on the problem being solved, as is the size of the neighbourhood on which they operate. The neighbourhood defines how many surrounding nodes are taken into account before updating the state of the node in question. An important feature of the CA is that updates for every node are performed in parallel; therefore, in one iteration every node updates its state depending on those surrounding it in the previous iteration.
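The synchronous update described above can be sketched in a few lines. The ring topology, the binary states and the majority rule below are illustrative assumptions and are unrelated to the rules used by CANDA; the point is only that every new state is computed from the states of the previous iteration.

```python
def ca_step(states, neighbours, rule):
    """One synchronous CA iteration.

    All new states are computed from the old `states` list, so the order
    in which nodes are visited has no effect on the result.
    """
    return [rule(states[i], [states[j] for j in neighbours[i]])
            for i in range(len(states))]


def majority_rule(own_state, neighbour_states):
    """Illustrative rule: adopt the majority state of the neighbourhood."""
    votes = neighbour_states + [own_state]
    return 1 if 2 * sum(votes) > len(votes) else 0


# Six nodes arranged in a ring; each node sees its two adjacent nodes
neighbours = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
states = [1, 0, 1, 1, 0, 0]
next_states = ca_step(states, neighbours, majority_rule)
print(next_states)  # [0, 1, 1, 1, 0, 0]
```

The same `rule` function is applied to every node, and it only ever reads the node's own state and those of its listed neighbours.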

Traditional optimisation algorithms are driven by global performance, for instance, in the GA, the objective function determines the optimality of a solution compared to the others in the population. This is often determined in WDN optimisation as a combination of hydraulic and cost parameters; an example fitness function could be $fitness = a (TotalHeadDeficit) + b (Cost),$ where a and b are constant multipliers, TotalHeadDeficit yields a measure of the violation of the hydraulic criteria set for the problem and Cost is the monetary cost associated with the current solution. However, CAs do not have an objective function such as this and are concerned only with the execution of rules at a local level.
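A minimal sketch of such a fitness function is given below. The way `TotalHeadDeficit` is computed here (the sum of shortfalls below a minimum required head at each node) and the numerical values are assumptions for illustration; the paper excerpt does not specify them. Under this reading, smaller fitness values correspond to better designs.

```python
def total_head_deficit(heads, min_heads):
    """Sum of the shortfalls below the minimum required head at each node."""
    return sum(max(0.0, required - actual)
               for actual, required in zip(heads, min_heads))


def fitness(heads, min_heads, cost, a=1.0, b=1.0):
    """fitness = a * TotalHeadDeficit + b * Cost (lower is better here)."""
    return a * total_head_deficit(heads, min_heads) + b * cost


# Hypothetical simulator output for three nodes, each requiring 30 m of head
heads = [32.0, 28.5, 30.0]
min_heads = [30.0, 30.0, 30.0]
deficit = total_head_deficit(heads, min_heads)
value = fitness(heads, min_heads, cost=419_000.0, a=100_000.0, b=1.0)
```

The multiplier `a` is typically chosen large relative to `b` so that a hydraulically infeasible design is not preferred merely because it is cheap; the specific ratio used above is arbitrary.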

From an optimisation point of view, CAs possess three additional key properties in their execution:

1. Parallelism: Updates of each cell state are completed in parallel, and each of the changes to pipe diameters occurs at once. This factor is vital for optimising large WDNs as will be seen in later sections.
2. Localist representation: Determines that when a node is updated, its new state is based solely on the old state of the node and of those of its nearest neighbours. Localism is the mechanism by which parallelism can benefit performance in combinatorial problems such as this.
3. Homogeneity: Determines that each node is updated according to the same rules. This is important for treating each area of the WDN with the same degree of importance as any other. This homogeneity is also present in other algorithms such as GAs due to their lack of problem-specific knowledge.

In a previous publication (Keedwell and Khu, 2003), the CANDA approach was tested on the three WDNs also used in this paper: the two-loop network (Alperovits and Shamir, 1977) and two large real-world networks (Networks A and B) from the United Kingdom. The task for CANDA was to create an optimal design given a least-cost design as the starting point (where each of the pipe diameters is smallest). The search spaces involved are large, consisting of 8¹², 635²⁰ and 1277²⁰ possible pipe diameter combinations for the two-loop network, Network A and Network B, respectively. In experimentation, the CANDA approach did not produce a better ultimate solution when compared with the GA on smaller problems. However, very good approximate solutions were obtained for an extremely small number of network simulations (<1% of the GA network simulations). CANDA was therefore shown to be a quick approach to estimating a final solution set. To some degree, every optimisation algorithm, with the exception of a full enumeration, trades some optimality for a saving in computational time. The CANDA approach is no different, except that it takes this trade-off one step further than the GA: the results are not optimal, but the computation required is significantly less than that of other methods.

As described previously, the initial population influences the path that the GA will take to the near-optimal final solution, and it is therefore an important part of the algorithm. If a search algorithm can discover a set of initial solutions that are better than random ones, then the GA performance can be expected to increase. The difficulty with this approach is that the technique used to discover these solutions in the first instance must be more efficient than the GA for the seeding to be effective. The seeding of a GA therefore has to be completed with a minimal number of model evaluations whilst providing the greatest benefit to the algorithm. To accomplish this, a hybrid approach is proposed that utilises the initial power of CANDA without sacrificing the ability of the GA to find solutions that match or exceed the exact requirements of the optimisation. The major advantage that CANDA has over other search techniques is that the number of model evaluations incurred is very small, and it is therefore ideal for the seeding of an algorithm such as the GA. In this paper, we consider only the seeding of a GA, due to its population basis and because GAs currently represent a state-of-the-art method for designing WDNs. There is no practical reason, however, why CANDA could not be used to seed other search methods for this problem.
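The idea of seeding can be sketched as follows: a number of heuristic solutions are placed in the initial population and the remainder is filled with random individuals. The function name, the `seed_fraction` parameter and the example seeds are hypothetical; the excerpt does not state how CANDA-GA composes its initial population, so this should be read as one plausible scheme rather than the authors' method.

```python
import random


def seeded_population(seeds, pop_size, n_genes, n_options,
                      seed_fraction=0.5, rng=None):
    """Build an initial GA population from heuristic seeds plus random fill.

    seeds: candidate solutions from a cheap heuristic (for example a
           CA-style search), each a list of n_genes diameter indices.
    seed_fraction: assumed upper bound on the share of seeded individuals.
    """
    rng = rng or random.Random(1)
    n_seeded = min(len(seeds), int(seed_fraction * pop_size))
    population = [list(s) for s in seeds[:n_seeded]]
    # Fill the remainder with random individuals to retain diversity
    while len(population) < pop_size:
        population.append([rng.randrange(n_options) for _ in range(n_genes)])
    return population


# Three hypothetical heuristic designs for a 4-pipe, 8-diameter problem
heuristic_seeds = [[3, 3, 2, 1], [4, 3, 2, 1], [3, 2, 2, 1]]
population = seeded_population(heuristic_seeds, pop_size=10,
                               n_genes=4, n_options=8)
```

Keeping some random individuals alongside the seeds reflects the observation reported earlier that partial seeding performed well whereas a fully seeded population did not.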

Algorithm overview

A detailed explanation of the operation of the CANDA approach can be seen in Keedwell and Khu (2003). However, a short explanation of the heuristic CANDA method will be given here to aid understanding of the CANDA-GA approach.

The CANDA algorithm works by considering a WDN as a special form of CA; it cannot be definitively considered a CA because the nodes are connected by variable length and diameter links, and the arrangement of them is not regular. Despite this, nodes and links in CANDA

Experimentation

This section describes a set of experiments on three WDN design problems, one taken from the literature and two actual WDNs. The aim of the experiments is to compare a standard GA with the same algorithm in CANDA-GA. The experiments are run in exactly the same fashion for all networks, and are as follows:

1. The GA random seed is set to be one of five alternatives, either 1, 12, 123, 1234 or 12345.
2. A GA of population 100 (roulette wheel selection, one point crossover, 0.9 mutation and crossover

Discussion

The cellular automaton approach has been found to be a useful tool in its own right when considering the design of WDNs (Keedwell and Khu, 2004). However, whilst it finds reasonable solutions in exceptional model evaluation time, there will conceivably be scenarios where a result which meets a given specification is required. The CANDA-GA approach combines the best of the two algorithms in that CANDA is used to find globally good solutions in the first instance and then the GA is used to

Conclusions

A novel CA-based approach to the seeding of GAs for WDN design optimisation problems has been described. The approach uses a combination of CA and GA technologies to yield improved solutions for the lifetime of the optimisation up to 100,000 generations. The drawbacks of using this approach are few, and the benefits are such that optimisation runs can either be made shorter to achieve a given goal or discover better results in a fixed timeframe. This principle has been shown to work using

Acknowledgements

This research was funded by a UK EPSRC Grant GR/R73393.

The authors wish to thank Godfrey Walters for provision of the industry networks, and Prasad Tumula for his assistance in implementing them.

References (25)

H. Emmerich et al. An improved cellular automaton model for traffic flow simulation. Physica A (1997).

G.R. Harik et al. Linkage learning through probabilistic expression. Computer Methods in Applied Mechanics and Engineering (2000).

E. Hopper et al. An empirical investigation of meta-heuristic and heuristic algorithms for a 2D packing problem. European Journal of Operational Research (2001).

C.-F. Liaw. A hybrid genetic algorithm for the open shop scheduling problem. European Journal of Operational Research (2000).

S.J. Louis. Working from blueprints: evolutionary learning for design. Artificial Intelligence in Engineering (1997).

V.R. Neppalli et al. Genetic algorithms for the two stage bicriteria flowshop problem. European Journal of Operational Research (1996).

M. Yang et al. A hybrid genetic algorithm for the fitting of models to electrochemical impedance data. Journal of Electroanalytical Chemistry (2002).

E. Alperovits et al. Design of optimal water distribution systems. Water Resources Research (1977).

S. Bandani et al. Cellular automata: from a theoretical parallel computational model to its application to complex systems. Parallel Computing (2001).

G.C. Dandy et al. An improved genetic algorithm for pipe network optimization. Water Resources Research (1996).

Emmerich, H., Rank, E., 1995. Analyzing traffic flow by a cellular automaton. Proceedings of the EUROSIM 1995, Vienna,...

C.M. Fonseca et al. An overview of evolutionary algorithms in multiobjective optimisation. Evolutionary Computation (1995).
