A geometric semantic genetic programming system for the electoral redistricting problem
Introduction
The zone design problem (also known as redistricting) is the process of dividing a geographic region made up of spatial units into smaller subregions or districts. Probably the best-known instance of the zone design problem is the electoral redistricting problem. As reported by Bação et al. [1], electoral redistricting consists in partitioning areal units, generally administrative units, into a predetermined number of zones (districts) such that the units in each zone are contiguous, each zone is geographically compact, and the total population of the areal units in each district is as similar as possible across all districts or lies within a predetermined range [2]. Because the constraints are spatial in nature, redistricting is usually seen as a type of spatial clustering. Because it is NP-complete, the electoral redistricting problem is considered computationally hard, and heuristic techniques seem to provide the best solutions. In this paper, we propose the use of Genetic Programming (GP) [5] to address the electoral redistricting problem. In particular, we use recently defined genetic operators, called geometric semantic genetic operators [11], that allow us to integrate semantic awareness into the evolutionary process.
One of the strongest points of geometric semantic operators [11] is that, by using semantic information, they induce by construction a unimodal fitness landscape on all problems that consist in matching sets of input data onto known targets (supervised learning problems such as regression or classification). Geometric semantic operators have so far been used on many different symbolic regression problems (including several complex real-life applications [8], [19], [20], [21]), generally with excellent results, and the unimodality of the fitness landscape is often used as an argument to justify those results [8], [19], [20], [21]. In this paper, for the first time, we apply geometric semantic operators to a problem that does not belong to that class: the objective of the application studied here is not to match sets of input data onto known targets. Thus, there is no reason to believe that the fitness landscape induced by those operators is unimodal in our case. Interestingly, the results we present show that geometric semantic operators have a beneficial effect on the search process in this case as well. This suggests that geometric semantic operators may be useful not only in single-objective supervised learning problems but also in a larger class of problems, and that the justification for their effectiveness may go beyond the fact that they can induce unimodal fitness landscapes. This issue deserves further investigation.
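To make the supervised-learning argument above concrete, the following is a minimal Python sketch of geometric semantic crossover and mutation in the form commonly given for real-valued programs. It is not the implementation used in the proposed system: the helper names (`semantics`, `gs_crossover`, `gs_mutation`, `random_unit_function`), the toy parents, the inputs and the mutation step `ms` are all illustrative assumptions, and programs are written as plain Python functions rather than trees.

```python
import random

def semantics(program, inputs):
    """Semantics of a program: the vector of its outputs on the given inputs."""
    return [program(x) for x in inputs]

def gs_crossover(p1, p2, r):
    """Offspring computes r(x)*p1(x) + (1 - r(x))*p2(x), with r(x) in [0, 1]."""
    return lambda x: r(x) * p1(x) + (1.0 - r(x)) * p2(x)

def gs_mutation(p, r1, r2, ms):
    """Offspring computes p(x) + ms*(r1(x) - r2(x)), a bounded semantic perturbation."""
    return lambda x: p(x) + ms * (r1(x) - r2(x))

def random_unit_function(rng):
    """Hypothetical stand-in for a random tree with outputs in [0, 1]: a constant."""
    c = rng.random()
    return lambda x: c

rng = random.Random(0)
inputs = [0.0, 1.0, 2.0, 3.0]          # toy inputs (assumption)
parent_a = lambda x: 2.0 * x           # toy parent (assumption)
parent_b = lambda x: x * x             # toy parent (assumption)

child = gs_crossover(parent_a, parent_b, random_unit_function(rng))
sem_a, sem_b, sem_c = (semantics(p, inputs) for p in (parent_a, parent_b, child))

# Each offspring output lies between the corresponding parent outputs
# (a small tolerance absorbs floating-point rounding).
eps = 1e-9
between = all(min(a, b) - eps <= c <= max(a, b) + eps
              for a, b, c in zip(sem_a, sem_b, sem_c))

mutant = gs_mutation(parent_a, random_unit_function(rng),
                     random_unit_function(rng), ms=0.1)
sem_m = semantics(mutant, inputs)
max_shift = max(abs(m - a) for a, m in zip(sem_a, sem_m))  # at most ms
```

Because the random function in this sketch is a constant, the child's semantics lies on the segment joining the parents' semantics, so its distance to any fixed target vector cannot exceed that of the worse parent. That property relies on a known target, which is exactly what the redistricting problem lacks.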
The paper is organized as follows: Section 2 describes the redistricting problem and its constraints; Section 3 presents the standard GP algorithm and the canonical (syntax-based) genetic operators; Section 4 defines the geometric semantic operators used in this paper; Section 5 describes the representation of the candidate solutions, the fitness function and how the geometric semantic operators have been used in this work; Section 6 presents the experimental settings and the obtained results, together with a comparison between the proposed framework and several existing redistricting techniques. Finally, Section 7 concludes the paper and suggests ideas for possible future research.
Section snippets
Redistricting problem: Definition and constraints
In redistricting problems, the aim is to aggregate n geo-spatial regions into c partitions (or districts) subject to some constraints. The best-known application is the electoral redistricting problem, where the objective is to create districts, usually by grouping smaller administrative units.
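As an illustration of two of the requirements named in the Introduction (population balance and contiguity), the sketch below checks them for a candidate plan. Everything in it is a hypothetical example: the function names, the toy 2 x 3 grid, its adjacency lists and its population figures are made up, and compactness is left out because it depends on the geometry of the units.

```python
from collections import deque

def population_deviation(assignment, population, c):
    """Largest relative deviation of a district population from the ideal (total / c)."""
    totals = [0.0] * c
    for unit, district in assignment.items():
        totals[district] += population[unit]
    ideal = sum(totals) / c
    return max(abs(t - ideal) / ideal for t in totals)

def is_contiguous(assignment, adjacency, district):
    """True if the units of `district` form one connected piece (breadth-first search)."""
    members = {u for u, d in assignment.items() if d == district}
    if not members:
        return False
    start = next(iter(members))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v in members and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == members

# Toy instance (made up): six units on a 2 x 3 grid, two districts.
#   0 1 2
#   3 4 5
adjacency = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
             3: [0, 4], 4: [1, 3, 5], 5: [2, 4]}
population = {0: 100, 1: 120, 2: 80, 3: 90, 4: 110, 5: 100}

good_plan = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}  # both districts connected
bad_plan = {0: 0, 2: 0, 4: 0, 1: 1, 3: 1, 5: 1}   # both districts fragmented

deviation = population_deviation(good_plan, population, c=2)
```

In the toy instance the two districts of `good_plan` hold 310 and 290 people against an ideal of 300, so the plan is both contiguous and close to balanced, whereas `bad_plan` fails the contiguity check.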
The constraints that define a “good” electoral redistricting plan are as follows: 1. All the districts should be equal in population, 2. Each district should be a single
Genetic programming
In this work, the electoral redistricting problem has been addressed using a GP-based system. GP [5] is one of the youngest paradigms in the area of computational intelligence research called Evolutionary Computation (EC) and consists in the automated learning of computer programs by means of a process that mimics Darwinian evolution. GP evolves computer programs, traditionally represented as tree structures. Trees represent candidate solutions for the problem at hand and they can be easily
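The tree representation and a canonical (syntax-based) operator can be sketched as follows. This is a minimal illustration, not the system described in the paper: trees are encoded as nested tuples, and the function set, the variable names `x` and `y`, and all helper names are assumptions chosen for brevity.

```python
import operator
import random

# Hypothetical function set for the sketch.
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(tree, env):
    """Evaluate a tree: a tuple (op, left, right), a variable name, or a constant."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a sequence of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from subtrees(tree[i], path + (i,))

def replace_at(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(node)

def subtree_crossover(parent_a, parent_b, rng):
    """Graft a randomly chosen subtree of parent_b onto a random node of parent_a."""
    path, _ = rng.choice(list(subtrees(parent_a)))
    _, donor = rng.choice(list(subtrees(parent_b)))
    return replace_at(parent_a, path, donor)

parent_a = ("+", ("*", "x", "x"), 1.0)   # x*x + 1
parent_b = ("-", "y", ("*", 2.0, "x"))   # y - 2*x
child = subtree_crossover(parent_a, parent_b, random.Random(1))
value = evaluate(child, {"x": 1.0, "y": 2.0})
```

Subtree crossover manipulates only the syntax of the parents, so nothing can be said in advance about how the child's outputs relate to theirs. This is the limitation that semantics-based operators are designed to address.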
Geometric semantic operators
Recent research in GP has been dedicated to an aspect that was only marginally considered until a few years ago: the definition of methods based on the semantics of the solutions [7], [8], [9], [10]. Although there is no universally accepted definition of semantics in GP, this term often refers to the behavior of a program once it is executed on a set of data. For this reason, in many references, including here, the term semantics refers to the vector of outputs a program produces on the
Methodology
In this section, we describe how a candidate solution is represented in the proposed GP system. Similar to what has been done in previous works [15], [16], in our system an individual is formed by c trees, each of them associated with one of the clusters that are to be formed. The i-th individual of the population is denoted as I_i and each of its trees as T_ik. The output of I_i is a vector containing the outputs of all its trees, , where denotes the vector of
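The multi-tree representation described above can be sketched as follows. Only the structure (an individual made of c trees whose output collects the outputs of all its trees) comes from the text. The rule that assigns each unit to the cluster whose tree gives the highest output is an assumption introduced for illustration, as are the toy units, the two hand-written "trees" and all function names.

```python
def individual_output(trees, units):
    """Output of an individual: one output vector per tree (c vectors of length n)."""
    return [[tree(u) for u in units] for tree in trees]

def assign_units(outputs):
    """ASSUMPTION (not from the paper): a unit joins the cluster whose tree scores highest."""
    n = len(outputs[0])
    return [max(range(len(outputs)), key=lambda k: outputs[k][j]) for j in range(n)]

# Hypothetical toy data: units described by (x, y) centroid coordinates.
units = [(0.0, 0.0), (1.0, 0.0), (4.0, 4.0), (5.0, 4.0)]

# Hypothetical individual with c = 2 trees, written as plain functions for brevity.
trees = [
    lambda u: -(u[0] ** 2 + u[1] ** 2),                  # favors units near (0, 0)
    lambda u: -((u[0] - 5.0) ** 2 + (u[1] - 4.0) ** 2),  # favors units near (5, 4)
]

outputs = individual_output(trees, units)
clusters = assign_units(outputs)
```

With these toy trees the first two units fall into cluster 0 and the last two into cluster 1. A plan obtained this way would still have to be scored against the redistricting constraints by the fitness function.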
Experimental settings and results
In this section, we experimentally validate the proposed semantic GP-based technique. In particular, we apply the proposed algorithm to the 10 instances of the redistricting problem described in Table 1. The data come from the US census tract data set (year 2000). In more detail, we compare the performance achieved using the proposed system against the following methods: graph partitioning algorithm (Graph) [18], simulated annealing redistricting algorithm (SARA) [17], genetic algorithm (GA) for
Conclusions
In this work, a Genetic Programming system has been proposed for addressing the electoral redistricting problem. The proposed system uses recently defined geometric semantic genetic operators to improve the search process. Even though those operators had originally been introduced for supervised learning problems, consisting in matching input data onto known targets, where it is possible to show that they always induce a unimodal fitness landscape, some considerations about the characteristics
Acknowledgements
The authors acknowledge projects EnviGP (PTDC/EIA-CCO/103363/2008), MassGP (PTDC/EEI-CTP/2975/2012) and InteleGen (PTDC/DTP-FTO/1747/2012), FCT, Portugal.
References (21)
- et al., A tabu search heuristic and adaptive memory procedure for political districting, Eur. J. Oper. Res. (2003)
- et al., Prediction of high performance concrete strength using genetic programming with geometric semantic genetic operators, Expert Syst. Appl. (2013)
- et al., Prediction of the Unified Parkinson's Disease Rating Scale assessment using a genetic programming system with geometric semantic genetic operators, Expert Syst. Appl. (2014)
- et al., Applying genetic algorithms to zone design, Soft Comput. (2005)
- Is automation the answer? The computational complexity of automated redistricting, Rutgers Comput. Law Technol. J. (1997)
- R. Poli, W.B. Langdon, N.F. McPhee, A field guide to genetic programming, 2008....
- Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
- J.R. Koza, Introduction to genetic programming tutorial: from the basics to human-competitive results, in: Proceedings...
- L. Beadle, C. Johnson, Semantically driven crossover in genetic programming, in: J. Wang (Ed.), Proceedings of the IEEE...
- L. Vanneschi, M. Castelli, L. Manzoni, S. Silva, A new implementation of geometric semantic GP and its application to...
Mauro Castelli received the Master's degree (Laurea) in computer science from the University of Milano Bicocca, Milan, Italy, in 2008 (“summa cum laude”), and the Ph.D. degree from the University of Milano Bicocca in 2012. His Ph.D. thesis presented contributions in the field of evolutionary computation and, in particular, genetic programming.
From February 2012 to February 2013, he had a Postdoctoral position at INESC-ID, Lisbon, Portugal. Since March 2013, he has been an Invited Assistant Professor with ISEGI, Universidade Nova de Lisboa, Lisbon, Portugal.
His main research interests are in the field of artificial intelligence (in particular, evolutionary computation and genetic programming) and in the application of machine-learning techniques to solve complex real-life problems, especially in the field of biology and medicine.
Roberto Henriques is a Visiting Assistant Professor at ISEGI, Universidade Nova de Lisboa. He is a director of the Degree in Systems and Information Technology and a researcher at the Centre for Studies in Information Management (CEGI). He holds a Ph.D. in Information Management from Universidade Nova de Lisboa, a master's degree in Science and Geographic Information Systems from ISEGI-NOVA and a degree in Biophysics Engineering from the University of Évora. His research focuses on the analysis of geospatial data using Artificial Intelligence and Data Mining techniques.
Leonardo Vanneschi received the Master's degree (“Laurea”) in computer science from the University of Pisa, Pisa, Italy, in 1996 (“summa cum laude”), and the Ph.D. degree from the University of Lausanne, Lausanne, Switzerland, in 2004. His Ph.D. thesis was honored with the excellence award of the Science Faculty of the University of Lausanne. Since September 2011, he has been an Assistant Professor with ISEGI, Universidade Nova de Lisboa, Lisbon, Portugal. His main research interests include computational intelligence and the study of complex systems, with particular focus on evolutionary computation and genetic programming. Dr. Vanneschi is an Associate Editor of a scientific journal in the area, and a member of the editorial board of two scientific journals and of the steering committee and program committee of various international conferences. As of February 2013, he had published 130 scientific contributions, nine of which have been honored with international awards.