Designing a classifier by a layered multi-population genetic programming approach
Introduction
Genetic programming (GP) [1], an important evolutionary computation (EC) technique, has developed rapidly in recent years. Researchers have proposed creative ideas to improve the effectiveness and efficiency of GP, such as new fitness functions, new architectures, and new individual expressions.
Traditionally, GP works with a single population. Multi-population GP (MGP) [3], [18], which employs several populations to discover optimal solutions, has been proposed and developed. Many different topologies of MGP have been proposed, such as the circle topology and the random topology. Fig. 1 shows the circle topology where circles stand for populations [3]. An important characteristic of MGP is migration. This means that individuals can be transmitted from one population to another. The arrows in Fig. 1 indicate the migration direction. Fernández et al. [18] performed several experiments with parallel and distributed GP (PADGP), isolated multi-population GP (IMGP), where “isolated” means that there is no migration between populations, and traditional single population GP. Their experiments show that PADGP and IMGP usually obtain better performance than traditional single population GP.
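The circle-topology migration described above can be sketched as follows. This is a minimal illustration, not the implementation used in the cited works: the list-of-lists population model and the `fitness` callback are assumptions. Each deme sends its best individual along the ring, where it replaces the receiving deme's worst member:

```python
def migrate_ring(populations, fitness):
    """Copy each population's best individual to the next population in
    the ring; the migrant replaces the target population's worst member."""
    bests = [max(pop, key=fitness) for pop in populations]  # select migrants first
    n = len(populations)
    for i, migrant in enumerate(bests):
        target = populations[(i + 1) % n]  # the arrow to the next deme in the circle
        worst = min(range(len(target)), key=lambda j: fitness(target[j]))
        target[worst] = migrant            # migrant replaces the worst individual
    return populations

# toy demo: individuals are plain numbers and fitness is the value itself
demes = [[1, 2], [3, 4], [5, 6]]
migrate_ring(demes, fitness=lambda x: x)   # demes is now [[6, 2], [2, 4], [4, 6]]
```

Selecting all migrants before performing any replacement keeps the migration symmetric: a deme forwards its own best individual, not a migrant it just received.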
Many classifiers based on GP have been developed in recent years [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [19], [21]. To generate classification rules, Freitas [6] proposed the tuple-set-descriptor (TSD), a logical formula representing an individual. Kotani and Sherrah [9], [13] used GP to perform feature selection before applying other classification methods. Multi-category classification problems are more difficult than two-class problems. Kishore et al. [7] and the present authors [4] treated such a problem as multiple two-class classification problems and generated a corresponding expression or discriminant function for each; these methods need k runs for a k-class classification problem. Muni et al. [12] proposed a novel method that solves a k-class problem in a single run: each individual is represented by a multi-tree, so evolving one individual is equivalent to evolving k trees simultaneously. Loveard and Ciesielski [11] proposed five methods for solving multi-category classification problems: binary decomposition, static range selection, dynamic range selection, class enumeration, and evidence enumeration. Brameier and Banzhaf [3] used linear GP and MGP techniques; individuals are represented as strings and can be transmitted between demes, i.e. subpopulations, according to their fitness values. Tsakonas [21] compared four different GP-evolved structures on several classification problems.
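The k-run decomposition used in Refs. [4], [7] (one two-class problem per class) can be sketched as follows. This is an illustrative sketch only: `fit_binary` stands in for a single GP run, and the nearest-centroid scorer is a hypothetical placeholder for an evolved discriminant function:

```python
def one_vs_rest(instances, labels, classes, fit_binary):
    """Turn a k-class problem into k two-class problems: for each class c,
    relabel the data as c / not-c and fit one discriminant per class."""
    return {c: fit_binary(instances, [1 if y == c else 0 for y in labels])
            for c in classes}

def classify(x, discriminants):
    # assign the class whose discriminant function responds most strongly
    return max(discriminants, key=lambda c: discriminants[c](x))

# illustrative stand-in for one GP run: score by distance to the positive centroid
def fit_binary(instances, binary_labels):
    pos = [x for x, y in zip(instances, binary_labels) if y == 1]
    centroid = sum(pos) / len(pos)                 # 1-D toy features
    return lambda x, c=centroid: -abs(x - c)

d = one_vs_rest([1.0, 1.2, 5.0, 5.2], ['a', 'a', 'b', 'b'], ['a', 'b'], fit_binary)
# classify(1.0, d) -> 'a', classify(5.2, d) -> 'b'
```

The k separate fits are exactly why these methods need k runs; the multi-tree approach of Muni et al. [12] folds all k discriminants into one individual instead.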
Using functional expressions to represent individuals is effective in GP [4], [7], [10]. The tree structure is a common data structure for functional expressions. However, two problems occur when GP is employed to generate functional expressions. First, it is difficult to choose appropriate operations for a given problem because the characteristics of the problem are completely unknown. If the operation set contains many operations, there is a greater possibility of discovering optimal solutions, but the search space becomes larger and the search may therefore become impracticable. Fortunately, as shown in Ref. [7], GP with an operation set comprising only the basic arithmetic operations, i.e. {+, −, ×, ÷}, generates results comparable to those obtained with an operation set comprising additional operations. Second, it is difficult to know the proper length of an individual because there is no prior knowledge about optimal solutions. The predefined individual length, such as the length of a string-expression individual or the number of available nodes of a tree-expression individual, is usually chosen according to heuristic or empirical assumptions. The following is an example of a classification problem with 64-dimensional data, i.e. a training instance x is represented by x = (x1, x2, ..., x64). Suppose that an optimal solution F is known to be F = x1 + x2 + ... + x64. F can be represented as a skew binary tree with a height of 64 or a balanced binary tree with a height of seven, as shown in Fig. 2. An individual can contain at most 2^64 − 1 nodes if the predefined maximum depth is 64. A population containing so many large trees is highly complex and thereby impracticable. On the other hand, if the predefined maximum depth is fixed at seven, it is very difficult to generate the ideal balanced tree. Moreover, the function will never be obtained if the maximum depth is less than seven.
Simply adopting an acceptable and practicable individual size is a straightforward but risky way to avoid this problem, and this problem has motivated the present work. Since a long function can be viewed as a composition of a number of small functions, it is possible to combine a number of small GP solutions into a large one. Therefore, it is desirable to generate those small solutions with a practicable individual size and then use them to compose a larger solution. For example, consider the above function F and the two functions B = x1 + x2 + ... + x32 and C = x33 + x34 + ... + x64. Clearly, F can be represented as F = B + C, as shown in Fig. 3, where the tree representations of B and C have a height of at most 32 rather than 64. Functions B and C can be generated by two separate GPs and then combined to form F. Here we attempt to develop a method to determine a proper node at which to combine small functions, for example, the shaded operation in Fig. 3.
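The depth arithmetic behind this example can be checked directly (a binary tree of depth d holds at most 2^d − 1 nodes), and combining the two sub-solutions at a root node can be sketched with trees as nested tuples. The `balanced_sum` helper and the tuple encoding are illustrative assumptions, not the paper's individual representation:

```python
def max_nodes(depth):
    # a binary tree of depth d has at most 2**d - 1 nodes
    return 2 ** depth - 1

def leaf(i):                 # variable x_i
    return ('x', i)

def balanced_sum(lo, hi):    # balanced '+' tree computing x_lo + ... + x_hi
    if lo == hi:
        return leaf(lo)
    mid = (lo + hi) // 2
    return ('+', balanced_sum(lo, mid), balanced_sum(mid + 1, hi))

def height(tree):            # height counted in nodes along the longest path
    if tree[0] == 'x':
        return 1
    return 1 + max(height(tree[1]), height(tree[2]))

B = balanced_sum(1, 32)      # x1 + ... + x32, height 6
C = balanced_sum(33, 64)     # x33 + ... + x64, height 6
F = ('+', B, C)              # combine the two sub-solutions at a root '+' node
```

With depth 7 the search space holds at most 127 nodes per individual, while depth 64 admits up to 2^64 − 1 nodes, which is the impracticable case discussed above.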
The method proposed in this paper is called layered genetic programming (LAGEP). It is a method based on MGP. LAGEP arranges populations in a layered architecture. Populations in the same layer evolve with an identical training set and store the outputs of their best individuals in a dataset; this dataset becomes a new training set for the successive layer. After all layers have finished the evolution process, the output of the final layer is used as the result of LAGEP.
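The layer-to-layer data flow just described can be sketched as follows; `evolve_population` is a hypothetical stand-in for one GP run that returns its best individual as a callable function, and class labels are omitted for brevity:

```python
def run_lagep(training_set, layer_sizes, evolve_population):
    """Each layer holds several populations evolving on the same training
    set; the outputs of the layer's best functions become the training set
    of the successive layer, and the final layer's output is the result."""
    data = training_set
    for n_populations in layer_sizes:
        best_funcs = [evolve_population(data) for _ in range(n_populations)]
        # every instance is mapped to the vector of best-function outputs
        data = [[f(x) for f in best_funcs] for x in data]
    return data

# toy demo: each "evolution" just yields a function summing the features
result = run_lagep([[1, 2], [3, 4]], layer_sizes=[2, 1],
                   evolve_population=lambda d: (lambda x: sum(x)))
# result == [[6], [14]]
```

Note that each successive layer sees a feature space whose dimensionality equals the number of populations in the previous layer, which is why short individuals suffice within each layer.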
The rest of this paper is organized as follows. Section 2 describes the details of LAGEP. Section 3 presents and discusses the experimental results on selected classification problems. Conclusions are drawn in Section 4.
Proposed LAGEP method
LAGEP is based on the multi-population method. In this section, we first describe the design of each single population, including a mutation weight tuning method. We then explain the design of LAGEP and its benefits, address the test phase and the conflict problem, and finally demonstrate LAGEP with an example.
Experiments
In this section we describe the experiments and analyze the classification results. To conduct the experiments, we developed a system based on the LAGEP Project [23] and executed it on an ACER VT7600GL equipped with a 3.0 GHz processor and 1.5 GB of memory.
Conclusions and future work
In this paper, we propose an MGP method, LAGEP. LAGEP arranges a number of populations into layers. Every layer evolves its populations to generate a set of discriminant functions. These functions transform the training set into a new training set, which is used for the successive layer. The evolution process of every population is efficient because it evolves short individuals. We also propose a method, called AMRT, to prevent the search from being trapped in a local optimum for a long time.
Experimental results show
Acknowledgments
We would like to express our appreciation to the anonymous reviewers for their useful suggestions and revision. We also wish to thank Dr. Hsinchun Chen and Jiexun Li for many helpful discussions and comments.
About the Author—JUNG-YI LIN was born in Taitung, Taiwan. He received the M.S. degree in Computer Science and Information Engineering from I-Shou University in 2002. He is currently a Ph.D. candidate in Computer Science, National Chiao Tung University, HsinChu, Taiwan. Lin is currently a visiting scholar at Artificial Intelligence Lab, Department of MIS, University of Arizona, Arizona, USA. His research interests include machine learning, data mining, and knowledge discovery.
References (20)
- et al., Learning effective classifiers with Z-value measure based on genetic programming, Pattern Recognition (2004)
- A. Tsakonas, A comparison of classification accuracy of four genetic programming-evolved intelligent structures, Inf. Sci. (2006)
- J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (1992)
- W. Banzhaf et al., Genetic Programming: An Introduction on the Automatic Evolution of Computer Programs and Its Application (1998)
- M. Brameier, W. Banzhaf, A comparison of linear genetic programming and neural networks in medical data mining, IEEE Trans. Evol. Comput. (2001)
- et al., Discovering interesting classification rules with genetic programming, Appl. Soft Comput. (2002)
- A. Freitas, A genetic programming framework for two data mining tasks: classification and generalized rule induction, ...
- Kishore et al., Application of genetic programming for multicategory pattern classification, IEEE Trans. Evol. Comput. (2000)
- Group classification using a mix of genetic programming and genetic algorithms
- Kotani et al., Emergence of feature extraction function using genetic programming
About the Author—HAO-REN KE was born on June 29, 1967 in Taipei, Taiwan, Republic of China. He received the B.S. degree in 1989 and the Ph.D. degree in 1993, both in Computer and Information Science, from National Chiao Tung University. He is now a professor of the Library and the Institute of Information Management, National Chiao Tung University (NCTU), and the associate director of the NCTU Library. His research interests include digital libraries, digital museums, information retrieval, web services, and data mining. He can be contacted at: [email protected].
About the Author—BEEN-CHIAN CHIEN received the Ph.D. in Computer Science and Information Engineering from National Chiao Tung University in 1992. He was an associate professor of the Department of Computer Science and Information Engineering, I-Shou University, Kaohsiung, Taiwan, from 1996 to 2004. Currently, he is a professor and the head of the Department of Computer Science and Information Engineering, National University of Tainan, Tainan, Taiwan. His current research activities involve machine learning, content-based image retrieval, intelligent information retrieval, and data mining.
About the Author—WEI-PANG YANG was born on May 17, 1950 in Hualien, Taiwan. He received the B.S. degree in mathematics from National Taiwan Normal University in 1974, and the M.S. and Ph.D. degrees from National Chiao Tung University in 1979 and 1984, respectively, both in Computer Engineering. He was a professor of the Department of CSIE and the Department of CIS at National Chiao Tung University, Hsinchu, Taiwan, a visiting scholar at Harvard University and at the University of Washington, and the Director of the Computer Center of National Chiao Tung University. Dr. Yang is currently the Head of the Department of Information Management and the Dean of the College of Management. His research interests include database theory and applications, information retrieval, data mining, digital libraries, and digital museums.