
Pattern Recognition

Volume 37, Issue 10, October 2004, Pages 1957-1972

Learning effective classifiers with Z-value measure based on genetic programming

https://doi.org/10.1016/j.patcog.2004.03.016

Abstract

This paper presents a learning scheme for data classification based on genetic programming. The proposed approach consists of an adaptive incremental learning strategy and distance-based fitness functions for generating discriminant functions with genetic programming. To classify data effectively with the discriminant functions, a mechanism called the Z-value measure is developed. Based on the Z-value measure, we give two classification algorithms to resolve ambiguity among the discriminant functions. Experiments show that the proposed approach requires less training time than previous GP learning methods, and the learned classifiers achieve high classification accuracy in comparison with previous classifiers.

Introduction

Data classification is one of the important issues in machine learning research, and many applications, such as pattern recognition, disease diagnosis, and business decision-making, can be viewed as extensions of the classification problem. Classification is generally a two-step process of learning and classifying. In the first step, a predetermined set of data, referred to as the training data set, is used by a learning algorithm to build a classifier. The learned classifier is then used for classification: the task of assigning an unknown object to one of the predefined classes based on the object's observed attributes. Because of the versatility of human activities and the unpredictability of data, learning an effective classifier efficiently remains a challenge for researchers.

To reduce learning time and increase classification accuracy, many methods for building effective classifiers have been proposed in the past decades. Typical rule-based classifiers such as ID3 [1] and C4.5 [2] construct decision trees and classification rules using an entropy-based measure called information gain. Some previous methods are based on mathematical models or theories. For example, statistical classifiers are built on Bayesian decision theory [3], [4], [5], which provides a probability model for classification by minimizing the total misclassification rate. Another well-known approach is the neural network [6], [7], [8], [9]: a multi-layered network with m inputs and n outputs is trained with a given set of training data; an input vector fed to the network yields an n-dimensional output vector, and the input is assigned to the class with the maximum output. Other methods include distance-based classifiers and evolutionary approaches. Distance-based classifiers, like the maximum likelihood classifier (MLC) [10] and k-nearest neighbor classifiers [10], [11], evaluate distances among the input vectors of objects and classify each object into the class with the smallest distance. Evolutionary approaches generally include genetic algorithms (GA) [12], [13] and genetic programming (GP) [14], [15], [16], [17]. A genetic algorithm encodes a set of classification rules as a sequence of bit strings called a gene. Evolution operations such as reproduction, crossover, and mutation generate the next generation of classification rules with better fitness; a classifier with a set of classification rules is obtained after the specified number of generations has evolved or the condition of the fitness function is satisfied.
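The distance-based idea above can be illustrated with a minimal nearest-centroid sketch (a deliberate simplification of MLC/k-NN, not the paper's own code): each class is summarized by the mean of its training vectors, and an object is assigned to the class whose centroid is closest.

```python
import math

def fit_centroids(samples):
    """samples: dict mapping class label -> list of feature vectors.
    Returns one mean vector (centroid) per class."""
    centroids = {}
    for label, vecs in samples.items():
        n = len(vecs)
        centroids[label] = [sum(v[i] for v in vecs) / n
                            for i in range(len(vecs[0]))]
    return centroids

def classify(x, centroids):
    """Assign x to the class with the smallest Euclidean distance."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return min(centroids, key=lambda c: dist(x, centroids[c]))

# Toy two-class example with hypothetical data
training = {"A": [[0.0, 0.0], [0.2, 0.1]], "B": [[1.0, 1.0], [0.9, 1.1]]}
cents = fit_centroids(training)
print(classify([0.1, 0.0], cents))  # prints A
```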
With genetic programming, two types of classifiers can be learned from the training data set. The first is the rule-based classifier, consisting of classification rules as in other methods [14]. The other is the function-based classifier, which contains discriminant functions [15], [16]. In a function-based classifier, as in Fig. 1, each predefined class has a corresponding discriminant function that decides whether an object belongs to the class. A function-based classifier is concise and efficient, because each class has only one corresponding function and the functions are easy to compute. Nevertheless, there are a few problems in learning discriminant functions with genetic programming and classifying with them. One of the main drawbacks of the GP methodology is the long training time: although more training time generally yields a more accurate classifier, training remains relatively long compared with other classification methods. Another problem is the ambiguity that occurs when a new object is recognized by two or more discriminant functions at the same time, or by none of them. To resolve this ambiguity, an effective discerning mechanism should be provided, as in Fig. 2; otherwise, the accuracy of classification will degrade. Kishore proposed the strength of association (SA) measure, which classifies an ambiguous object into the major class of the corresponding discriminant function; however, the SA measure did not improve accuracy substantially [16].
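To make the ambiguity problem concrete, the sketch below uses hypothetical fixed discriminant functions standing in for GP-evolved expressions: a class recognizes an object when its function value clears a threshold, and the decision is ambiguous when several classes (or none) recognize it.

```python
# Hypothetical discriminant functions; in the paper each one is a
# GP-evolved expression, here fixed linear forms purely for illustration.
discriminants = {
    "setosa":     lambda x: 1.0 - x[0],   # recognizes x when f(x) >= tau
    "versicolor": lambda x: x[0] - 0.5,
}

def classify(x, funcs, tau=0.0):
    """Return the unique class whose discriminant recognizes x, or flag
    the ambiguous cases (none, or several, recognize x) for a separate
    resolution mechanism such as the paper's Z-value measure."""
    matches = [c for c, f in funcs.items() if f(x) >= tau]
    if len(matches) == 1:
        return matches[0]
    return ("ambiguous", matches)

print(classify([0.2], discriminants))  # only "setosa" fires
print(classify([0.7], discriminants))  # both fire -> ambiguous
```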

In this paper, we present a new scheme for learning discriminant functions based on genetic programming and an effective ambiguity resolution mechanism for the discriminant functions of the classifier. To shorten the training time of classifiers without losing classification accuracy, we propose an adaptive incremental learning strategy and distance-based fitness functions. The learning strategy first partitions the training data of each class into several small subsets, each containing both positive and negative training samples. The learning algorithm then learns a specific discriminant function for each class from the sample subsets incrementally, stage by stage. After all discriminant functions are generated, a resolution mechanism called the Z-value measure resolves the problem of ambiguity. We propose two types of distance-based fitness functions, boundary division and interval division, and two Z-value ambiguity resolutions, Algorithm Z and Algorithm Z_Min. Based on these fitness functions and ambiguity resolutions, four alternative classifiers are produced. Several benchmark data sets from the UCI data repository are used to demonstrate and compare the accuracy of the proposed methods. We discuss the effectiveness of the GP learning strategies for distinct fitness functions and the suitability of the various Z-value ambiguity resolutions for the learned discriminant functions. The experiments show that well-designed fitness functions in GP can improve both training time and classification accuracy. In comparison with other classification methods, the classifiers learned with the interval-division fitness function, IZ and IZ_min, achieve high classification accuracy.
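The adaptive incremental strategy is only outlined at this point in the paper, but the staged-subset idea might be sketched as follows, assuming equal-sized random subsets per stage (the actual partitioning policy and the GP training step are abstracted away):

```python
import random

def incremental_stages(pos, neg, n_stages, seed=0):
    """Split one class's positive and negative samples into n_stages
    subsets; each stage exposes one more subset of each, so the GP
    learner refines the discriminant on an incrementally growing set."""
    rng = random.Random(seed)
    rng.shuffle(pos)
    rng.shuffle(neg)
    pos_chunks = [pos[i::n_stages] for i in range(n_stages)]
    neg_chunks = [neg[i::n_stages] for i in range(n_stages)]
    seen_pos, seen_neg = [], []
    for s in range(n_stages):
        seen_pos += pos_chunks[s]
        seen_neg += neg_chunks[s]
        yield list(seen_pos), list(seen_neg)  # train f_c on this stage

for p, n in incremental_stages(list(range(6)), list(range(100, 106)), 3):
    print(len(p), len(n))  # stage sizes grow: 2 2, 4 4, 6 6
```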

The remainder of this paper is organized as follows. Section 2 reviews the methodology of genetic programming and gives the algorithm step by step. In Section 3, we propose a GP-based adaptive incremental learning approach and the distance-based fitness functions for learning a set of discriminant functions. The ambiguity resolution mechanism, the Z-value measure, and the classification algorithms are presented in Section 4. Section 5 shows the experimental results and makes comparisons with previous methods. Finally, we draw conclusions and discuss some directions for future research.

Section snippets

Genetic programming

The technique of genetic programming (GP) was proposed by Koza [18], [19]. Genetic programming has been applied to a wide range of areas, such as symbolic regression, robot control programs, and classification. Genetic programming can discover underlying data relationships and represent them as expressions constructed from terminals and functions. Several types of functions can be applied in genetic programming:

  • 1. Arithmetic operations: addition,
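A GP individual built from such functions and terminals can be represented as an expression tree; a minimal evaluation sketch (nested tuples standing in for a real GP tree structure, with an assumed arithmetic function set) is:

```python
import operator

# A GP individual is a tree of function nodes and terminal nodes.
# Here a tree is either ("op", left, right), a variable name, or a constant.
FUNCS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(node, env):
    """Recursively evaluate an expression tree against variable bindings."""
    if isinstance(node, tuple):
        op, left, right = node
        return FUNCS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]      # terminal: input variable
    return node               # terminal: numeric constant

# (x * y) + 2 evaluated at x=3, y=4
tree = ("add", ("mul", "x", "y"), 2)
print(evaluate(tree, {"x": 3, "y": 4}))  # → 14
```

Crossover and mutation then operate on such trees by swapping or replacing subtrees, which is what lets GP evolve discriminant expressions rather than fixed-length bit strings.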

Learning discriminant functions by genetic programming

This section presents the GP-based learning method of discriminant functions for classification. First, we give a formal description of discriminant functions. Then, we provide an adaptive incremental training strategy and propose two distance-based fitness functions to learn the discriminant functions for classifiers. We also give a complete example explaining the proposed learning method using a subset of the IRIS data set.

The Z-value measure and the classification methods

In general, since the training set does not contain all possible samples, a classifier cannot recognize all objects correctly in real applications. Traditional rule-based classifiers need highly accurate rules to achieve effective recognition. However, the recognition rate of the proposed function-based classifier depends not only on each discriminant function itself but also on the other discriminant functions in the classifier, since the misjudgment of a discriminant function

Experimental results and comparisons

In this section, we demonstrate and compare the performance of the proposed classifiers. The classifiers proposed in this paper consist of sets of discriminant functions learned by genetic programming together with the ambiguity resolutions. We refer to the learning methods using boundary division and interval division as GP-B and GP-I, respectively. To demonstrate the effectiveness and efficiency of the proposed classifiers, we modify GP Quick 2.1 [21] to fit the requirements of the

Conclusions

Traditional rule-based classification classifies patterns using a set of decision rules. For problems with high-dimensional numerical attributes, a classifier with decision rules may not achieve high classification accuracy while keeping its rules simple. This paper presents a learning approach based on genetic programming that generates discriminant functions for classification. The proposed approaches include an adaptive incremental learning strategy to speed up the training


References (26)

  • G.P. Zhang, Neural networks for classification: a survey, IEEE Trans. Systems Man Cybernet. C Appl. Rev. (2000)

  • R.O. Duda et al., Pattern Classification and Scene Analysis (1973)

  • E.H. Han et al., Text categorization using weight adjusted k-nearest neighbor classification


    About the Author—BEEN-CHIAN CHIEN received the Ph.D. in Computer Science and Information Engineering from National Chiao Tung University in 1992. He was an associate professor in the Department of Information Engineering at I-Shou University from August 1996 to July 2004. Since August 2004, he has been an associate professor in the Department of Computer Science and Information Engineering at National Tainan Teachers College, Tainan, Taiwan. His current research activities involve machine learning, content-based image retrieval, intelligent information retrieval, and data mining.

    About the Author—JUNG-YI LIN was born in Taitung, Taiwan. He received the M.S. degree in Computer Science and Information Engineering from I-Shou University in 2002. He is currently a Ph.D. student in Computer and Information Science, National Chiao Tung University, HsinChu, Taiwan. His research interests include machine learning, data mining, and knowledge discovery.

    About the Author—WEI-PANG YANG received the Ph.D. degree in Computer Engineering from National Chiao Tung University in 1984. Dr. Yang was a visiting scholar at Harvard University and the University of Washington in 1986 and 1996, respectively. Currently, he is a professor of Computer and Information Science and the Director of the University Library at National Chiao Tung University, HsinChu, Taiwan. His research interests include database theory, object-oriented databases, video databases, Chinese database retrieval systems, and digital libraries.
