Metrics to guide a multi-objective evolutionary algorithm for ordinal classification☆
Introduction
Ordinal classification, also called ordinal regression, is a supervised learning problem in which the categories to be predicted have a natural order. Although nominal (binary or multiclass) classification and metric regression have been thoroughly investigated in the literature, ordinal regression has received much less attention. For example, people can be classified as high, medium, or low on some attribute, or placed in a set of categories ranging from strong agreement to strong disagreement with respect to some attitude item. Hodge and Treiman [1], to analyse social class identification, scored responses as follows: “Respondents identified with the lower, working, middle, upper middle, and upper class were assigned the scores 1, 2, 3, 4, and 5, respectively”. Although sequential numbers may be assigned to such categories, the numbers serve only to identify their ordering. In contrast to metric regression problems, these ranks are finite in number and the metric distances between them are not defined; in contrast to nominal classification problems, the ranks also differ from the labels of multiple classes because they carry ordering information [2].
In the previous example, predicting the class lower when the real class is upper middle should clearly be considered a more severe error than predicting the class working. Ordinal classification problems should therefore be evaluated with specific metrics. At first sight, the various measures of ordinal association and the product-moment correlation and regression measures seem to rely on very different foundations. That is, ordinal measures are developed either from (a) the notion of comparing pairs of cases, or from (b) the product-moment system, which is formulated in terms of measures of individual cases.
If methodology (a) is used, and there is an ordering of the categories but the absolute distances among them are unknown, an ordinal categorical variable is obtained. In that case, to prevent the numbers chosen to represent the classes from influencing the performance assessment, we should only look at the order relation between “true” and “predicted” class numbers. The use of Spearman's rank correlation coefficient [3] and especially Kendall's τ [4] is a step forward in that direction. Other coefficients frequently used to describe the association between ordinal measures are Goodman and Kruskal's γ [5] and Somers's d [6].
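As a rough illustration of methodology (a), the sketch below counts concordant and discordant pairs and derives Goodman and Kruskal's γ and the tie-corrected Kendall coefficient (τ_b) from them. The function names are hypothetical and the choice of the τ_b variant is an assumption; only the sign of each pairwise difference is used, so any order-preserving relabelling of the classes gives the same value.

```python
from itertools import combinations
from math import sqrt


def pair_counts(y_true, y_pred):
    """Count concordant, discordant and one-sided tied pairs of cases."""
    concordant = discordant = tied_true = tied_pred = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        dt, dp = t1 - t2, p1 - p2
        if dt == 0 and dp == 0:
            continue  # tied on both variables: contributes to neither coefficient
        if dt == 0:
            tied_true += 1      # tied only on the true labels
        elif dp == 0:
            tied_pred += 1      # tied only on the predicted labels
        elif dt * dp > 0:
            concordant += 1     # same ordering in both variables
        else:
            discordant += 1     # opposite ordering
    return concordant, discordant, tied_true, tied_pred


def goodman_kruskal_gamma(y_true, y_pred):
    """gamma = (C - D) / (C + D); undefined when there are no untied pairs."""
    c, d, _, _ = pair_counts(y_true, y_pred)
    return (c - d) / (c + d)


def kendall_tau_b(y_true, y_pred):
    """tau_b = (C - D) / sqrt((C + D + T_true) * (C + D + T_pred))."""
    c, d, tt, tp = pair_counts(y_true, y_pred)
    return (c - d) / sqrt((c + d + tt) * (c + d + tp))
```

Both functions raise `ZeroDivisionError` when every pair is tied (for instance, a classifier that always predicts the same class), which a real evaluation would need to handle explicitly.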
If methodology (b) (product-moment system) is used, the most commonly considered measures in machine learning are the mean absolute error (here denoted as MAE) [7], [8], the root mean square error (RMSE) [8], and the mean zero-one error (MZE, more frequently known as error rate) [8], with MZE = 1 − CCR, where CCR is the correct classification rate. However, these three measures are not suitable for evaluating the performance of classifiers on unbalanced ordinal datasets [7]. The first contribution of this work is a newly proposed metric for ordinal classifiers: the highest MAE value among the MAEs measured independently for each class (maximum MAE or MMAE). This metric evaluates the performance on the worst classified class. The second contribution of this work is the analysis of the state-of-the-art performance metrics. Finally, we empirically show that some of the metric pairs can be non-cooperative, which justifies the use of a multi-objective framework to address the classifier optimization problem.
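The four measures named above can be sketched as follows. This is a minimal illustration, assuming the classes are coded as consecutive integer ranks 1..J and that errors are absolute rank differences; the function names are hypothetical and not taken from the paper.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute deviation between true and predicted ranks."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error computed on the ranks."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mze(y_true, y_pred):
    """Mean zero-one error, i.e. 1 - CCR."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def mmae(y_true, y_pred):
    """Maximum of the MAE values measured independently for each true class."""
    errors_by_class = {}
    for t, p in zip(y_true, y_pred):
        errors_by_class.setdefault(t, []).append(abs(t - p))
    return max(sum(errs) / len(errs) for errs in errors_by_class.values())


# Hypothetical labels: class 1 has only two patterns and both are misclassified.
y_true = [1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
y_pred = [2, 3, 2, 2, 2, 2, 3, 3, 3, 2]
print(mae(y_true, y_pred), mze(y_true, y_pred), mmae(y_true, y_pred))
```

On this toy data the overall MAE is 0.4 while the MMAE is 1.5, because the two class-1 patterns are both misclassified and dominate the per-class maximum.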
Fig. 1 presents a motivational example for the present work, depicting three classifiers on a four-class ordinal classification problem. This figure illustrates how different variations of the decision thresholds can affect the classification performance, which is especially influenced by patterns placed on the class boundaries. More specifically, this example raises two issues that will be studied in the current work. First, a single performance measure may not be enough to evaluate a classifier, especially in the field of ordinal regression. Second, some of the performance metrics can act as competing objectives in a general optimization process, since moving a threshold in one direction can improve one metric while worsening another.
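The second issue can be reproduced on a small synthetic case. The sketch below is not the data of Fig. 1: the one-dimensional projections, the three-class labels and the threshold values are all invented for illustration. Moving the first threshold lowers the overall MAE but raises the MAE of the worst classified class.

```python
from bisect import bisect_right

# Hypothetical (projection, true class) pairs; class 1 is the minority class
# and overlaps with class 2 near the first decision threshold.
SAMPLES = [
    (0.20, 1), (0.45, 1), (0.47, 1),
    (0.40, 2), (0.42, 2), (0.44, 2), (0.46, 2),
    (0.60, 2), (0.70, 2), (0.80, 2), (0.90, 2),
    (1.20, 3), (1.30, 3), (1.40, 3),
]


def predict(projection, thresholds):
    """Class = 1 + number of thresholds that the projection has passed."""
    return bisect_right(thresholds, projection) + 1


def evaluate(thresholds):
    """Return (overall MAE, maximum per-class MAE) for sorted thresholds."""
    errors_by_class = {}
    for projection, label in SAMPLES:
        error = abs(predict(projection, thresholds) - label)
        errors_by_class.setdefault(label, []).append(error)
    all_errors = [e for errs in errors_by_class.values() for e in errs]
    overall = sum(all_errors) / len(all_errors)
    worst = max(sum(errs) / len(errs) for errs in errors_by_class.values())
    return overall, worst


mae_a, mmae_a = evaluate([0.5, 1.0])  # about (0.286, 0.500)
mae_b, mmae_b = evaluate([0.3, 1.0])  # about (0.143, 0.667)
print(mae_a, mmae_a, mae_b, mmae_b)
```

Lowering the first threshold from 0.5 to 0.3 fixes four class-2 errors at the cost of two class-1 errors, so the overall error falls while the minority class becomes the worst classified one.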
In the present paper, the aforementioned issues are studied under a multi-objective optimization approach. Multi-objective algorithms simultaneously optimize objectives that are non-cooperative. In many problems there are several conflicting objectives, such as execution speed or computational cost and the quality of the results. For example, in [9], [10] the authors try to obtain optimal results in the shortest time and at the lowest cost. In other problems, execution speed is not the main concern, and what matters is achieving good results on several conflicting error functions.
In the field of artificial neural networks (ANNs), classification performance and model simplicity are objectives that typically guide the training process of a multi-objective evolutionary algorithm (MOEA) [11], with the purpose of finding a trade-off between performance and model readability. Other works optimize global performance versus the performance on the worst classified class, either with a Pareto-based algorithm [12] or by merging both objectives into a weighted linear combination of the functions [13].
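The Pareto-based treatment of two error measures can be sketched with a plain dominance test. This is a generic illustration, not the MOEA used in the paper; the objective values are invented, both objectives are assumed to be minimised, and the helper names are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on every objective and strictly
    better on at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (global error, worst-class error) pairs for four classifiers.
candidates = [(0.29, 0.50), (0.14, 0.67), (0.30, 0.70), (0.20, 0.55)]
front = pareto_front(candidates)
print(front)  # (0.30, 0.70) is dominated by (0.29, 0.50) and is dropped
```

A weighted linear combination would instead collapse each pair into a single number, which fixes the trade-off in advance rather than returning the whole set of non-dominated classifiers.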
In ordinal classification, it is common to use several error functions when some of the classes have far fewer patterns than the others, i.e. in imbalanced ordinal datasets. For this reason, we propose the MMAE metric, which measures the performance on the worst classified class. One real-world application where this problem arises is donor–recipient allocation in liver transplants [14], where the classifiers aim to predict the survival of the organ (described by three classes, class 1: less than 15 days, class 2: between 15 days and 3 months, and class 3: more than 3 months). In real cases, the number of patterns of class 1 is much lower than that of class 2 or 3. The hospital would be interested in classifiers that classify all classes equally well, but poor performance on class 1 can be hidden by the fact that this class has very few patterns (for example, a good MAE value can be obtained when class 1 accounts for 5% of the patterns and the classifier never assigns a pattern to class 1). Thus the two objectives (MAE and MMAE) conflict, because improving MMAE usually involves worsening MAE and vice versa. In [15] another ordinal problem is solved from a multi-objective perspective, where six different objectives are considered, including MZE, MAE and four different formulations of the expected ranking accuracy. In this work, several ordinal measures are analysed and combined in pairs as objectives for a MOEA.
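The 5% case mentioned above can be checked numerically from a confusion matrix. The counts below are invented to match that description (100 patterns, 5 of them in class 1, none ever assigned to class 1) and are not taken from the liver transplant dataset; the function names are hypothetical.

```python
def mae_from_confusion(cm):
    """Overall MAE from a confusion matrix (rows: true class, columns: predicted)."""
    n = sum(sum(row) for row in cm)
    total = sum(abs(i - j) * count
                for i, row in enumerate(cm)
                for j, count in enumerate(row))
    return total / n


def per_class_mae(cm):
    """MAE measured independently on the patterns of each true class."""
    return [sum(abs(i - j) * count for j, count in enumerate(row)) / sum(row)
            for i, row in enumerate(cm)]


# Hypothetical three-class problem: every class-1 pattern is predicted as class 2.
confusion = [
    [0, 5, 0],    # true class 1 (5% of the patterns)
    [0, 45, 0],   # true class 2
    [0, 0, 50],   # true class 3
]
print(mae_from_confusion(confusion))   # 0.05
print(per_class_mae(confusion))        # [1.0, 0.0, 0.0]
print(max(per_class_mae(confusion)))   # MMAE = 1.0
```

The overall MAE of 0.05 looks excellent, yet the MMAE of 1.0 shows that every pattern of the minority class is misclassified.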
The present work aims to identify which pair of ordinal classification performance metrics is most suitable to guide a MOEA towards classifiers with good performance (considering both the order of the misclassification errors and the errors on the worst classified class). The most common ordinal classification performance metrics are reviewed, and some of them, together with the proposed metric, are selected to evaluate the performance of four nominal and ordinal classifiers. Then, a correlation study is carried out between all the metrics in order to find the least correlated ones. We hypothesize that the least correlated metrics are the most suitable optimization objectives for the MOEA (given that all of them highlight positive aspects of the classifiers). The selected metrics are grouped into different pairs that are simultaneously optimized by the MOEA. The base classifier considered is an ANN based on the proportional odds model (POM) [16], and it is evolved using a differential evolution MOEA [17], [18]. Finally, the generalization performance of the models obtained is studied with respect to the pair of metrics considered in the evolution. Because of its performance, the MMAE–MAE pair receives special attention: a relationship between the two metrics is derived and their graphical representation is studied.
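For readers unfamiliar with the proportional odds model named above, the sketch below shows its standard cumulative-logit form: a single projection value is compared with an increasing set of thresholds, and P(y ≤ j) is the logistic function of the threshold minus the projection. It is a textbook formulation written under that assumption, not the authors' network; `pom_probabilities` and its arguments are hypothetical names.

```python
from math import exp


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + exp(-z))


def pom_probabilities(projection, thresholds):
    """Class probabilities of a cumulative-logit (proportional odds) model.

    `projection` is the one-dimensional model output for a pattern and
    `thresholds` is an increasing list of J - 1 cut points.
    """
    # Cumulative probabilities P(y <= j); the last class always reaches 1.
    cumulative = [sigmoid(b - projection) for b in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for c in cumulative:
        probabilities.append(c - previous)  # P(y = j) = P(y <= j) - P(y <= j - 1)
        previous = c
    return probabilities


probs = pom_probabilities(0.0, [-1.0, 0.0, 1.0])  # four classes
predicted_class = probs.index(max(probs)) + 1
print(probs, predicted_class)
```

Because every class shares the same projection and differs only in its threshold, moving one threshold changes the boundary between exactly two adjacent classes.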
This paper is a significant extension of a conference paper [19]. The new contributions are the following. A correlation analysis of the ordinal performance metrics is carried out, and the confusion matrices studied are changed and extended. In addition, the neural network model has been replaced by a neural network based on the proportional odds model, and a local search procedure based on the iRprop+ algorithm [20] has been included in the MOEA to optimize the new model. Finally, two additional ordinal methods have been compared in the experimental section, and new tables describing the experiments and statistical tests have been included to reinforce the conclusions.
The rest of the paper is organized as follows: Section 2 presents a review and an experimental comparison of measures for ordinal classification; Section 3 details the ordinal ANN model based on the POM; Section 4 describes the training method employed; Section 5 describes the experimental design and the results obtained; and Section 6 outlines conclusions and future research.
Section snippets
Measures of association in ordinal classification
This section presents both nominal and ordinal classification performance metrics commonly used in the literature. An empirical evaluation of the correlation between them is done in order to select the most relevant ones.
Let us define an ordinal classification problem as a problem where the purpose is to learn a model able to predict class labels, containing J labels, for unseen patterns after a training process. What makes the difference with nominal classification is that the
Ordinal model
One main issue of ordinal classification is that there is no notion of the precise distance between classes. The samples are labeled by a set of ranks with different categories and an order. In this paper, the classical proportional odds model (POM) [16] adapted to ANNs [26] is considered. The POM relies on two elements: the first one is a linear layer with only one node (see Fig. 2), whose inputs are projected onto a line to give them an order, which facilitates ordinal classification. After
Method
To see how the selected metrics behave, this paper uses the MOEA described in [28]. The algorithm used is the memetic Pareto differential evolution neural network (MPDENN) algorithm, which is based on the differential evolution algorithm developed by Storn and Price [17] and modified by Abbass to train neural networks [18]. MPDENN is adapted according to the trade-off between the CCR and the minimum sensitivity (MS) analysed in [12], [29]. The fundamental bases of this algorithm are differential evolution (DE) and the concept of Pareto dominance. DE has often been
Experimental study
To verify the efficiency of the three proposals, 10 ordinal datasets have been used. Nine of them are benchmark datasets and the other (Toy) has been generated following the guidelines in [34]. Table 6 shows the characteristics of the datasets used, including the number of patterns, the number of attributes (after transforming nominal attributes into binary ones), the number of classes and the class
Conclusion
This paper contributes an analysis of different state-of-the-art performance measures to evaluate an ordinal classifier. The aim of this analysis is selecting the best pair of metrics to guide a multi-objective evolutionary algorithm. In this analysis, the new MMAE metric is included. This metric is the highest MAE value from MAEs measured independently for each class, i.e. it evaluates the performance of the worst classified class. The analysis studies the correlations between the different
Acknowledgement
This work was supported in part by the TIN2011-22794 project of the Spanish Ministerial Commission of Science and Technology (MICYT), FEDER funds and the P11-TIC-7508 project of the “Junta de Andalucía” (Spain). Manuel Cruz-Ramírez's research has been subsidized by the FPU Predoctoral Program (Spanish Ministry of Education and Science), grant reference AP2009-0487. Javier Sánchez-Monedero's research has been funded by the “Junta de Andalucía” Ph.D. Student Program.
References (38)
- An evolutionary artificial neural networks approach for breast cancer diagnosis, Artif. Intell. Med. (2002)
- et al., Multi-objective evolutionary algorithm for donor-recipient decision system in liver transplants, Eur. J. Oper. Res. (2012)
- et al., ROC analysis in ordinal regression learning, Pattern Recognit. Lett. (2008)
- et al., Empirical evaluation of the improved Rprop learning algorithms, Neurocomputing (2003)
- et al., Learning partial ordinal class memberships with kernel-based proportional odds models, Comput. Stat. Data Anal. (2012)
- et al., Shape recognition based on neural networks trained by differential evolution algorithm, Neurocomputing (2007)
- et al., Nonlinear system identification using memetic differential evolution trained neural networks, Neurocomputing (2011)
- et al., A multi-objective neural network based method for cover crop identification from remote sensed data, Expert Syst. Appl. (2012)
- et al., Class identification in the United States, Am. J. Sociol. (1968)
- W. Chu, S.S. Keerthi, New approaches to support vector ordinal regression, in: Proceedings of the 22nd International...
- The proof and measurement of association between two things, Am. J. Psychol.
- Rank Correlation Methods
- Measures of association for cross classifications, J. Am. Stat. Assoc.
- The rank analogue of product-moment partial correlation and regression with application to manifold, ordered contingency tables, Biometrika
- Large scale image annotation: learning to rank with joint word-image embeddings, Mach. Learn.
- Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval, J. New Music Res.
- Sensitivity versus accuracy in multi-class problems using memetic Pareto evolutionary neural networks, IEEE Trans. Neural Netw.
M. Cruz-Ramírez was born in Córdoba, Spain. He received the B.S. degree in Computer Science from the University of Córdoba, Spain, in 2009, and the M.S. in Soft Computing and Intelligent Systems from the University of Granada, Spain, in 2009. He is currently a Ph.D. Student in the Department of Computer Science and Numerical Analysis, University of Córdoba, Spain. His current research interests include neural networks, multi-objective evolutionary algorithms and their applications to real problems and performance/evaluation measures.
C. Hervás-Martínez was born in Cuenca, Spain. He received the B.S. degree in Statistics and Operations Research from the “Universidad Complutense”, Madrid, Spain, in 1978, and the Ph.D. degree in Mathematics from the University of Seville, Seville, Spain, in 1986. He is currently a Professor of Computer Science and Artificial Intelligence in the Department of Computer Science and Numerical Analysis, University of Córdoba, Spain, and an Associate Professor in the Department of Quantitative Methods, School of Economics, University of Córdoba. His current research interests include neural networks, evolutionary computation, and the modelling of natural systems.
J. Sánchez-Monedero was born in Córdoba, Spain. He received the B.S. degree in Computer Science from the University of Granada, Spain, in 2008 and the M.S. in Multimedia Systems from the University of Granada in 2009. In 2013 he obtained the Ph.D. degree in Information and Communication Technologies from the University of Granada. He works as a researcher in the Department of Computer Science and Numerical Analysis at the University of Córdoba. His current research interests include computational intelligence methods and their applications, as well as distributed systems.
P.A. Gutiérrez was born in Córdoba, Spain. He received the B.S. degree in Computer Science from the University of Sevilla, Spain, in 2006, and the Ph.D. degree in Computer Science and Artificial Intelligence from the University of Granada, Spain, in 2009. He is currently an Assistant Professor in the Department of Computer Science and Numerical Analysis, University of Córdoba, Spain. His current research interests include neural networks and their applications, evolutionary computation, and hybrid algorithms.
☆ This paper is a significant extension of the work “A preliminary study of ordinal metrics to guide a multi-objective evolutionary algorithm” appearing in the 11th International Conference on Intelligent Systems Design and Applications (ISDA2011).