Metrics to guide a multi-objective evolutionary algorithm for ordinal classification

doi:10.1016/j.neucom.2013.05.058

Neurocomputing

Volume 135, 5 July 2014, Pages 21-31

https://doi.org/10.1016/j.neucom.2013.05.058

Abstract

Ordinal classification or ordinal regression is a classification problem in which the labels have an ordered arrangement between them. Due to this order, alternative performance evaluation metrics are needed to account for the magnitude of errors. This paper presents a study of the use of a multi-objective optimization approach in the context of ordinal classification. We contribute a study of ordinal classification performance metrics, and propose a new performance metric, the maximum mean absolute error (MMAE). MMAE considers the per-class distribution of patterns and the magnitude of the errors, both issues being crucial for ordinal regression problems. In addition, we empirically show that some of the performance metrics are competitive objectives, which justifies the use of multi-objective optimization strategies. In our case, a multi-objective evolutionary algorithm optimizes an artificial neural network ordinal model with different pairs of metric combinations, and we conclude that the pair of the mean absolute error (MAE) and the proposed MMAE is the most favourable. A study of the relationship between the metrics of this proposal is performed, and the graphical representation in the two-dimensional space where the search of the evolutionary algorithm takes place is analysed. The results obtained show a good classification performance, opening new lines of research in the evaluation and model selection of ordinal classifiers.

Introduction

Ordinal classification, or ordinal regression, is the supervised learning problem of predicting categories that have an ordered arrangement. Although classification and metric regression problems have been thoroughly investigated in the literature, ordinal regression problems have not received as much attention as nominal (binary or multiclass) classification. For example, people can be classified by considering whether they are high, medium, or low on some attribute or in a set of categories varying from strong agreement to strong disagreement with respect to some attitude item. Hodge and Treiman [1], to analyse social class identification, scored responses as follows: “Respondents identified with the lower, working, middle, upper middle, and upper class were assigned the scores 1, 2, 3, 4, and 5, respectively”. Though sequential numbers may be assigned to such categories, the numbers assigned serve only to identify the ordering of the categories. In contrast to metric regression problems, the ranks are finite in number and the metric distances between them are not defined; in contrast to classification problems, the ranks also differ from the labels of multiple classes because they carry ordering information [2].

In the previous example, it is straightforward to think that predicting class lower when the real class is upper middle should be considered a more severe error than predicting class working. Thereby, ordinal classification problems should be evaluated with specific metrics. At first sight, the various measures of ordinal association and of product-moment correlation and regression seem to rely on very different foundations. That is, the ordinal measures are developed from (a) the notion of comparing pairs of cases, or (b) the product-moment system, which is considered in terms of measures of individual cases.

If methodology (a) is used, and there is an ordering of the categories but the absolute distances among them are unknown, an ordinal categorical variable is obtained. In that respect, in order to avoid the influence of the numbers chosen to represent the classes on the performance assessment, we should only look at the order relation between “true” and “predicted” class numbers. The use of Spearman’s rank correlation coefficient $r_{S}$ [3] and especially Kendall’s $τ_{b}$ [4] is a step forward in that direction. Moreover, other coefficients are frequently used to describe the association between ordinal measures, such as Goodman and Kruskal’s γ [5] and Somers’s d [6].
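As an illustration of the pair-comparison idea behind methodology (a), the following is a minimal sketch of Kendall's $τ_{b}$ computed directly from concordant, discordant and tied pairs. The function name `kendall_tau_b` and the toy label vectors are assumptions made for this example; they are not taken from the paper. The second call shows that relabelling the classes with other increasing numbers leaves the coefficient unchanged, since only the order relation is used.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(y_true, y_pred):
    """Kendall's tau-b between two equally long sequences of ordinal ranks.

    Hypothetical helper for illustration: a plain O(n^2) pair count.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("both sequences must have the same length")
    concordant = discordant = ties_true = ties_pred = 0
    for (t1, p1), (t2, p2) in combinations(zip(y_true, y_pred), 2):
        dt, dp = t1 - t2, p1 - p2
        if dt == 0 and dp == 0:
            # Pair tied in both variables: counted in both tie totals.
            ties_true += 1
            ties_pred += 1
        elif dt == 0:
            ties_true += 1
        elif dp == 0:
            ties_pred += 1
        elif dt * dp > 0:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(y_true) * (len(y_true) - 1) // 2
    denom = sqrt((n_pairs - ties_true) * (n_pairs - ties_pred))
    # Undefined when one of the variables is constant.
    return (concordant - discordant) / denom if denom else float("nan")


if __name__ == "__main__":
    true_ranks = [1, 2, 3, 4]
    print(kendall_tau_b(true_ranks, [1, 2, 3, 4]))      # perfect agreement
    print(kendall_tau_b(true_ranks, [10, 20, 30, 40]))  # same order, other numbers
    print(kendall_tau_b(true_ranks, [4, 3, 2, 1]))      # reversed order
```

The quadratic pair loop is chosen for readability; for large samples a library implementation would be preferable.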

If methodology (b) (product-moment system) is used, the most commonly considered measures in machine learning are the mean absolute error (here denoted as MAE) [7], [8], root mean square error (RMSE) [8], and mean zero-one error (MZE, more frequently known as error rate) [8], with $MZE = 1 - CCR$, where CCR is the correct classification rate. However, these three measures are not suitable for evaluating the performance of classifiers on unbalanced ordinal datasets [7]. The first contribution of this work is a newly proposed metric for ordinal classifiers: the highest of the MAE values measured independently for each class (maximum MAE or MMAE). This metric evaluates the performance on the worst classified class. The second contribution of this work is the analysis of the state-of-the-art performance metrics. Finally, we empirically show that some of the metric pairs can be non-cooperative, and consequently justify the use of a multi-objective framework to address the classifier optimization problem.
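The metrics named above can be sketched in a few lines. This is a minimal illustration that assumes classes are coded as consecutive integer ranks $1, \dots, J$ and that the error of a pattern is the absolute difference between its true and predicted rank; the function names and the toy data are hypothetical, not the authors' code.

```python
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean absolute error over all patterns (ranks coded as integers)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mze(y_true, y_pred):
    """Mean zero-one error, i.e. 1 - CCR."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_mae(y_true, y_pred):
    """MAE measured independently for each true class."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return {c: sum(e) / len(e) for c, e in errors.items()}


def mmae(y_true, y_pred):
    """Maximum MAE: the MAE of the worst classified class."""
    return max(per_class_mae(y_true, y_pred).values())


if __name__ == "__main__":
    # Toy unbalanced problem: 2 patterns of class 1, 8 of class 2, 10 of class 3.
    y_true = [1] * 2 + [2] * 8 + [3] * 10
    # Assumed classifier: never predicts class 1, otherwise perfect.
    y_pred = [2] * 2 + [2] * 8 + [3] * 10
    print("MAE :", mae(y_true, y_pred))
    print("MZE :", mze(y_true, y_pred))
    print("MMAE:", mmae(y_true, y_pred))
```

On this toy data the global MAE and MZE stay low because class 1 is rare, while MMAE reports the full error of the class that is never predicted.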

Fig. 1 presents a motivational example for the present work depicting three classifiers on a four-class ordinal classification problem. This figure illustrates how different variations of decision thresholds can affect the classification performance, which is especially influenced by patterns placed on the class boundaries. More specifically, this example raises two issues that will be studied in the current work. First, using a single performance measure may not be enough to evaluate a classifier, especially in the field of ordinal regression. Second, some of the performance metrics can result in competitive objectives in a general optimization process, since moving a threshold in one direction can improve one metric but worsen a second one.

In the present paper, the aforementioned issues are studied under a multi-objective optimization approach. Multi-objective algorithms simultaneously optimize objectives that are non-cooperative. In many problems there are several conflicting objectives, such as execution speed or computational cost and quality of the results. For example, in [9], [10] the authors try to obtain optimal results in the shortest time and at the lowest cost. In other problems, the execution speed is not the most important factor, and what is relevant is achieving good results in different conflicting error functions.

In the field of artificial neural networks (ANNs), classification performance and model simplicity are objectives that typically guide the training process of a multi-objective evolutionary algorithm (MOEA) [11], with the purpose of finding a trade-off between performance and model readability. Other works optimize global performance versus the performance on the worst classified class, either with a Pareto-based algorithm [12] or by merging both objectives into a weighted linear combination of the functions [13].
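The two ways of handling a pair of objectives mentioned above can be contrasted with a short sketch. It assumes both objectives are to be minimised (as is the case for error metrics such as MAE and MMAE); the helper names `dominates`, `pareto_front` and `weighted_sum`, as well as the candidate values, are made up for illustration and do not come from [12] or [13].

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def weighted_sum(point, weights):
    """Scalarise several objectives into one with a linear combination."""
    return sum(w * x for w, x in zip(weights, point))


if __name__ == "__main__":
    # Hypothetical (MAE, MMAE) values of five candidate classifiers.
    candidates = [(0.10, 1.00), (0.15, 0.40), (0.20, 0.35),
                  (0.25, 0.50), (0.15, 0.45)]
    print("Pareto front:", pareto_front(candidates))
    best = min(candidates, key=lambda p: weighted_sum(p, (0.5, 0.5)))
    print("Best by equal-weight sum:", best)
```

A Pareto-based search returns the whole set of trade-offs and defers the choice, whereas the weighted sum commits to one trade-off through the weights before the search starts.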

In ordinal classification, it is common to use several error functions when some of the classes have far fewer patterns than the others, i.e. in imbalanced ordinal datasets. For this reason, we propose the MMAE metric, which measures the performance on the worst classified class. One real-world application where this problem can be found is the extension of donor–recipient allocation in liver transplants [14], where the classifiers aim at predicting the survival of the organ (describing this survival in three different classes, class 1: lower than 15 days, class 2: between 15 days and 3 months, and class 3: higher than 3 months). The problem is that, in real cases, the number of patterns of class 1 is much lower than that of class 2 or 3. The hospital would be interested in classifiers able to classify all classes equally well, but a bad performance on class 1 can be hidden by the fact that the number of patterns of this class is very low (for example, a good MAE value can be obtained when class 1 accounts for 5% of the patterns and the classifier never assigns a pattern to class 1). As can be seen, the two objectives (MAE and MMAE) are conflicting, because improving MMAE usually involves worsening MAE and vice versa. In [15] another ordinal problem is solved from a multi-objective perspective, where six different objectives are considered, including MZE, MAE and four different formulations of the expected ranking accuracy. In this work, several different ordinal measures that could be used in the context of ordinal regression are analysed and combined in pairs for a MOEA.
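The parenthetical example can be made concrete with a short worked computation. The setting is an assumption chosen for illustration: $N$ patterns, of which 5% belong to class 1, and a classifier that assigns every class 1 pattern to class 2 while classifying all other patterns correctly. Here $\mathrm{MAE}_{j}$ denotes the MAE measured only on the patterns of class $j$, and ranks are coded as the integers 1, 2, 3.

```latex
\begin{align*}
\mathrm{MAE}   &= \frac{1}{N}\sum_{i=1}^{N} \lvert y_i - \hat{y}_i \rvert
                = \frac{0.05\,N \cdot \lvert 1 - 2 \rvert + 0.95\,N \cdot 0}{N} = 0.05, \\
\mathrm{MAE}_1 &= \lvert 1 - 2 \rvert = 1, \qquad
\mathrm{MAE}_2 = \mathrm{MAE}_3 = 0, \\
\mathrm{MMAE}  &= \max_{j \in \{1,2,3\}} \mathrm{MAE}_j = 1 .
\end{align*}
```

Under these assumptions the global MAE looks excellent, while MMAE exposes that one class is misclassified on every pattern.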

The present work aims at identifying which pair of ordinal classification performance metrics is most suitable to guide a MOEA towards classifiers with a good performance (considering both the order of the misclassification errors and the errors on the worst classified class). The most common ordinal classification performance metrics are reviewed, and some of them, together with the proposed metric, are selected to evaluate the performance of four nominal and ordinal classifiers. Then, a correlation study is done between all the metrics in order to find the least correlated ones. We hypothesize that the less correlated two metrics are, the more suitable they are to act as optimization objectives for the MOEA (given that all of them highlight positive aspects of the classifiers). The selected metrics are grouped into different pairs that will be simultaneously optimized by the MOEA. The base classifier considered is an ANN based on the proportional odds model (POM) [16], and it is evolved using a differential evolution MOEA [17], [18]. Finally, the generalization performance of the models obtained is studied with respect to the pair of metrics considered in the evolution. Because of its performance, the pair MMAE–MAE receives special attention: a relationship between the two metrics is derived and their graphical representation is studied.

This paper is a significant extension of a conference paper [19]. The new contributions are the following. A correlation analysis of the ordinal performance metrics is done, and the confusion matrices studied are changed and extended. In addition, the neural network model used has been replaced by another neural network based on the proportional odds model, and a local search procedure based on the iRprop⁺ algorithm [20] has been included in the MOEA to optimize the new model. Finally, two additional ordinal methods have been compared in the experimental section, and new tables describing the experiments and statistical tests have been included to reinforce the conclusions.

The rest of the paper is organized as follows: Section 2 presents a review and an experimental comparison of measures for ordinal classification; Section 3 details the ordinal ANN model based on the POM; Section 4 describes the training method employed; Section 5 describes the experimental design and the results obtained, while conclusions and future research are outlined in Section 6.

Section snippets

Measures of association in ordinal classification

This section presents both nominal and ordinal classification performance metrics commonly used in the literature. An empirical evaluation of the correlation between them is done in order to select the most relevant ones.

Let us define an ordinal classification problem as a problem where the purpose is to learn a model able to predict class labels, $C = \{C_{1}, C_{2}, \dots, C_{J}\}$ containing J labels, for unseen patterns after a training process. What makes the difference with nominal classification is that the

Ordinal model

One main issue of ordinal classification is that there is no notion of the precise distance between classes. The samples are labelled with a set of ranks that define different categories and an order. In this paper, the classical proportional odds model (POM) [16] adapted to ANNs [26] is considered. The POM works based on two elements: the first one is a linear layer with only one node (see Fig. 2) whose inputs are projected onto a line to give them an order, which facilitates ordinal classification. After

Method

To see how the selected metrics behave, this paper uses the MOEA described in [28]. The algorithm used is the memetic Pareto differential evolution neural network (MPDENN) algorithm developed by Storn and Price in [17] and modified by Abbass to train neural networks [18]. MPDENN is adapted according to the trade-off between the CCR and the MS analysed in [12], [29]. The fundamental bases of this algorithm are differential evolution (DE) and the concept of Pareto dominance. DE has often been

Experimental study

To verify the efficiency of the three proposals, 10 ordinal datasets have been used. Nine of them are benchmark datasets¹ and the other (Toy) has been generated following the guidelines in [34]. Table 6 shows the characteristics of the datasets used, including the number of patterns, the number of attributes (after transforming nominal attributes into binary ones), the number of classes and the class

Conclusion

This paper contributes an analysis of different state-of-the-art performance measures to evaluate an ordinal classifier. The aim of this analysis is selecting the best pair of metrics to guide a multi-objective evolutionary algorithm. In this analysis, the new MMAE metric is included. This metric is the highest MAE value from MAEs measured independently for each class, i.e. it evaluates the performance of the worst classified class. The analysis studies the correlations between the different

Acknowledgement

This work was supported in part by the TIN2011-22794 project of the Spanish Ministerial Commission of Science and Technology (MICYT), FEDER funds and the P11-TIC-7508 project of the “Junta de Andalucía” (Spain). Manuel Cruz-Ramírez’s research has been subsidized by the FPU Predoctoral Program (Spanish Ministry of Education and Science), grant reference AP2009-0487. Javier Sánchez-Monedero’s research has been funded by the “Junta de Andalucía” Ph.D. Student Program.

M. Cruz-Ramírez was born in Córdoba, Spain. He received the B.S. degree in Computer Science from the University of Córdoba, Spain, in 2009, and the M.S. in Soft Computing and Intelligent Systems from the University of Granada, Spain, in 2009. He is currently a Ph.D. Student in the Department of Computer Science and Numerical Analysis, University of Córdoba, Spain. His current research interests include neural networks, multi-objective evolutionary algorithms and their applications to real

References (38)

H. Abbass, An evolutionary artificial neural networks approach for breast cancer diagnosis, Artif. Intell. Med. (2002)
M. Cruz-Ramírez et al., Multi-objective evolutionary algorithm for donor-recipient decision system in liver transplants, Eur. J. Oper. Res. (2012)
W. Waegeman et al., ROC analysis in ordinal regression learning, Pattern Recognit. Lett. (2008)
C. Igel et al., Empirical evaluation of the improved Rprop learning algorithms, Neurocomputing (2003)
J. Verwaeren et al., Learning partial ordinal class memberships with kernel-based proportional odds models, Comput. Stat. Data Anal. (2012)
J.-X. Du et al., Shape recognition based on neural networks trained by differential evolution algorithm, Neurocomputing (2007)
B. Subudhi et al., Nonlinear system identification using memetic differential evolution trained neural networks, Neurocomputing (2011)
M. Cruz-Ramírez et al., A multi-objective neural network based method for cover crop identification from remote sensed data, Expert Syst. Appl. (2012)
R.W. Hodge et al., Class identification in the United States, Am. J. Sociol. (1968)
W. Chu, S.S. Keerthi, New approaches to support vector ordinal regression, in: Proceedings of the 22nd International...
C. Spearman, The proof and measurement of association between two things, Am. J. Psychol. (1904)
M.G. Kendall, Rank Correlation Methods (1962)
L. Goodman et al., Measures of association for cross classifications, J. Am. Stat. Assoc. (1954)
R.H. Somers, The rank analogue of product-moment partial correlation and regression with application to manifold, ordered contingency tables, Biometrika (1955)
S. Baccianella, A. Esuli, F. Sebastiani, Evaluation measures for ordinal regression, in: Proceedings of the Ninth...
K. Dembczyński, W. Kotlowski, R. Slowiński, Ordinal classification with decision rules, in: Proceedings of the...
J. Weston et al., Large scale image annotation: learning to rank with joint word-image embeddings, Mach. Learn. (2010)
J. Weston et al., Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval, J. New Music Res. (2011)
J.C. Fernández et al., Sensitivity versus accuracy in multi-class problems using memetic Pareto evolutionary neural networks, IEEE Trans. Neural Netw. (2010)


C. Hervás-Martínez was born in Cuenca, Spain. He received the B.S. degree in Statistics and Operations Research from the “Universidad Complutense”, Madrid, Spain, in 1978, and the Ph.D. degree in Mathematics from the University of Seville, Seville, Spain, in 1986. He is currently a Professor of Computer Science and Artificial Intelligence in the Department of Computer Science and Numerical Analysis, University of Córdoba, Spain, and an Associate Professor in the Department of Quantitative Methods, School of Economics, University of Córdoba. His current research interests include neural networks, evolutionary computation, and the modelling of natural systems.

J. Sánchez-Monedero was born in Córdoba, Spain. He received the B.S. in Computer Science from the University of Granada, Spain, in 2008 and the M.S. in Multimedia Systems from the University of Granada in 2009. In 2013 he obtained the Ph.D. degree in Information and Communication Technologies from the University of Granada. He is working as a researcher with the Department of Computer Science and Numerical Analysis at the University of Córdoba. His current research interests include computational intelligence methods and their applications, as well as distributed systems.

P.A. Gutiérrez was born in Córdoba, Spain. He received the B.S. degree in Computer Science from the University of Sevilla, Spain, in 2006, and the Ph.D. degree in Computer Science and Artificial Intelligence from the University of Granada, Spain, in 2009. He is currently an Assistant Professor in the Department of Computer Science and Numerical Analysis, University of Córdoba, Spain. His current research interests include neural networks and their applications, evolutionary computation, and hybrid algorithms.

^☆: This paper is a significant extension of the work “A preliminary study of ordinal metrics to guide a multi-objective evolutionary algorithm” appearing in the 11th International Conference on Intelligent Systems Design and Applications (ISDA2011).
