
Information Fusion

Volume 65, January 2021, Pages 72-83

Distance metric learning for augmenting the method of nearest neighbors for ordinal classification with absolute and relative information

https://doi.org/10.1016/j.inffus.2020.08.004

Highlights

  • Absolute and relative information are exploited for augmenting the method of nearest neighbors.

  • Both types of information are considered for learning an appropriate distance metric.

  • Distance metric learning improves ordinal classification performance.

Abstract

The performance of a classifier is often limited by the amount of labeled data (absolute information) available. In order to overcome this limitation, the incorporation of side information into the classification process has become a popular research topic in the field of machine learning. In this work, we propose a new method for ordinal classification that combines absolute information and a specific type of side information: relative information. In particular, this method exploits both types of information to learn an appropriate distance metric and subsequently incorporates the learned distance metric into the classical method of k nearest neighbors. The experimental results show that the proposed method attains a good performance in terms of some of the most popular (ordinal) classification performance measures.

Introduction

Ordinal classification problems are common in many fields of science, such as medicine [1], [2], image processing [3] and social sciences [4]. Typically, absolute information (i.e., examples with an explicitly given class label) is initially gathered for learning an ordinal classification model. Unfortunately, the performance of this ordinal classification model is limited by the amount of absolute information available for training. Admittedly, collecting a large amount of absolute information is usually time-consuming and costly. In order to improve the performance, additional side information is sometimes considered for ordinal classification [5]. For instance, in some application domains such as recommender systems [6], medical care [7] and food quality assessment [8], collecting relative information (i.e., preference orders for couples of examples) is actually easier than collecting absolute information. The challenge is thus how to combine absolute and relative information for augmenting the ordinal classification performance, typically in a setting where the former is limited and the latter more abundant.

Some contributions [9], [10] have already shown the benefits of combining absolute and relative information. For example, Herk et al. [10] jointly analyzed absolute information (ratings) and relative information (rankings). They found that both ratings and rankings are frequently used to measure values or preferences, but there is no consensus for choosing one type of information over the other. They concluded that both types of information are important for facilitating the reaching of a consensus and recommended using absolute and relative information simultaneously for attaining a complete understanding of datasets. In order to exploit both types of information at the same time, Sader et al. [8] recently proposed a method for ordinal classification that combines absolute evaluations from experts and relative evaluations from novices. This proposal amounts to solving a constrained convex optimization problem that contains many parameters to learn, which makes the model complex and hard to explain. For this very reason, we proposed a new ordinal classification method based on the method of k nearest neighbors (k-NN) that incorporates absolute and relative information [11].

In k-NN, the Euclidean distance metric is typically considered the standard for identifying the nearest neighbors. However, this distance metric might not adequately describe the hidden structure in a given dataset. Distance metric learning [12], [13], [14] thus became an interesting research topic that has been studied in many different scenarios. For instance, Wang et al. [15] considered distance metric learning in the context of image classification, Feng et al. [16] studied distance metric learning for imbalanced datasets and Wang et al. [17] dealt with distance metric learning in the setting of information coming from different sources. There has also been some interest in the combination of deep neural networks and distance metric learning [18].
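To illustrate how a learned metric enters k-NN (this is a minimal sketch with names of our choosing, not the paper's implementation), the classifier below replaces the Euclidean distance with a Mahalanobis distance d_M(x, z) = sqrt((x - z)^T M (x - z)) parametrized by a positive semi-definite matrix M; taking M as the identity recovers plain Euclidean k-NN:

```python
import numpy as np

def mahalanobis(x, z, M):
    """Distance d_M(x, z) = sqrt((x - z)^T M (x - z)) for a PSD matrix M."""
    d = x - z
    return float(np.sqrt(d @ M @ d))

def knn_predict(x, X_train, y_train, M, k=3):
    """Classify x by majority vote among its k nearest neighbors under d_M."""
    dists = [mahalanobis(x, xi, M) for xi in X_train]
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

Only the distance computation changes; the neighbor search and voting are exactly those of classical k-NN.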

Recently, many strategies have been proposed to learn an appropriate distance metric for ordinal classification [19], [20], [21], [22]. For example, Xiao et al. [23] used local neighborhoods to make the pairwise distances between examples with the same class label small and the distances between examples with different class labels large. Fouad et al. [24] incorporated additional information into ordinal classification tasks by changing the distance metric in the input space based on the order relation among the class labels. Their experimental results show that the proposed distance metric learning method improves the ordinal classification performance. Nguyen et al. [25] considered ordinal information as local triplet constraints such that, in case A ≺ B ≺ C or A ≻ B ≻ C, examples with class label A should be closer to examples with class label B than to examples with class label C. More specifically, they proposed a method that learns a suitable distance metric that (mostly) satisfies these constraints and subsequently incorporates the learned distance metric into k-NN.
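Triplet constraints of this kind can be enumerated mechanically from ordinal labels. The sketch below is our own illustration (ignoring the locality restriction of Nguyen et al. for brevity): with labels encoded as consecutive integers respecting the class order, it lists every triplet (i, j, l) for which example i should end up closer to example j than to example l:

```python
def ordinal_triplets(y):
    """Enumerate triplet constraints (i, j, l): example i should be closer
    to example j than to example l because |y_i - y_j| < |y_i - y_l|,
    with labels encoded as consecutive integers in the class order."""
    n = len(y)
    triplets = []
    for i in range(n):
        for j in range(n):
            for l in range(n):
                if i != j and i != l and abs(y[i] - y[j]) < abs(y[i] - y[l]):
                    triplets.append((i, j, l))
    return triplets
```

For three examples with labels 0 ≺ 1 ≺ 2, only the two extreme examples generate constraints: each should be closer to the middle example than to the other extreme.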

All the proposed distance metric learning strategies above only deal with absolute information. However, the setting in which a small amount of absolute information and a large amount of relative information are available is commonplace. Therefore, some additional constraints need to be imposed in order to incorporate the relative information into the learning of an appropriate distance metric. Similarly to the idea behind k-NN, where close examples tend to have the same class label, we also assume that close couples of examples tend to have the same order. Following this assumption, which has been successfully tested in [11], we aim at learning a product distance metric that makes the distances between couples with the same order relation small and the distances between couples with different order relations large.
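A product distance metric over couples can be sketched as follows. This is an assumption-laden illustration, not the paper's exact construction: we measure a couple-to-couple distance by combining the two component Mahalanobis distances with a 2-norm, whereas the paper's combination rule is not shown in this snippet:

```python
import numpy as np

def couple_distance(c1, c2, M):
    """Product distance between couples c1 = (a, b) and c2 = (c, d):
    combines the component distances d_M(a, c) and d_M(b, d) by a 2-norm.
    (Illustrative combination rule; the paper's may differ.)"""
    (a, b), (c, d) = c1, c2
    def dM(u, v):
        w = u - v
        return float(np.sqrt(w @ M @ w))
    return float(np.sqrt(dM(a, c) ** 2 + dM(b, d) ** 2))
```

Under such a metric, couples of examples that are componentwise close are close as couples, which is what the assumption above requires.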

In this paper, in order to combine absolute and relative information for distance metric learning, we incorporate the corresponding constraints from both types of information into an optimization process to obtain an optimal distance metric. Next, we incorporate the learned distance metric into k-NN for ordinal classification. We test our method on several available benchmark datasets. The experimental results show the usefulness of considering absolute and relative information and the effectiveness of our proposed distance metric learning method.
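By way of illustration only (the paper's actual optimization problem is not reproduced in this snippet), triplet-style constraints can be folded into a simple hinge-loss optimization over a linear transformation L, with M = LᵀL guaranteeing that the learned matrix defines a valid (positive semi-definite) metric:

```python
import numpy as np

def learn_metric(X, triplets, margin=1.0, lr=0.01, epochs=200):
    """Gradient descent on a hinge loss over triplets (i, j, l):
    push d(i, j)^2 at least `margin` below d(i, l)^2.
    Parametrizing M = L^T L keeps the learned metric PSD."""
    L = np.eye(X.shape[1])
    for _ in range(epochs):
        grad = np.zeros_like(L)
        for i, j, l in triplets:
            u, v = X[i] - X[j], X[i] - X[l]
            if margin + (L @ u) @ (L @ u) - (L @ v) @ (L @ v) > 0:
                # d/dL ||L u||^2 = 2 L u u^T (only violated triplets contribute)
                grad += 2 * L @ (np.outer(u, u) - np.outer(v, v))
        L -= lr * grad
    return L.T @ L
```

Once M is learned, it can be plugged into a Mahalanobis-based k-NN classifier, mirroring the pipeline described above.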

The remainder of this paper is structured as follows. Section 2 introduces the preliminaries and Section 3 proposes a new distance metric learning method that simultaneously exploits absolute and relative information. Experimental results and a corresponding analysis of these results are presented in Section 4. We end with some conclusions and open problems in Section 5.

Section snippets

Problem setting

The starting point is that of [11], in which a small amount of absolute information and a large amount of relative information are available. The goal is to exploit both types of information simultaneously in order to classify new examples.

Formally, the data includes two types of information: absolute information and relative information. The first type of information is collected in a set A = {(x1, y1), (x2, y2), …, (xn, yn)} with a set of input examples D = {x1, x2, …, xn}, where the input examples xi = (xi1, …
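The snippet is cut off before the relative information is formalized. As a rough illustration only (variable names are ours, and the exact representation in the paper may differ), the two kinds of data could be stored as follows, with relative information carrying an order between two examples rather than explicit labels:

```python
import numpy as np

# Absolute information: examples with an explicitly given class label.
A = [(np.array([0.1, 0.2]), 1),
     (np.array([0.5, 0.4]), 2)]

# Relative information: couples (x_a, x_b) meaning "x_a precedes x_b"
# in the class order, with no explicit label attached to either example.
R = [(np.array([0.1, 0.2]), np.array([0.9, 0.8]))]
```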

Combining absolute and relative information for distance metric learning for k-NN

In this section, we extend the distance metric learning method proposed in [25] in order to additionally consider relative information, thus exploiting absolute and relative information simultaneously.

Here, we combine both absolute and relative information to set the constraints for distance metric learning. Firstly, we use a similar idea to that of Nguyen et al. [25] to set the corresponding triplet constraints for absolute information. Secondly, since there are no explicit class labels for

Experiments

In this section, we describe the datasets, introduce the performance measures and analyze the experimental results. In order to show the effectiveness of combining absolute and relative information for distance metric learning, we compare the performances of the method of k-NN, distance metric learning for k-NN (DMLk-NN) as in [25], combining absolute and relative information for k-NN (ARk-NN) as in [11] and the here-proposed method DMLARk-NN. In particular, we show all results in the context of
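The snippet does not list the performance measures themselves. Two measures commonly used for ordinal classification (we do not claim these are exactly the ones reported in the paper) are the mean zero-one error (MZE) and the mean absolute error (MAE) on integer-encoded labels; a minimal sketch:

```python
import numpy as np

def mze(y_true, y_pred):
    """Mean zero-one error: fraction of misclassified examples."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

def mae(y_true, y_pred):
    """Mean absolute error between integer-encoded ordinal labels:
    unlike MZE, it penalizes predictions farther away in the class order."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_true - y_pred)))
```

MAE is the more informative of the two for ordinal problems, since confusing adjacent classes is a milder error than confusing distant ones.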

Conclusion

We have proposed a new distance metric learning method for ordinal classification for the setting in which a small amount of absolute information and a large amount of relative information is available. Both types of information are incorporated to set the constraints on a local neighborhood for learning an appropriate Mahalanobis distance metric. This learned distance metric is used for replacing the Euclidean distance metric when applying the method of k-NN with absolute and relative

CRediT authorship contribution statement

Mengzi Tang: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Drafting the manuscript, Revising the manuscript critically for important intellectual content. Raúl Pérez-Fernández: Conception and design of study, Analysis and/or interpretation of data, Drafting the manuscript, Revising the manuscript critically for important intellectual content. Bernard De Baets: Conception and design of study, Analysis and/or interpretation of data, Drafting the

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Mengzi Tang is supported by the China Scholarship Council (CSC). Raúl Pérez-Fernández acknowledges the support of the Research Foundation of Flanders, Belgium (FWO17/PDO/160) and the Spanish MINECO (TIN2017-87600-P). This research received funding from the Flemish Government, Belgium under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. All authors approved the version of the manuscript to be published.

References (43)

  • M. Cruz-Ramírez et al., Metrics to guide a multi-objective evolutionary algorithm for ordinal classification, Neurocomputing (2014)
  • W. Waegeman et al., Learning to rank: a ROC-based graph-theoretic approach, Pattern Recognit. Lett. (2008)
  • M. Pérez-Ortiz, P.A. Gutiérrez, C. García-Alonso, L. Salvador-Carulla, J.A. Salinas-Pérez, C. Hervás-Martínez, Ordinal...
  • O.M. Doyle et al., Predicting progression of Alzheimer's disease using ordinal regression, PLoS One (2014)
  • J. Chen, X. Liu, S. Lyu, Boosting with side information, in: Proceedings of the Asian Conference on Computer Vision,...
  • V.F. Farias et al., Learning preferences with side information, Manage. Sci. (2019)
  • Q. Nguyen et al., Learning classification models with soft-label information, J. Am. Med. Inform. Assoc. (2014)
  • A. Friedman et al., Global-scale location and distance estimates: common representations and strategies in absolute and relative judgments, J. Exp. Psychol. Learn. Mem. Cogn. (2006)
  • K.Q. Weinberger et al., Distance metric learning for large margin nearest neighbor classification, J. Mach. Learn. Res. (2009)
  • J. Mei et al., Logdet divergence-based metric learning with triplet constraints and its applications, IEEE Trans. Image Process. (2014)
  • H. Wang et al., Semantic discriminative metric learning for image similarity measurement, IEEE Trans. Multimed. (2016)