One Dependence Value Difference Metric

https://doi.org/10.1016/j.knosys.2011.01.005

Abstract

Many distance-related algorithms depend on a good distance metric to be successful. The Value Difference Metric (simply VDM) was proposed to define a reasonable distance between each pair of instances with nominal attribute values only. In VDM, all of the attributes are assumed to be fully independent, and two values of an attribute are considered closer if they correlate more similarly with the output classes. However, the attribute independence assumption in VDM is rarely true in reality, which harms its performance in applications with complex attribute dependencies. In this paper, we propose an improved Value Difference Metric that relaxes this unrealistic attribute independence assumption. We call it the One Dependence Value Difference Metric (simply ODVDM). In ODVDM, structure learning algorithms for Bayesian network classifiers, such as tree augmented naive Bayes, are used to find the dependence relationships among the attributes. Our experimental results validate its effectiveness in terms of classification accuracy.

Section snippets

Introduction and related work

In instance-based learning [1], [2], [3], the distance metric plays the most important role. In fact, distance metrics are also widely used in other paradigms of machine learning, such as classification and clustering, and in other research fields, such as statistics, pattern recognition, and recommender systems [4]. Many distance metrics have been proposed. When all attributes are nominal, the simplest distance metric is the Overlap Metric, which we simply denote OM in this paper and which can be
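To make the baseline concrete, here is a minimal sketch of the Overlap Metric in Python; the function name and data layout are ours, purely for illustration:

```python
def overlap_metric(x, y):
    """Overlap Metric (OM): the distance between two instances with nominal
    attributes is simply the number of attributes on which they disagree."""
    assert len(x) == len(y)
    return sum(1 for a, b in zip(x, y) if a != b)

# Example: two instances that differ only in their second attribute.
print(overlap_metric(("red", "round", "small"), ("red", "square", "small")))  # -> 1
```

OM treats every mismatch as equally severe, whereas VDM grades how different two values are by comparing their conditional class distributions.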

One Dependence Value Difference Metric

Our research starts by looking back at the prediction made by Bayesian network classifiers. Given a test instance $x$, Bayesian network classifiers use Eq. (3) to predict its class label:

$$c(x) = \arg\max_{c \in C} P(c)\, P(a_1(x), \ldots, a_n(x) \mid c) \qquad (3)$$

According to Eq. (3), we need to estimate the conditional probability $P(a_1(x), \ldots, a_n(x) \mid c)$. However, fully estimating it is an NP-hard problem [12]. Similar to the attribute independence assumption made by VDM, naive Bayesian classifiers (simply NB) assume that all of the attributes
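The snippet is truncated before the formal definitions, so the following Python sketch shows only the standard VDM from the literature [2], [3]: two values of an attribute are close if their conditional class distributions $P(c \mid A_i = v)$ are close. All function names and the data layout are our own. ODVDM, as the paper describes, additionally conditions these probability terms on one parent attribute discovered by TAN-style structure learning; we note that extension in a comment rather than guessing at the paper's exact formulation.

```python
from collections import Counter, defaultdict

def fit_vdm(X, y, q=1):
    """Estimate the conditional class probabilities P(c | A_i = v) that VDM
    needs from training data X (tuples of nominal values) and labels y,
    and return a distance function over instances."""
    classes = sorted(set(y))
    n_attrs = len(X[0])
    # counts[i][v][c] = number of training instances with A_i = v and class c.
    # An ODVDM-style variant would instead count over (v, parent value) pairs,
    # with the parent of each attribute chosen by TAN-style structure learning.
    counts = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, c in zip(X, y):
        for i, v in enumerate(row):
            counts[i][v][c] += 1

    def p(i, v, c):
        total = sum(counts[i][v].values())
        return counts[i][v][c] / total if total else 0.0

    def vdm(x1, x2):
        # d(x1, x2) = sum_i sum_c |P(c | a_i(x1)) - P(c | a_i(x2))|^q
        return sum(abs(p(i, x1[i], c) - p(i, x2[i], c)) ** q
                   for i in range(n_attrs) for c in classes)

    return vdm

# Tiny usage example on made-up data.
X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
d = fit_vdm(X, y)
print(d(X[0], X[2]))  # -> 3.0: the two instances disagree on both attributes
```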

Experimental methodology and results

Distance-weighted k-nearest neighbor (simply KNNDW) is the most representative instance-based learning algorithm; in it, the distance metric is used twice: to find the neighbors and to weight them. It is therefore a natural first test bed for demonstrating the effectiveness of a distance metric, and we use it to validate the effectiveness of our ODVDM.
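For reference, here is a minimal sketch of distance-weighted kNN with a pluggable metric, assuming inverse-distance weighting (one common scheme; the paper's exact weighting function is not shown in this snippet):

```python
from collections import defaultdict

def knn_dw_predict(x, X_train, y_train, metric, k=5, eps=1e-9):
    """Distance-weighted kNN: the metric is used twice, first to FIND the k
    nearest neighbors of x and then to WEIGHT their votes (here by inverse
    distance, so closer neighbors contribute more)."""
    neighbors = sorted(zip(X_train, y_train), key=lambda t: metric(x, t[0]))[:k]
    votes = defaultdict(float)
    for xi, yi in neighbors:
        votes[yi] += 1.0 / (metric(x, xi) + eps)  # eps guards against d = 0
    return max(votes, key=votes.get)
```

Plugging the `vdm` function from the previous sketch in as `metric` yields a KNNDW classifier for nominal data; swapping in an ODVDM-style metric changes only the probability terms, not this loop.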

In our experiments, 10 UCI [23] classification data sets are used. We use them because they represent a wide range of domains and data characteristics and

Conclusions and future work

The Value Difference Metric (simply VDM) is widely used to measure the distance between pairs of instances with nominal attribute values only. In this paper, we proposed an improved Value Difference Metric by relaxing VDM's unrealistic attribute independence assumption. We call it the One Dependence Value Difference Metric (simply ODVDM). In our ODVDM, structure learning algorithms for Bayesian network classifiers, such as tree augmented naive Bayes, are used to find the dependence relationships

Acknowledgements

We thank Professors Liangxiao Jiang, Harry Zhang, and Jian Yu for their kind help. We also thank the anonymous reviewers for their very useful comments and suggestions. This work was supported by the National Natural Science Foundation of China under Grant Nos. 60905033 and 61071188, the Provincial Natural Science Foundation of Hubei under Grant Nos. 2009CDB139 and 2009CDB077, and the Fundamental Research Funds for the Central Universities under Grant No. CUGL090248.

References (30)

  • L. Jiang et al., Decision tree with better class probability estimation, International Journal of Pattern Recognition and Artificial Intelligence (2009)
  • D. Randall Wilson et al., Improved heterogeneous distance functions, Journal of Artificial Intelligence Research (1997)
  • C. Stanfill et al., Toward memory-based reasoning, Communications of the ACM (1986)
  • N. Friedman et al., Bayesian network classifiers, Machine Learning (1997)
  • C.K. Chow et al., Approximating discrete probability distributions with dependence trees, IEEE Transactions on Information Theory (1968)

Chaoqun Li is currently a Ph.D. candidate at China University of Geosciences (Wuhan). Her research interests include data mining and machine learning.

Hongwei Li, the doctoral supervisor of Chaoqun Li, is currently a professor in the Department of Mathematics at China University of Geosciences (Wuhan).
