Skip to main content
Log in

Toward value difference metric with attribute weighting

  • Regular Paper
  • Published:
Knowledge and Information Systems Aims and scope Submit manuscript

Abstract

In distance metric learning, recent work has shown that value difference metric (VDM) with a strong attribute independence assumption outperforms other existing distance metrics. However, an open question is whether VDM with a less restrictive assumption can perform even better. Many approaches have been proposed to improve VDM by weakening the assumption. In this paper, we make a comprehensive survey on the existing improved approaches and then propose a new approach to improve VDM by attribute weighting. We name the proposed new distance function as attribute-weighted value difference metric (AWVDM). Moreover, we propose a modified attribute-weighted value difference metric (MAWVDM) by incorporating the learned attribute weights into the conditional probability estimates of AWVDM. AWVDM and MAWVDM significantly outperform VDM and inherit the computational simplicity of VDM simultaneously. Experimental results on a large number of UCI data sets validate the performance of AWVDM and MAWVDM.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1
Fig. 2
Fig. 3

Similar content being viewed by others

References

  1. Aha D (1992) Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms. Int J Man Mach Stud 36(2):267–287

    Article  Google Scholar 

  2. Aha D, Kibler D, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6:37–66

    Google Scholar 

  3. Alcalá-Fdez J, Fernandez A, Luengo J, Derrac J, García S, Sánchez L, Herrera F (2011) KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J Multiple-Valued Logic Soft Comput 17(2–3):255–287

    Google Scholar 

  4. Atkeson CG, Moore AW, Schaal S (1997) Locally weighted learning. Artif Intell Rev 11:11–73

    Article  Google Scholar 

  5. Bian W, Tao D (2012) Constrained empirical risk minimization framework for distance metric learning. IEEE Trans Neural Netw Learn Syst 23(8):1194–1205

    Article  Google Scholar 

  6. Blanzieri E, Ricci F (1999) Probability based metrics for nearestneighbor classification and case-based reasoning. In: Proceedings of the 3rd international conference on case-based reasoning. Springer, pp 14–28

  7. Cattral R, Oppacher F, Deugo D (2002) Evolutionary data mining with automatic rule generalization. Recent advances in computers, computing and communications. WSEAS Press, pp 296–300

  8. Chen C, Zhang J, Fleischer R (2010) Distance approximating dimension reduction of riemannian manifolds. IEEE Trans Syst Man Cybern Part B: Cybern 40(1):208–217

    Article  Google Scholar 

  9. Chen C, Zhuang Y, Nie F, Yang Y, Wu F, Xiao J (2011) Learning a 3D human pose distance metric from geometric pose descriptor. IEEE Trans Vis Comput Graphics 17(11):1676–1689

    Article  Google Scholar 

  10. Cheng V, Li CH, Kwok JT, Li CK (2004) Dissimilarity learning for nominal data. Pattern Recogn 37(7):1471–1477

    Article  Google Scholar 

  11. Cleary JG, Trigg LE (1995) K*: An instance-based learner using an entropic distance measure. In: Proceedings of the 12th international conference on machine learning. Morgan Kaufmann, Tahoe City, pp 108–114

  12. Cost S, Salzberg S (1993) A weighted nearest neighbor algorithm for learning with symbolic features. Mach Learn 10:57–78

    Google Scholar 

  13. Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

    MathSciNet  MATH  Google Scholar 

  14. Deufemia V, Risi M, Tortora G (2014) Sketched symbol recognition using latent-dynamic conditional random fields and distance-based clustering. Pattern Recogn 47(3):1159–1171

    Article  Google Scholar 

  15. Diday E (1974) Recent progress in distance and similarity measures in pattern recognition. In: Proceedings of the 2th international joint conference on pattern recognition, pp 534–539

  16. Frank A, Asuncion A (2010) UCI machine learning repository. University of California, Irvine

    Google Scholar 

  17. Frank E, Hall M, Pfahringer B (2003) Locally weighted naive bayes. In: Proceedings of the 19th conference on uncertainty in artificial intelligence (UAI’03). Morgan Kaufmann, San Francisco, pp 249–256

  18. Garcia S, Herrera F (2008) An extension on statistical comparisons of classifiers over multiple data sets for all pairwise comparisons. J Mach Learn Res 9:2677–2694

    MATH  Google Scholar 

  19. Grossman D, Domingos P (2004) Learning bayesian network classifiers by maximizing conditional likelihood. In: Proceedings of the 21st international conference on machine learning. ACM, pp 361–368

  20. Guo Y, Greiner R (2005) Discriminative model selection for belief net structures. In: Proceedings of the 12th National Conference on Artificial Intelligence, AAAI, pp 770–776

  21. Hall M (2007) A decision tree-based attribute weighting filter for naive bayes. Knowl-Based Syst 20:120–126

    Article  Google Scholar 

  22. Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning. In: Proceedings of the 17th international conference on machine learning. Morgan Kaufmann, Stanford, pp 359–366

  23. Hinneburg A, Aggarwal C, Keim D (2000) What is the nearest neighbor in high dimensional spaces? In: Proceedings of the 26th international conference on very large data bases. Cairo, pp 506–515

  24. Jiang L, Cai Z, Zhang H, Wang D (2013) Naive bayes text classifiers: a locally weighted learning approach. J Exp Theor Artif Intell 25(2):273–286

    Article  Google Scholar 

  25. Jiang L, Li C (2013) An augmented value difference measure. Pattern Recogn Lett 34(10):1169–1174

    Article  Google Scholar 

  26. Jiang L, Li C, Wang S, Zhang L (2016) Deep feature weighting for naive bayes and its application to text classification. Eng Appl Artif Intell 52:26–39

    Article  Google Scholar 

  27. Jiang L, Li C, Zhang H, Cai Z (2014) A novel distance function: Frequency difference metric. Int J Pattern Recognit Artif Intell 28(2):1451002

    Article  Google Scholar 

  28. Jiang L, Wang D, Cai Z (2012) Discriminatively weighted naive bayes and its application in text classification. Int J Artif Intell Tools 21:1250007

    Article  Google Scholar 

  29. Jiang L, Zhang H (2006) Learning naive bayes for probability estimation by feature selection. In: Proceedings of the 19th Canadian conference on artificial intelligence. Springer, pp 503–514

  30. Kasif S, Salzberg S, Waltz D, Rachlin J, Aha D (1998) A probabilistic framework for memory-based reasoning. Artif Intell 104:287–311

    Article  MathSciNet  MATH  Google Scholar 

  31. Li C, Jiang L, Li H (2014) Local value difference metric. Pattern Recogn Lett 49:62–68

    Article  Google Scholar 

  32. Li C, Jiang L, Li H (2014) Naive bayes for value difference metric. Front Comput Sci 8(2):255–264

    Article  MathSciNet  Google Scholar 

  33. Li C, Jiang L, Li H, Wang S (2013) Attribute weighted value difference metric. In: Proceedings of the 25th IEEE international conference on tools with artificial intelligence. IEEE, pp 575–580

  34. Li C, Li H (2011) One dependence value difference metric. Knowl-Based Syst 24(5):589–594

    Article  Google Scholar 

  35. Li C, Li H (2012) A modified short and fukunaga metric based on the attribute independence assumption. Pattern Recogn Lett 33(9):1213–1218

    Article  Google Scholar 

  36. Li C, Li H (2013) Selective value difference metric. J Comput 8(9):2232–2238

    Google Scholar 

  37. Liu B, Wang M, Hong R, Zha Z, Hua X (2010) Joint learning of labels and distance metric. IEEE Trans Syst Man Cybern Part B: Cybern 40(3):973–978

    Article  Google Scholar 

  38. Ma L, Yang X, Tao D (2014) Person re-identification over camera networks using multi-task distance metric learning. IEEE Trans Image Process 23(8):3656–3670

    Article  MathSciNet  Google Scholar 

  39. Mitchell TM (1997) Machine learning, 1st edn. McGraw-Hill, New York

    MATH  Google Scholar 

  40. Myles JP, Hand DJ (1990) The multi-class metric problem in nearest neighbour discrimination rules. Pattern Recogn 23(11):1291–1297

    Article  Google Scholar 

  41. Nadeau C, Bengio Y (2003) Inference for the generalization error. Mach Learn 52(3):239–281

    Article  MATH  Google Scholar 

  42. Noh YK, Zhang BT, Lee DD (2010) Generative local metric learning for nearest neighbor classification. In: Proceedings of the 24th annual conference on neural information processing systems. Curran Associates, Inc., pp 1822–1830

  43. Qiu C, Jiang L, Li C (2015) Not always simple classification: learning superparent for class probability estimation. Expert Syst Appl 42(13):5433–5440

    Article  Google Scholar 

  44. Sangineto E (2013) Pose and expression independent facial landmark localization using dense-surf and the hausdorff distance. IEEE Trans Pattern Anal Mach Intell 35(3):624–638

    Article  Google Scholar 

  45. Short RD, Fukunaga K (1981) The optimal distance measure for nearest neighbour classification. IEEE Trans Inf Theory 27:622–627

    Article  MATH  Google Scholar 

  46. Stanfill C, Waltz D (1986) Toward memory-based reasoning. Commun ACM 29:1213–1228

    Article  Google Scholar 

  47. Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Artif Intell Res 6:1–34

    MathSciNet  MATH  Google Scholar 

  48. Witten IH, Frank E, Hall MA (2011) Data mining: practical machine learning tools and techniques, 3rd edn. Morgan Kaufmann, San Francisco

    Google Scholar 

  49. Yang, L. (2006), Distance metric learning: a comprehensive survey, Technical report, Department of Computer Science and Engineering, Michigan State University

  50. Yu J, Rui Y, Tang YY, Tao D (2014) High-order distance-based multiview stochastic learning in image classification. IEEE Trans Cybern 44(12):2431–2442

    Article  Google Scholar 

  51. Yu J, Tao D, Li J, Cheng J (2014) Semantic preserving distance metric learning and applications. Inf Sci 281:674–686

    Article  MathSciNet  Google Scholar 

  52. Yu J, Wang M, Tao D (2012) Semi-supervised multiview distance metric learning for cartoon synthesis. IEEE Trans Image Process 21(11):4636–464

    Article  MathSciNet  Google Scholar 

  53. Zaidi NA, Cerquides J, Carman MJ, Webb GI (2013) Alleviating naive bayes attribute independence assumption by attribute weighting. J Mach Learn Res 14:1947–1988

    MathSciNet  MATH  Google Scholar 

  54. Zhang H, Sheng S (2004) Learning weighted naive bayes with accurate ranking. In: Proceedings of the 4th IEEE international conference on data mining. IEEE, pp 567–570

Download references

Acknowledgments

Chaoqun Li was supported by NSFC (61203287). Liangxiao Jiang was partially supported by the NCET Program (NCET-12-0953) and the Wuhan Chenguang Program (2015070404010202).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Liangxiao Jiang.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Li, C., Jiang, L., Li, H. et al. Toward value difference metric with attribute weighting. Knowl Inf Syst 50, 795–825 (2017). https://doi.org/10.1007/s10115-016-0960-x

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10115-016-0960-x

Keywords

Navigation