Abstract
In distance metric learning, recent work has shown that the value difference metric (VDM), which relies on a strong attribute independence assumption, outperforms other existing distance metrics. An open question, however, is whether VDM with a less restrictive assumption can perform even better. Many approaches have been proposed to improve VDM by weakening the assumption. In this paper, we comprehensively survey the existing improved approaches and then propose a new approach that improves VDM by attribute weighting. We call the proposed distance function the attribute-weighted value difference metric (AWVDM). Moreover, we propose a modified attribute-weighted value difference metric (MAWVDM) by incorporating the learned attribute weights into the conditional probability estimates of AWVDM. Both AWVDM and MAWVDM significantly outperform VDM while retaining its computational simplicity. Experimental results on a large number of UCI data sets validate the performance of AWVDM and MAWVDM.
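To make the idea of attribute weighting concrete, the following is a minimal sketch rather than the method of this paper. It assumes one common formulation of VDM for nominal attributes, in which the distance between two instances is the sum over attributes and classes of |P(c | a_i(x)) − P(c | a_i(y))|, and it multiplies each attribute's contribution by a weight. The function names `fit_vdm` and `weighted_vdm`, the uniform fallback for unseen values, and the caller-supplied `weights` are all illustrative assumptions; the abstract does not specify how AWVDM learns its weights or exactly where they enter the formula.

```python
from collections import Counter, defaultdict


def fit_vdm(rows, labels):
    """Estimate P(c | a_i = v) for every attribute index i and value v."""
    classes = sorted(set(labels))
    counts = defaultdict(Counter)  # (attribute index, value) -> class counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            counts[(i, value)][label] += 1
    probs = {}
    for key, class_counts in counts.items():
        total = sum(class_counts.values())
        probs[key] = [class_counts[c] / total for c in classes]
    return probs, len(classes)


def weighted_vdm(x, y, probs, n_classes, weights=None):
    """Return sum_i w_i * sum_c |P(c | x_i) - P(c | y_i)|.

    With weights=None every w_i is 1.0, which gives plain VDM.
    The weights are supplied by the caller (hypothetical values here).
    """
    if weights is None:
        weights = [1.0] * len(x)
    # Assumption: attribute values never seen in training fall back to
    # a uniform class distribution.
    uniform = [1.0 / n_classes] * n_classes
    total = 0.0
    for i, (xv, yv) in enumerate(zip(x, y)):
        px = probs.get((i, xv), uniform)
        py = probs.get((i, yv), uniform)
        total += weights[i] * sum(abs(a - b) for a, b in zip(px, py))
    return total


# Toy data: the first attribute determines the class, the second is noise.
rows = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
labels = ["yes", "yes", "no", "no"]
probs, n_classes = fit_vdm(rows, labels)

plain = weighted_vdm(("red", "small"), ("blue", "large"), probs, n_classes)
down_weighted = weighted_vdm(
    ("red", "small"), ("blue", "large"), probs, n_classes, weights=[0.5, 1.0]
)
```

In this toy data the second attribute carries no class information, so it contributes nothing to the distance under any weight; only the weight on the first attribute changes the result.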
Acknowledgments
Chaoqun Li was supported by NSFC (61203287). Liangxiao Jiang was partially supported by the NCET Program (NCET-12-0953) and the Wuhan Chenguang Program (2015070404010202).
Cite this article
Li, C., Jiang, L., Li, H. et al. Toward value difference metric with attribute weighting. Knowl Inf Syst 50, 795–825 (2017). https://doi.org/10.1007/s10115-016-0960-x
DOI: https://doi.org/10.1007/s10115-016-0960-x