Abstract
This paper presents a new hybrid classifier that combines the distance-based Nearest Neighbor algorithm with the Classification Tree paradigm. The Nearest Neighbor algorithm is used as a preprocessing step to obtain a modified training database, from which the classification tree structure is subsequently learned. The experimental section reports the results obtained by the new algorithm. Compared with classification trees induced from the original training data, the new approach performs as well or better according to the Wilcoxon signed-rank statistical test.
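The abstract does not say how the Nearest Neighbor step modifies the training database, so the following is only a minimal sketch under a stated assumption: every case that leave-one-out k-NN misclassifies is duplicated, giving it more weight when a tree is later induced. The function names `loo_knn_predict` and `knn_edit`, the duplication rule, and the toy data are all hypothetical illustrations, not the authors' procedure.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def loo_knn_predict(X, y, i, k=1):
    """Predict the label of case i from its k nearest neighbours,
    excluding case i itself (leave-one-out)."""
    neighbours = sorted(
        (j for j in range(len(X)) if j != i),
        key=lambda j: dist(X[i], X[j]),
    )[:k]
    votes = Counter(y[j] for j in neighbours)
    return votes.most_common(1)[0][0]


def knn_edit(X, y, k=1):
    """Return a modified training set.

    ASSUMPTION (not stated in the abstract): each case misclassified by
    leave-one-out k-NN is appended once more, so it appears twice.
    """
    X_new, y_new = list(X), list(y)
    for i in range(len(X)):
        if loo_knn_predict(X, y, i, k) != y[i]:
            X_new.append(X[i])
            y_new.append(y[i])
    return X_new, y_new


# Toy data: two tight clusters plus one class-"b" case lying near cluster "a".
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
     (0.3, 0.3)]
y = ["a", "a", "a", "b", "b", "b", "b"]

X_mod, y_mod = knn_edit(X, y, k=1)
```

The modified pair `(X_mod, y_mod)` would then be passed to any classification tree learner in place of the original data; the tree induction itself is left out here because the abstract does not name a specific tree algorithm.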
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Martínez-Otzeta, J.M., Sierra, B., Lazkano, E., Astigarraga, A. (2006). K Nearest Neighbor Edition to Guide Classification Tree Learning: Motivation and Experimental Results. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_5
DOI: https://doi.org/10.1007/11677437_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer Science, Computer Science (R0)