K Nearest Neighbor Edition to Guide Classification Tree Learning: Motivation and Experimental Results

Martínez-Otzeta, J. M.; Sierra, B.; Lazkano, E.; Astigarraga, A.

doi:10.1007/11677437_5

Chapter

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3755)

Abstract

This paper presents a new hybrid classifier that combines the distance-based Nearest Neighbor algorithm with the Classification Tree paradigm. The Nearest Neighbor algorithm is used as a preprocessing step that produces a modified training database, from which the classification tree structure is subsequently learned. The experimental section reports the results obtained with the new algorithm. Compared with classification trees induced from the original training data, the new approach performs equally well or better according to the Wilcoxon signed-rank statistical test.
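The preprocessing idea can be illustrated with a minimal sketch. The abstract does not say how the Nearest Neighbor step modifies the training database, so the code below assumes one common editing rule (a Wilson-style filter that drops every case whose label disagrees with the majority of its k nearest neighbours, found leave-one-out with Euclidean distance). The function name `knn_edit`, the parameter `k`, and the toy data are hypothetical and are not taken from the chapter; the authors' actual procedure may differ.

```python
from collections import Counter
import math


def knn_edit(X, y, k=3):
    """Return the indices of training cases kept after k-NN editing.

    Assumption (not stated in the abstract): a case is kept only if the
    majority label of its k nearest neighbours, found leave-one-out with
    Euclidean distance, matches its own label.
    """
    kept = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distances from case i to every other case (leave-one-out),
        # sorted ascending; ties are broken by index.
        neighbours = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )[:k]
        votes = Counter(y[j] for _, j in neighbours)
        predicted = votes.most_common(1)[0][0]
        if predicted == yi:
            kept.append(i)
    return kept


# Toy data: two clusters plus one case labelled "b" inside the "a" cluster.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
     (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
y = ["a", "a", "a", "b", "b", "b", "b"]

kept = knn_edit(X, y, k=3)
X_edited = [X[i] for i in kept]
y_edited = [y[i] for i in kept]
# X_edited and y_edited would then be handed to any classification tree
# learner in place of the original training data.
```

In this toy run the case at index 3 is the only one removed, since all three of its nearest neighbours carry the other label. The tree induction step itself is left out because the abstract does not name a specific tree learner.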

Author information

Authors and Affiliations

Department of Computer Science and Artificial Intelligence, University of the Basque Country, P. Manuel Lardizabal 1, 20018, Donostia-San Sebastián, Basque Country, Spain
J. M. Martínez-Otzeta, B. Sierra, E. Lazkano & A. Astigarraga

Editor information

Editors and Affiliations

The Australian Taxation Office,
Graham J. Williams
School of Computing and Mathematics, University of Western Sydney, Sydney, NSW, Australia
Simeon J. Simoff

About this chapter

Cite this chapter

Martínez-Otzeta, J.M., Sierra, B., Lazkano, E., Astigarraga, A. (2006). K Nearest Neighbor Edition to Guide Classification Tree Learning: Motivation and Experimental Results. In: Williams, G.J., Simoff, S.J. (eds) Data Mining. Lecture Notes in Computer Science, vol 3755. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11677437_5

DOI: https://doi.org/10.1007/11677437_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32547-5
Online ISBN: 978-3-540-32548-2
eBook Packages: Computer Science, Computer Science (R0)
