
An integrated approach for different attribute types in nearest neighbour classification

Published online by Cambridge University Press: 07 July 2009

W. Z. Liu
Affiliation:
Department of Information Science, University of Portsmouth, Locksway Road, Milton, Hampshire PO4 8JF, UK [Email: liuwt@sis.port.ac.uk]

Abstract

The basic nearest neighbour algorithm works by storing the training instances and classifying a new case by predicting that it has the same class as its nearest stored instance. Measuring the distance between instances requires a distance metric. When all attributes are numeric, the conventional nearest neighbour method treats examples as points in a feature space and uses Euclidean distance as the metric. In tasks with only nominal attributes, the simple “overlap” metric is usually used. To handle classification tasks with mixed attribute types, the two metrics are simply combined. Work in the machine learning field has shown that this approach performs poorly. This paper studies a more recently developed distance metric and shows that it is capable of measuring the importance of different attributes. Combined with discretisation of numeric-valued attributes, this metric provides an integrated way of dealing with problem domains containing mixtures of attribute types. Through detailed analyses, the paper aims to provide further insight into nearest neighbour classification techniques and to promote wider use of this type of algorithm.
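The two approaches the abstract contrasts are easy to sketch. The following is a minimal, illustrative Python sketch, not the paper's own implementation: all function names are hypothetical, and the value-difference-style metric follows the general formulation of Stanfill and Waltz, which is assumed here to be the kind of "more recently developed" metric the abstract refers to. The conventional approach combines Euclidean distance on numeric attributes with the overlap metric on nominal ones; the value-difference approach instead compares the class distributions that attribute values induce, so attributes whose values barely separate the classes contribute little to the distance.

```python
import math
from collections import Counter, defaultdict

# --- Conventional mixed-type distance described in the abstract ---

def overlap(a, b):
    """Simple overlap metric for nominal values: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0

def combined_distance(x, y, numeric_idx, nominal_idx):
    """Euclidean distance over numeric attributes combined with the overlap
    metric over nominal ones (the approach the abstract notes performs
    poorly)."""
    total = sum((x[i] - y[i]) ** 2 for i in numeric_idx)
    total += sum(overlap(x[i], y[i]) ** 2 for i in nominal_idx)
    return math.sqrt(total)

# --- Value-difference-style metric (assumed formulation, after
# --- Stanfill and Waltz); numeric attributes are discretised first ---

def vdm_tables(X, y, attr_idx):
    """Count class frequencies for each value of each nominal (or
    discretised) attribute, as a value difference metric requires."""
    tables = {i: defaultdict(Counter) for i in attr_idx}
    for row, label in zip(X, y):
        for i in attr_idx:
            tables[i][row[i]][label] += 1
    return tables

def vdm_distance(x, y, attr_idx, tables, classes, q=2):
    """Per-attribute value difference: two values are close when they
    induce similar class distributions, so the metric implicitly
    down-weights uninformative attributes."""
    total = 0.0
    for i in attr_idx:
        c1, c2 = tables[i][x[i]], tables[i][y[i]]
        n1 = sum(c1.values()) or 1  # guard against unseen values
        n2 = sum(c2.values()) or 1
        total += sum(abs(c1[c] / n1 - c2[c] / n2) ** q for c in classes)
    return total

def nn_classify(query, X, y, dist):
    """Basic 1-NN: predict the class of the nearest stored instance."""
    best = min(range(len(X)), key=lambda i: dist(query, X[i]))
    return y[best]
```

In this framework, numeric attributes would first be discretised (for example by entropy-based or chi-square methods) so that every attribute, numeric or nominal, is handled by the same value-difference computation; this is the integrated treatment of mixed attribute types that the abstract describes.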

Type
Research Article
Copyright
Copyright © Cambridge University Press 1996
