Abstract
When a classified data set is used to test clustering algorithms, the data points in each class are treated as one or more clusters in the data space. In this paper we adopt this principle to build classification models by interactively clustering a training data set into a tree of clusters. The leaf clusters of the tree are selected as decision clusters, and new data are classified according to a distance function. Feature weights are taken into account when calculating the distance between a new object and the center of a decision cluster. A new algorithm, W-k-means, calculates the feature weights automatically from the training data. The FastMap technique is used to handle outliers when selecting decision clusters, which increases the stability of the classifier. Experimental results on public domain data sets show that models built with this clustering approach outperformed some popular classification algorithms.
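The classification step described in the abstract, assigning a new object to the nearest decision cluster under a feature-weighted distance, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `weighted_sq_distance` and `classify` are hypothetical, the weighted squared Euclidean distance is an assumption (the abstract only says "a distance function"), and the feature weights are taken as given input rather than computed by W-k-means.

```python
from typing import Hashable, Sequence


def weighted_sq_distance(
    x: Sequence[float], center: Sequence[float], weights: Sequence[float]
) -> float:
    """Squared Euclidean distance with one weight per feature (assumed form)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, center))


def classify(
    x: Sequence[float],
    centers: Sequence[Sequence[float]],
    labels: Sequence[Hashable],
    weights: Sequence[float],
) -> Hashable:
    """Return the class label of the decision cluster whose center is nearest to x.

    centers[i] is the center of the i-th decision cluster (a leaf of the
    cluster tree) and labels[i] is the class assigned to that cluster.
    """
    distances = [weighted_sq_distance(x, c, weights) for c in centers]
    nearest = min(range(len(centers)), key=distances.__getitem__)
    return labels[nearest]


# Hypothetical toy data: two decision clusters in a two-feature space.
centers = [[0.0, 2.0], [1.0, 4.0]]
labels = ["A", "B"]
new_object = [1.0, 2.0]

# With equal weights the object is closer to cluster "A".
print(classify(new_object, centers, labels, [0.5, 0.5]))

# Emphasising the first feature moves the same object to cluster "B".
print(classify(new_object, centers, labels, [0.95, 0.05]))
```

The toy data show why the weights matter: the same object is assigned to different decision clusters depending on which feature dominates the distance.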
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jing, L., Huang, J., Ng, M.K., Rong, H. (2004). A Feature Weighting Approach to Building Classification Models by Interactive Clustering. In: Torra, V., Narukawa, Y. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2004. Lecture Notes in Computer Science, vol 3131. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27774-3_27
DOI: https://doi.org/10.1007/978-3-540-27774-3_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22555-3
Online ISBN: 978-3-540-27774-3
eBook Packages: Springer Book Archive