Abstract
A common issue in cluster analysis is that there is no single correct answer for the number of clusters, since cluster analysis involves subjective human judgement. Interactive visualization is one method by which users can choose proper clustering parameters. In this paper, a new clustering approach called CDCS (Categorical Data Clustering with Subjective factors) is introduced, in which a visualization tool for clustered categorical data is developed so that the result of adjusting parameters is reflected instantly. Experiments show that CDCS generates high-quality clusters compared to other typical algorithms.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chang, CH., Ding, ZK. (2004). Categorical Data Visualization and Clustering Using Subjective Factors. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_23
DOI: https://doi.org/10.1007/978-3-540-30076-2_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22937-7
Online ISBN: 978-3-540-30076-2
eBook Packages: Springer Book Archive