Abstract
This paper proposes a new algorithm for clustering categorical data based on the idea of coverage density, using average coverage density as the global criterion function. The algorithm can cluster large, sparse categorical databases effectively. An analysis of its time and space complexity shows that it uses less memory and time. Experiments on two real datasets illustrate its performance.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yan, H., Zhang, L., Zhang, Y. (2005). Clustering Categorical Data Using Coverage Density. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_30
DOI: https://doi.org/10.1007/11527503_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science, Computer Science (R0)