Clustering Categorical Data Using Coverage Density

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Abstract

This paper proposes a new algorithm for clustering categorical data based on the idea of coverage density. The algorithm uses average coverage density as its global criterion function and can cluster large, sparse categorical databases effectively. An analysis of its time and space complexity shows that it uses less memory and time. Experiments on two real datasets illustrate the performance of the proposed algorithm.
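
The abstract names coverage density and its average as the criterion but does not define either. As a minimal sketch only, the code below assumes one plausible reading: the coverage density of a cluster is the number of filled cells in its transaction-by-item table divided by the table's total size, and the average is weighted by cluster size. Both definitions and the function names `coverage_density` and `average_coverage_density` are assumptions made for illustration, not taken from the paper.

```python
from typing import List, Set

Cluster = List[Set[str]]  # each record is the set of its categorical values


def coverage_density(cluster: Cluster) -> float:
    """Assumed definition: filled cells / (records * distinct items)."""
    if not cluster:
        return 0.0
    distinct_items = set().union(*cluster)
    if not distinct_items:
        return 0.0
    filled_cells = sum(len(record) for record in cluster)
    return filled_cells / (len(cluster) * len(distinct_items))


def average_coverage_density(clusters: List[Cluster]) -> float:
    """Assumed criterion: size-weighted mean of per-cluster densities."""
    total_records = sum(len(c) for c in clusters)
    if total_records == 0:
        return 0.0
    weighted = sum(len(c) * coverage_density(c) for c in clusters)
    return weighted / total_records


if __name__ == "__main__":
    dense = [{"red", "small"}, {"red", "small"}]   # records share all items
    sparse = [{"red"}, {"large"}]                  # records share nothing
    print(coverage_density(dense))
    print(coverage_density(sparse))
    print(average_coverage_density([dense, sparse]))
```

Under this reading, a cluster whose records share most of their values scores close to 1, while a cluster of unrelated records scores low, so a partition that maximizes the average would tend to group similar records together.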

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yan, H., Zhang, L., Zhang, Y. (2005). Clustering Categorical Data Using Coverage Density. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_30

  • DOI: https://doi.org/10.1007/11527503_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27894-8

  • Online ISBN: 978-3-540-31877-4

  • eBook Packages: Computer Science, Computer Science (R0)
