A Fuzzy Subspace Algorithm for Clustering High Dimensional Data

  • Conference paper
Advanced Data Mining and Applications (ADMA 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4093)

Abstract

In fuzzy clustering algorithms, each object has a fuzzy membership associated with each cluster, indicating the degree of association of the object with the cluster. Here we present a fuzzy subspace clustering algorithm, FSC, in which each dimension has a weight associated with each cluster, indicating the degree of importance of the dimension to the cluster. By using fuzzy techniques for subspace clustering, our algorithm avoids the difficulty of choosing appropriate cluster dimensions for each cluster during the iterations. Our analysis and simulations show that FSC is efficient and that the clustering results it produces are highly accurate.
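
The abstract does not state FSC's objective function or update rules, so the following is only a minimal sketch of the general idea it describes: a k-means-style loop in which every cluster keeps its own fuzzy weight for every dimension. The function name `fuzzy_subspace_sketch`, the fuzzifier `alpha`, the smoothing constant `eps`, and the inverse-dispersion weight update are all assumptions made for illustration and should not be read as the authors' algorithm.

```python
import numpy as np


def fuzzy_subspace_sketch(X, k, alpha=2.0, eps=1e-6, n_iter=20, seed=0):
    """Illustrative loop with per-cluster fuzzy dimension weights.

    Hypothetical sketch: the update rules below are assumed, not taken
    from the paper. Requires alpha > 1.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Start from k distinct data points and uniform weights (rows sum to 1).
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    weights = np.full((k, d), 1.0 / d)
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iter):
        # Weighted squared distance from every point to every center: (n, k).
        sq = (X[:, None, :] - centers[None, :, :]) ** 2
        dist = np.einsum("nkd,kd->nk", sq, weights ** alpha)
        labels = dist.argmin(axis=1)

        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # empty cluster: keep its previous center and weights
            centers[j] = members.mean(axis=0)
            # Assumed rule: a dimension with small within-cluster dispersion
            # gets a large weight; eps keeps the dispersion strictly positive.
            disp = ((members - centers[j]) ** 2).sum(axis=0) + eps
            inv = disp ** (-1.0 / (alpha - 1.0))
            weights[j] = inv / inv.sum()

    return labels, centers, weights
```

Because every dimension receives a graded weight in each cluster rather than being switched on or off, no hard choice of cluster dimensions is made inside the loop, which is the difficulty the abstract says the fuzzy formulation avoids.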

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gan, G., Wu, J., Yang, Z. (2006). A Fuzzy Subspace Algorithm for Clustering High Dimensional Data. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_30

  • DOI: https://doi.org/10.1007/11811305_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37025-3

  • Online ISBN: 978-3-540-37026-0

  • eBook Packages: Computer Science, Computer Science (R0)
