Refined Shared Nearest Neighbors Graph for Combining Multiple Data Clusterings

  • Conference paper
Advances in Intelligent Data Analysis V (IDA 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2810)

Abstract

We recently introduced the idea of solving cluster ensembles using a Weighted Shared Nearest Neighbors Graph (WSnnG). Preliminary experiments showed promising results in integrating different clusterings into a combined one that reveals the natural cluster structure of the data. In this paper, we further study and extend the basic WSnnG. First, we use a fixed number of nearest neighbors to reduce the size of the graph. Second, we use refined weights on the edges and vertices of the graph. Experiments show that the similarity relationships between the data patterns can be captured on a compact refined graph. Furthermore, the quality of the combined clustering based on the proposed WSnnG surpasses both the average quality of the ensemble and that of an alternative clustering combining method based on partitioning the patterns’ co-association matrix.
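
The general idea of building a shared nearest neighbors graph from an ensemble with a fixed neighborhood size can be sketched as follows. This is only an illustration under stated assumptions, not the weighting scheme of the paper: the function names `co_association` and `snn_graph`, the mutual-neighbor edge rule, and the edge weight (shared neighbors divided by `k`) are all hypothetical choices made for the example, and the refined vertex weights mentioned in the abstract are not modeled.

```python
def co_association(labelings):
    """Fraction of clusterings in which each pair of patterns shares a cluster."""
    m = len(labelings)        # number of clusterings in the ensemble
    n = len(labelings[0])     # number of data patterns
    return [[sum(lab[i] == lab[j] for lab in labelings) / m for j in range(n)]
            for i in range(n)]


def snn_graph(similarity, k):
    """Edge weights from shared nearest neighbors with a fixed list size k.

    Assumption: two patterns are linked only if each appears in the other's
    k-nearest-neighbor list, and the weight is the shared-neighbor count / k.
    """
    n = len(similarity)
    neighbours = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Stable sort: ties in similarity are broken by pattern index.
        others.sort(key=lambda j: -similarity[i][j])
        neighbours.append(set(others[:k]))

    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in neighbours[i] and i in neighbours[j]:
                shared = len(neighbours[i] & neighbours[j])
                weights[i][j] = weights[j][i] = shared / k
    return weights


# Toy ensemble: three clusterings of six patterns (made-up labels).
labelings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
]
C = co_association(labelings)
W = snn_graph(C, k=2)
```

Keeping only `k` neighbors per pattern bounds the number of edges at roughly `n * k`, which is the sense in which a fixed neighborhood size keeps the graph compact; the resulting weighted graph would then be handed to a graph partitioner to obtain the combined clustering.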

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ayad, H., Kamel, M. (2003). Refined Shared Nearest Neighbors Graph for Combining Multiple Data Clusterings. In: Berthold, M.R., Lenz, H.-J., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_29

  • DOI: https://doi.org/10.1007/978-3-540-45231-7_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40813-0

  • Online ISBN: 978-3-540-45231-7

  • eBook Packages: Springer Book Archive