Abstract
We recently introduced the idea of solving cluster ensembles using a Weighted Shared nearest neighbors Graph (WSnnG). Preliminary experiments showed promising results in integrating different clusterings into a combined one that reveals the natural cluster structure of the data. In this paper, we further study and extend the basic WSnnG. First, we introduce the use of a fixed number of nearest neighbors in order to reduce the size of the graph. Second, we use refined weights on the edges and vertices of the graph. Experiments show that the similarity relationships between the data patterns can be captured on a compact, refined graph. Furthermore, the quality of the combined clustering based on the proposed WSnnG surpasses both the average quality of the ensemble and the quality of an alternative combining method based on partitioning the patterns' co-association matrix.
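The abstract does not give the graph construction or the refined weighting, so the following is only a minimal sketch of the general idea, under stated assumptions: pairwise similarity is taken to be the co-association (the fraction of clusterings that place two patterns together), each pattern keeps a fixed number `k` of nearest neighbors, and an edge weight is the plain count of shared neighbors between mutual neighbors. The function names, the mutual-neighbor rule, and the unrefined count are illustrative choices, not the paper's method.

```python
import numpy as np


def co_association(labelings):
    """Fraction of clusterings in which each pair of patterns shares a cluster."""
    labelings = np.asarray(labelings)  # shape (n_clusterings, n_patterns)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)


def snn_graph(similarity, k):
    """Shared-nearest-neighbor weights using k neighbors per pattern.

    Assumption: an edge exists only between mutual k-nearest neighbors, and
    its weight is the number of neighbors the two patterns have in common.
    """
    n = similarity.shape[0]
    sim = similarity.astype(float).copy()
    np.fill_diagonal(sim, -np.inf)  # a pattern is not its own neighbor
    # Indices of the k most similar patterns per row (stable sort for ties).
    order = np.argsort(-sim, axis=1, kind="stable")[:, :k]
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], order] = True
    counts = member.astype(int)
    shared = counts @ counts.T      # shared[i, j] = |NN(i) ∩ NN(j)|
    mutual = member & member.T      # i and j list each other as neighbors
    weights = np.where(mutual, shared, 0)
    np.fill_diagonal(weights, 0)
    return weights


# Hypothetical ensemble: three clusterings of six patterns.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
]
C = co_association(ensemble)
W = snn_graph(C, k=2)
```

With `k` fixed, each pattern contributes at most `k` edges, so the graph grows linearly with the number of patterns rather than quadratically. The resulting weight matrix could then be passed to a graph partitioner to obtain the combined clustering.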
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ayad, H., Kamel, M. (2003). Refined Shared Nearest Neighbors Graph for Combining Multiple Data Clusterings. In: Berthold, M.R., Lenz, H.-J., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_29
DOI: https://doi.org/10.1007/978-3-540-45231-7_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40813-0
Online ISBN: 978-3-540-45231-7
eBook Packages: Springer Book Archive