Abstract
This paper exposes some intrinsic characteristics of the spectral clustering method using tools from matrix perturbation theory. We construct a weight matrix of a graph and study its eigenvalues and eigenvectors. We show that the number of clusters equals the number of eigenvalues larger than 1, and that the number of points in each cluster can be approximated by the associated eigenvalue. We also show that the eigenvectors of the weight matrix can be used directly to perform clustering; that is, the directional angle between two row vectors of the matrix formed from the eigenvectors is a suitable distance measure for clustering. Based on these results, we develop an unsupervised spectral clustering algorithm based on the weight matrix (USCAWM). Experimental results on a number of artificial and real-world data sets confirm the theoretical analysis.
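The ideas summarized in the abstract can be illustrated with a minimal sketch. It is not the USCAWM algorithm itself, and several choices are assumptions not stated in the abstract: the weight matrix is taken to be a Gaussian affinity with unit diagonal, the bandwidth `sigma` and the angle threshold `max_angle` are hypothetical parameters, and the greedy grouping by angle stands in for whatever grouping step the paper uses. The toy data (three tight, well-separated clusters of 5, 8 and 12 points) is synthetic.

```python
import numpy as np

def gaussian_weight_matrix(X, sigma):
    """Assumed weight matrix: W[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def leading_eigenpairs(W, threshold=1.0):
    """Return the eigenvalues above `threshold` (descending) and their eigenvectors."""
    vals, vecs = np.linalg.eigh(W)           # W is symmetric
    order = np.argsort(vals)[::-1]           # sort descending
    vals, vecs = vals[order], vecs[:, order]
    k = int(np.sum(vals > threshold))        # estimated number of clusters
    return k, vals[:k], vecs[:, :k]

def group_by_angle(V, max_angle=np.pi / 4):
    """Greedy grouping (an assumption): rows of V whose directional angle
    to an unlabeled seed row is below `max_angle` share the seed's label."""
    rows = V / np.linalg.norm(V, axis=1, keepdims=True)
    labels = -np.ones(len(rows), dtype=int)
    next_label = 0
    for i in range(len(rows)):
        if labels[i] >= 0:
            continue
        angles = np.arccos(np.clip(rows @ rows[i], -1.0, 1.0))
        labels[(angles < max_angle) & (labels < 0)] = next_label
        next_label += 1
    return labels

# Synthetic data: three tight clusters of sizes 5, 8 and 12, far apart.
rng = np.random.default_rng(0)
sizes = [5, 8, 12]
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.05 * rng.standard_normal((n, 2))
               for c, n in zip(centers, sizes)])

W = gaussian_weight_matrix(X, sigma=1.0)
k, top_vals, V = leading_eigenpairs(W)
labels = group_by_angle(V)

print("estimated number of clusters:", k)
print("leading eigenvalues:", top_vals)
print("labels:", labels)
```

In this nearly block-diagonal setting, each block of the weight matrix is close to an all-ones matrix, so its dominant eigenvalue is close to the block size; this is why the eigenvalues above 1 both count the clusters and approximate their sizes. Rows of the eigenvector matrix belonging to the same cluster are nearly parallel, while rows from different clusters are nearly orthogonal, which is what makes the angle a usable distance here.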
Additional information
Supported by the National Natural Science Foundation of China (Grant No. 60375003) and the Aeronautical Science Foundation of China (Grant No. 03I53059)
Cite this article
Tian, Z., Li, X. & Ju, Y. Spectral clustering based on matrix perturbation theory. SCI CHINA SER F 50, 63–81 (2007). https://doi.org/10.1007/s11432-007-0007-8
DOI: https://doi.org/10.1007/s11432-007-0007-8