Abstract
This paper proposes a fast spectral clustering method for large-scale data. The method adopts random projection and random sampling to reduce the dimensionality and cardinality of the data. Its computation time is quasi-linear in the data cardinality, and the clustering result can be updated at small computational cost when data samples or random samples are appended or removed.
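The pipeline the abstract describes, reducing dimensionality by random projection and cardinality by random sampling before the spectral step, can be sketched as follows. This is an illustrative outline under stated assumptions (a Gaussian Johnson-Lindenstrauss projection, a Nyström-style rectangular affinity block, and farthest-point-initialized k-means), not the authors' exact algorithm; the function and parameter names are hypothetical.

```python
import numpy as np

def fast_spectral_clustering(X, k, reduced_dim=10, n_samples=100, seed=0):
    """Sketch of spectral clustering sped up by random projection
    (dimensionality) and random sampling (cardinality). Illustrative only."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # 1) Random projection (Johnson-Lindenstrauss style): d -> reduced_dim.
    R = rng.standard_normal((d, reduced_dim)) / np.sqrt(reduced_dim)
    Xp = X @ R
    # 2) Random sampling: affinities only between all n points and m samples,
    #    so the full n x n affinity matrix is never formed.
    m = min(n_samples, n)
    idx = rng.choice(n, size=m, replace=False)
    sq = ((Xp[:, None, :] - Xp[idx][None, :, :]) ** 2).sum(-1)  # n x m sq. distances
    sigma = np.median(sq) + 1e-12                               # heuristic bandwidth
    C = np.exp(-sq / (2.0 * sigma))                             # n x m Gaussian affinities
    # 3) Degree-normalize the block; embed via its top-k left singular vectors.
    Cn = C / np.sqrt(np.outer(C.sum(axis=1), C.sum(axis=0)))
    U, _, _ = np.linalg.svd(Cn, full_matrices=False)
    emb = U[:, :k]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    # 4) k-means on the embedding; farthest-point init for a stable start.
    centers = [emb[rng.integers(n)]]
    for _ in range(k - 1):
        dist = np.min([((emb - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(50):
        labels = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = emb[labels == j].mean(axis=0)
    return labels
```

Because only the n x m block is decomposed, the dominant SVD cost is O(n m^2), which is linear in the data cardinality n for a fixed sample size m; this is the sense in which such schemes achieve quasi-linear scaling.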
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sakai, T., Imiya, A. (2009). Fast Spectral Clustering with Random Projection and Sampling. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2009. Lecture Notes in Computer Science, vol. 5632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03070-3_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03069-7
Online ISBN: 978-3-642-03070-3