Accelerating spectral clustering with partial supervision

Data Mining and Knowledge Discovery

Abstract

Spectral clustering is a popular learning paradigm that uses the eigenvectors and eigenvalues of an appropriate input matrix to approximate the clustering objective. Despite its empirical success in diverse application areas, spectral clustering has been criticized for its inefficiency on large datasets. This is mainly because the complexity of most eigenvector algorithms is cubic in the number of instances, and even memory-efficient iterative eigensolvers (such as the Power Method) may converge very slowly to the desired eigenvector solutions. In this paper, inspired by related work on PageRank, we propose a semi-supervised framework for spectral clustering that provably improves the efficiency of the Power Method for computing the spectral clustering solution. The proposed method is particularly well suited to large, sparse matrices, where it is demonstrated to converge to the eigenvector solution within just a few Power Method iterations. The framework offers a novel perspective on semi-supervised spectral methods and demonstrates that the efficiency of spectral clustering can be enhanced not only by data compression but also by introducing an appropriate supervised bias into the input Laplacian matrix. Apart from the efficiency gains, the proposed framework is also shown to improve the quality of the derived cluster models.
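The slow convergence of the Power Method mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's semi-supervised framework. It is only the textbook Power Method applied to two small symmetric matrices that differ in the gap between their two largest eigenvalues. The helper names (`power_method`, `symmetric_with_spectrum`), the tolerance, and the chosen spectra are assumptions made for illustration.

```python
import numpy as np

def power_method(A, tol=1e-8, max_iter=10_000, seed=0):
    """Textbook Power Method for a symmetric matrix A.

    Returns (eigenvalue estimate, eigenvector estimate, iterations used).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for k in range(1, max_iter + 1):
        y = A @ x
        y /= np.linalg.norm(y)
        # Compare up to sign, since an eigenvector is only defined up to scale.
        if min(np.linalg.norm(y - x), np.linalg.norm(y + x)) < tol:
            return y @ A @ y, y, k
        x = y
    return x @ A @ x, x, max_iter

def symmetric_with_spectrum(eigenvalues, seed=1):
    """Build a symmetric matrix with the given eigenvalues (illustrative helper)."""
    rng = np.random.default_rng(seed)
    n = len(eigenvalues)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal basis
    return Q @ np.diag(eigenvalues) @ Q.T

# Same dominant eigenvalue, different second eigenvalue.
small_gap = symmetric_with_spectrum([1.0, 0.99, 0.5, 0.2, 0.1])
large_gap = symmetric_with_spectrum([1.0, 0.50, 0.4, 0.2, 0.1])

_, _, iters_small = power_method(small_gap)
_, _, iters_large = power_method(large_gap)
print("iterations with small eigengap:", iters_small)
print("iterations with large eigengap:", iters_large)
```

The error of the Power Method shrinks roughly by the ratio of the second-largest to the largest eigenvalue magnitude at each step, so the matrix with the larger gap needs far fewer iterations. This is the general property that motivates modifying the input matrix to speed up convergence.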

Author information

Corresponding author

Correspondence to Dimitrios Mavroeidis.

Additional information

Responsible editors: José L Balcázar, Francesco Bonchi, Aristides Gionis, Michèle Sebag.

About this article

Cite this article

Mavroeidis, D. Accelerating spectral clustering with partial supervision. Data Min Knowl Disc 21, 241–258 (2010). https://doi.org/10.1007/s10618-010-0191-9

  • DOI: https://doi.org/10.1007/s10618-010-0191-9
