An improved spectral clustering algorithm based on random walk

  • Research Article
  • Frontiers of Computer Science in China

Abstract

The construction of the similarity matrix has an important impact on the performance of spectral clustering algorithms. In this paper, we propose a random walk based approach to process the Gaussian kernel similarity matrix. In this method, the pairwise similarity between two data points depends not only on the two points themselves but also on their neighbors. As a result, the new similarity matrix is closer to the ideal matrix, which would yield the best clustering result. We give a theoretical analysis of the similarity matrix and apply it to spectral clustering. We also propose a method to handle noisy items that can degrade clustering performance. Experimental results on real-world data sets show that the proposed spectral clustering algorithm significantly outperforms existing algorithms.
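The idea of refining a Gaussian kernel similarity matrix with a random walk can be illustrated with a minimal sketch. The abstract does not state the exact formula, so everything below is an assumption made for illustration: the function names are hypothetical, the diagonal is set to zero, and the refined similarity is taken to be the symmetrized average of the first `steps` powers of the row-normalized transition matrix. It should not be read as the paper's actual construction. Plain Python is used to keep the example dependency-free.

```python
import math


def gaussian_similarity(points, sigma=1.0):
    """Gaussian kernel similarity matrix with a zero diagonal."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W


def transition_matrix(W):
    """Row-normalize W so that each row is a probability distribution."""
    P = []
    for row in W:
        total = sum(row)
        P.append([w / total if total > 0 else 0.0 for w in row])
    return P


def matmul(A, B):
    """Product of two dense matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def random_walk_similarity(W, steps=3):
    """Assumed refinement: symmetrized average of P, P^2, ..., P^steps.

    Entry (i, j) of P^t is the probability of reaching j from i in t steps,
    so the result reflects shared neighbors, not only the direct similarity.
    """
    P = transition_matrix(W)
    n = len(P)
    acc = [[0.0] * n for _ in range(n)]
    Pt = [row[:] for row in P]
    for _ in range(steps):
        for i in range(n):
            for j in range(n):
                acc[i][j] += Pt[i][j]
        Pt = matmul(Pt, P)
    return [[(acc[i][j] + acc[j][i]) / (2.0 * steps) for j in range(n)]
            for i in range(n)]


# Two well-separated groups of three points each (illustrative data).
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
S = random_walk_similarity(gaussian_similarity(points), steps=3)
```

In this sketch, `S` could be passed to any standard spectral clustering routine in place of the raw kernel matrix. The number of steps and the kernel width `sigma` are free parameters here, chosen arbitrarily.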



Author information

Corresponding author

Correspondence to Xianchao Zhang.

Additional information

Xianchao Zhang is a full professor at Dalian University of Technology, China. He received his B.S. degree in Applied Mathematics and M.S. degree in Computational Mathematics from National University of Defense Technology in 1994 and 1998, respectively. He received his Ph.D. in Computer Theory and Software from University of Science and Technology of China in 2000. He joined Dalian University of Technology in 2003 after two years of industry experience at international companies. He was a visiting scholar at The Australian National University and The City University of Hong Kong in 2005 and 2009, respectively. His research interests include algorithms, machine learning, data mining and information retrieval.

Quanzeng You received his B.S. degree from the School of Software, Dalian University of Technology, China in 2009. He is currently a master's candidate at Dalian University of Technology, China. He joined the Lab of Intelligent Information Processing at DUT in 2009, under the supervision of Prof. Xianchao Zhang. His research interests include spectral clustering, clustering, semi-supervised learning and other data mining techniques. He is especially interested in spectral methods. Currently, his research mainly focuses on improving spectral clustering and applying it to large-scale problems.

About this article

Cite this article

Zhang, X., You, Q. An improved spectral clustering algorithm based on random walk. Front. Comput. Sci. China 5, 268–278 (2011). https://doi.org/10.1007/s11704-011-0023-0

  • DOI: https://doi.org/10.1007/s11704-011-0023-0
