DOI: 10.1145/3308558.3313719
research-article

Constrained Local Graph Clustering by Colored Random Walk

Published: 13 May 2019

ABSTRACT

Detecting local graph clusters is an important problem in big graph analysis. Given seed nodes in a graph, local clustering aims to find subgraphs around the seed nodes that consist of nodes highly relevant to them. However, existing local clustering methods either allow only a single seed node or assume that all seed nodes belong to the same cluster, which does not hold in many real applications. Moreover, assuming that all seed nodes lie in a single cluster discards the crucial information carried by the relations between seed nodes. In this paper, we propose a method that takes advantage of these relations. Given prior knowledge of the community membership of the seed nodes, the method assigns the same color to seed nodes in the same community and different colors to seed nodes in different communities. To exploit this information, we introduce a color-based random walk mechanism in which colors are propagated from the seed nodes to every node in the graph. Through the interaction of identical and distinct colors, the supervision provided by the seed nodes is incorporated into the random walk process. We also propose a heuristic strategy that speeds up the algorithm by more than two orders of magnitude. Experimental evaluations show that our clustering method outperforms state-of-the-art approaches by a large margin.
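
The idea of propagating colors from seed nodes can be illustrated with a much simpler baseline. The sketch below is not the colored random walk proposed in the paper, whose update rule is not given in the abstract. It runs an ordinary random walk with restart independently for each color, with no interaction between colors, and labels every node with the color that scores highest. The function name `colored_rwr`, the parameters `restart` and `iters`, and the toy graph are all assumptions made for illustration.

```python
from collections import defaultdict


def colored_rwr(edges, seeds_by_color, restart=0.15, iters=100):
    """Hypothetical baseline: one independent random walk with restart per color.

    edges: iterable of undirected (u, v) pairs.
    seeds_by_color: dict mapping a color to a list of seed nodes in the graph.
    Returns (labels, scores); scores[color][node] is the walk's visiting probability.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    nodes = sorted(adj)

    scores = {}
    for color, seeds in seeds_by_color.items():
        # Restart distribution: uniform over the seeds of this color.
        restart_vec = {n: 0.0 for n in nodes}
        for s in seeds:
            restart_vec[s] += 1.0 / len(seeds)
        p = dict(restart_vec)
        for _ in range(iters):
            # With probability `restart` jump back to a seed, else follow an edge.
            nxt = {n: restart * restart_vec[n] for n in nodes}
            for u in nodes:
                share = (1.0 - restart) * p[u] / len(adj[u])
                for v in adj[u]:
                    nxt[v] += share
            p = nxt
        scores[color] = p

    # Each node takes the color whose walk visits it most often.
    labels = {n: max(scores, key=lambda c: scores[c][n]) for n in nodes}
    return labels, scores


# Toy graph (assumed): two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels, scores = colored_rwr(edges, {"red": [0], "blue": [5]})
print(labels)
```

On this toy graph the red seed claims its own triangle and the blue seed claims the other. Because the walks here never interact, the baseline cannot use the fact that differently colored seeds belong to different communities, which is the supervision the paper's method is designed to exploit.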

  • Published in

    WWW '19: The World Wide Web Conference
    May 2019
    3620 pages
    ISBN: 9781450366748
    DOI: 10.1145/3308558

    Copyright © 2019 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
