Constrained Local Graph Clustering by Colored Random Walk

Authors:
Yaowei Yan, The Pennsylvania State University, USA
Yuchen Bian, The Pennsylvania State University, USA
Dongsheng Luo, The Pennsylvania State University, USA
Dongwon Lee, The Pennsylvania State University, USA
Xiang Zhang, The Pennsylvania State University, USA

WWW '19: The World Wide Web Conference, May 2019, Pages 2137–2146
https://doi.org/10.1145/3308558.3313719

Published: 13 May 2019

ABSTRACT

Detecting local graph clusters is an important problem in big graph analysis. Given seed nodes in a graph, local clustering aims to find subgraphs around the seed nodes that consist of nodes highly relevant to them. However, existing local clustering methods either allow only a single seed node or assume that all seed nodes belong to the same cluster, which is not true in many real applications. Moreover, the assumption that all seed nodes are in a single cluster fails to use the crucial information carried by the relations between seed nodes. In this paper, we propose a method that takes advantage of such relationships. Given prior knowledge of the community membership of the seed nodes, the method assigns the same color to seed nodes in the same community and different colors to seed nodes in different communities. To further use this information, we introduce a color-based random walk mechanism, in which colors are propagated from the seed nodes to every node in the graph. Through the interaction of identical and distinct colors, the supervision provided by the seed nodes is incorporated into the random walk process. We also propose a heuristic strategy that speeds up the algorithm by more than two orders of magnitude. Experimental evaluations reveal that our clustering method outperforms state-of-the-art approaches by a large margin.
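
The color-propagation idea can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the update rule, so the code below assumes a random walk with restart per color and a simple, hypothetical interaction in which a walker of one color prefers neighbors where its own color already holds a larger share of the total score. The function names (`colored_walk`, `assign_colors`) and the parameters (`restart`, `iters`, `eps`) are illustrative assumptions.

```python
def colored_walk(adj, seeds, restart=0.15, iters=50, eps=1e-6):
    """Propagate one score vector per color from its seed nodes.

    adj:   dict mapping each node to a list of its neighbors (undirected graph).
    seeds: dict mapping each color to the list of seed nodes with that color.
    Assumed interaction rule (not from the paper): a step from u to v is
    weighted by the share of the walker's color in the total score at v.
    """
    nodes = list(adj)
    colors = list(seeds)

    # Restart distribution: uniform over the seed nodes of each color.
    start = {c: {v: 0.0 for v in nodes} for c in colors}
    for c in colors:
        for v in seeds[c]:
            start[c][v] = 1.0 / len(seeds[c])

    score = {c: dict(start[c]) for c in colors}
    for _ in range(iters):
        # Total score of all colors at each node, from the previous iteration.
        total = {v: sum(score[c][v] for c in colors) for v in nodes}
        updated = {}
        for c in colors:
            nxt = {v: restart * start[c][v] for v in nodes}
            for u in nodes:
                mass = score[c][u]
                if mass == 0.0 or not adj[u]:
                    continue
                # Same color attracts, other colors repel; eps keeps every
                # neighbor reachable so the weights never sum to zero.
                weights = [
                    (score[c][v] / total[v] if total[v] > 0 else 1.0 / len(colors)) + eps
                    for v in adj[u]
                ]
                norm = sum(weights)
                for v, w in zip(adj[u], weights):
                    nxt[v] += (1.0 - restart) * mass * w / norm
            updated[c] = nxt
        score = updated
    return score


def assign_colors(score):
    """Label each node with the color that has the highest score there."""
    nodes = next(iter(score.values()))
    return {v: max(score, key=lambda c: score[c][v]) for v in nodes}


# Toy graph: two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
seeds = {"red": [0], "blue": [5]}
labels = assign_colors(colored_walk(adj, seeds))
```

Because the transition weights depend on the scores accumulated so far, the walk in this sketch is not a fixed Markov chain, which is in the spirit of the "non-Markovian" author tag, though the actual mechanism in the paper may differ.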

Published in
WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558
Editors: Ling Liu (Georgia Tech, USA), Ryen White (Microsoft Research, USA)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
community detection
local graph clustering
non-Markovian
random walk
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%