Abstract
Graph clustering is a central and fundamental problem in numerous graph mining applications, especially in spatial-temporal systems. The goal of graph local clustering is to find a set of nodes (a cluster) that contains a given seed node and has high internal density. A series of works have addressed this problem by carefully designing the quality metric and improving the efficiency-effectiveness trade-off. However, they are unable to provide a satisfying guarantee on clustering quality. In this paper, we investigate the graph local clustering task and propose an end-to-end framework, LearnedNibble, to address the aforementioned limitation. In particular, we propose several techniques, including a practical self-supervised supervision scheme with a differentiable soft-mean-sweep operator, an effective optimization method with the regradient technique, and a scalable inference scheme based on the Approximate Graph Propagation (AGP) paradigm and a search-selective method. To the best of our knowledge, LearnedNibble is the first attempt to take responsibility for cluster quality and to consider both effectiveness and efficiency in an end-to-end, self-supervised paradigm. Extensive experiments on real-world datasets demonstrate the clustering capacity, generalization ability, and approximation compatibility of our LearnedNibble framework.

Data availability
The graph datasets that support the findings of this study are available in SNAP project, https://snap.stanford.edu/data/index.html.
References
Girvan, M., Newman, M.E.: Community structure in social and biological networks. Proc. Nat. Acad. Sci. 99(12), 7821–7826 (2002)
Wasserman, S., Faust, K., et al.: Social network analysis: Methods and applications (1994)
Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.-U.: Complex networks: Structure and dynamics. Phys. Rep. 424(4–5), 175–308 (2006). https://doi.org/10.1016/j.physrep.2005.10.009
Lu, Z., Wahlström, J., Nehorai, A.: Community detection in complex networks via clique conductance. Sci. Rep. 8(1), 1–16 (2018)
Wang, M., Wang, C., Yu, J.X., Zhang, J.: Community detection in social networks: an in-depth benchmarking study with a procedure-oriented framework. Proc. VLDB Endow. 8(10), 998–1009 (2015)
Fortunato, S.: Community detection in graphs. Phys. Rep. 486 (3–5), 75–174 (2010)
Leskovec, J., Lang, K.J., Mahoney, M.: Empirical comparison of algorithms for network community detection. In: Proceedings of the 19th International Conference on World Wide Web, pp. 631–640 (2010)
Yi, F., Moon, I.: Image segmentation: A survey of graph-cut methods. In: 2012 International Conference on Systems and Informatics (ICSAI2012), pp. 1936–1941. IEEE (2012)
Vicente, S., Kolmogorov, V., Rother, C.: Graph cut based image segmentation with connectivity priors. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)
Felzenszwalb, P.F., Huttenlocher, D.P.: Efficient graph-based image segmentation. Int. J. Comput. Vis. 59(2), 167–181 (2004)
Tolliver, D.A., Miller, G.L.: Graph partitioning by spectral rounding: Applications in image segmentation and clustering. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), vol. 1, pp. 1053–1060. IEEE (2006)
Liao, C.-S., Lu, K., Baym, M., Singh, R., Berger, B.: IsoRankN: Spectral methods for global alignment of multiple protein networks. Bioinformatics 25(12), 253–258 (2009)
Voevodski, K., Teng, S. -H., Xia, Y.: Finding local communities in protein networks. BMC Bioinform. 10(1), 1–14 (2009)
Zhou, S., Yang, X., Chang, Q.: Spatial clustering analysis of green economy based on knowledge graph. Journal of Intelligent & Fuzzy Systems (Preprint), 1–10 (2021)
Foysal, K.H., Chang, H.J., Bruess, F., Chong, J.W.: Smartfit: Smartphone application for garment fit detection. Electronics 10(1), 97 (2021)
Zhu, D., Shen, G., Chen, J., Zhou, W., Kong, X.: A higher-order motif-based spatiotemporal graph imputation approach for transportation networks. Wirel. Commun. Mob. Comput., 2022 (2022)
Spielman, D.A., Teng, S. -H.: Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In: Proceedings of the Thirty-sixth Annual ACM Symposium on Theory of Computing, pp. 81–90 (2004)
Andersen, R., Chung, F., Lang, K.: Local graph partitioning using pagerank vectors. In: 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), pp. 475–486. IEEE (2006)
Andersen, R., Peres, Y.: Finding sparse cuts locally using evolving sets. In: Proceedings of the Forty-first Annual ACM Symposium on Theory of Computing, pp. 235–244 (2009)
Spielman, D.A., Teng, S. -H.: A local clustering algorithm for massive graphs and its application to nearly linear time graph partitioning. SIAM J. Comput. 42(1), 1–26 (2013)
Lovász, L., Simonovits, M.: The mixing rate of Markov chains, an isoperimetric inequality, and computing the volume. In: Proceedings [1990] 31st Annual Symposium on Foundations of Computer Science, pp. 346–354. IEEE (1990)
Lovász, L., Simonovits, M.: Random walks in a convex body and an improved volume algorithm. Random Struct. Algor. 4(4), 359–412 (1993)
Andersen, R., Chung, F.: Detecting sharp drops in pagerank and a simplified local partitioning algorithm. In: International Conference on Theory and Applications of Models of Computation, pp. 1–12. Springer (2007)
Chung, F.: The heat kernel as the pagerank of a graph. Proc. Natl. Acad. Sci. 104(50), 19735–19740 (2007)
Kloster, K., Gleich, D.F.: Heat kernel based community detection. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1386–1395 (2014)
Li, P., Chien, I., Milenkovic, O.: Optimizing generalized pagerank methods for seed-expansion community detection. Adv. Neural Inf. Process. Syst., 32 (2019)
Wang, H., He, M., Wei, Z., Wang, S., Yuan, Y., Du, X., Wen, J.-R.: Approximate graph propagation. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 1686–1696 (2021)
Page, L., Brin, S., Motwani, R., Winograd, T.: The pagerank citation ranking: Bringing order to the Web. Stanford InfoLab, Technical report (1999)
Chung, F., Simpson, O.: Solving linear systems with boundary conditions using heat kernel pagerank. In: International Workshop on Algorithms and Models for the Web-Graph, pp. 203–219. Springer (2013)
Yang, R., Xiao, X., Wei, Z., Bhowmick, S.S., Zhao, J., Li, R. -H.: Efficient estimation of heat kernel pagerank for local clustering. In: Proceedings of the 2019 International Conference on Management of Data, pp. 1339–1356 (2019)
Flake, G.W., Lawrence, S., Giles, C.L.: Efficient identification of web communities. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 150–160 (2000)
Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)
Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proc. Nat. Acad. Sci. 101(9), 2658–2663 (2004)
Newman, M.E.: Modularity and community structure in networks. Proc. Nat. Acad. Sci. 103(23), 8577–8582 (2006)
Kobourov, S.G., Pupyrev, S., Simonetto, P.: Visualizing graphs as maps with contiguous regions. In: EuroVis (Short Papers) (2014)
Cheeger, J.: A lower bound for the smallest eigenvalue of the Laplacian. Probl. Anal. 625(195-199), 110 (1970)
Cox, I.J., Rao, S.B., Zhong, Y.: “ratio regions”: a technique for image segmentation. In: Proceedings of 13th International Conference on Pattern Recognition, vol. 2, pp. 557–564. IEEE (1996)
Sharon, E., Galun, M., Sharon, D., Basri, R., Brandt, A.: Hierarchy and adaptivity in segmenting visual scenes. Nature 442(7104), 810–813 (2006)
Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. Knowl. Inf. Syst. 42(1), 181–213 (2015)
Benson, A.R., Gleich, D.F., Leskovec, J.: Higher-order organization of complex networks. Science 353(6295), 163–166 (2016)
Tsourakakis, C.E., Pachocki, J., Mitzenmacher, M.: Scalable motif-aware graph clustering. In: Proceedings of the 26th International Conference on World Wide Web, pp. 1451–1460 (2017)
Yin, H., Benson, A.R., Leskovec, J., Gleich, D.F.: Local higher-order graph clustering. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 555–564 (2017)
Ma, W., Cai, L., He, T., Chen, L., Cao, Z., Li, R.: Local expansion and optimization for higher-order graph clustering. IEEE Internet Things J. 6(5), 8702–8713 (2019)
Huang, S., Li, Y., Bao, Z., Li, Z.: Towards efficient motif-based graph partitioning: An adaptive sampling approach. In: 2021 IEEE 37th International Conference on Data Engineering (ICDE), pp. 528–539. IEEE (2021)
Zhou, D., Zhang, S., Yildirim, M.Y., Alcorn, S., Tong, H., Davulcu, H., He, J.: High-order structure exploration on massive graphs: A local graph clustering perspective. ACM Trans. Knowl. Discov. Data (TKDD) 15(2), 1–26 (2021)
Chhabra, A., Faraj, M.F., Schulz, C.: Local motif clustering via (hyper) graph partitioning. arXiv:2205.06176 (2022)
Emmons, S., Kobourov, S., Gallant, M., Börner, K.: Analysis of network clustering algorithms and cluster quality metrics at scale. PLoS ONE 11(7), e0159161 (2016)
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3), 379–423 (1948)
Meilă, M.: Comparing clusterings—an information based distance. J. Multivar. Anal. 98(5), 873–895 (2007)
Vinh, N.X., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 2837–2854 (2010)
Avron, H., Horesh, L.: Community detection using time-dependent personalized pagerank. In: International Conference on Machine Learning, pp. 1795–1803. PMLR (2015)
Kloumann, I.M., Ugander, J., Kleinberg, J.: Block models and personalized pagerank. Proc. Natl. Acad. Sci. 114(1), 33–38 (2017)
Li, Y., Liu, J., Lin, G., Hou, Y., Mou, M., Zhang, J.: Gumbel-softmax-based optimization: a simple general framework for optimization problems on graphs. Comput. Soc. Netw. 8(1), 1–16 (2021)
Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: First steps. Soc. Netw. 5(2), 109–137 (1983)
Weiss, P.: L’hypothèse du champ moléculaire et la propriété ferromagnétique. J. Phys. Theor. Appl. 6(1), 661–690 (1907)
Klicpera, J., Weißenberger, S., Günnemann, S.: Diffusion improves graph learning. Advances in Neural Information Processing Systems, 32 (2019)
Berberidis, D., Nikolakopoulos, A.N., Giannakis, G.B.: Adaptive diffusions for scalable learning over graphs. IEEE Trans. Signal Process. 67(5), 1307–1321 (2018)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv:1412.6980 (2014)
Leskovec, J., Sosič, R.: SNAP: A general-purpose network analysis and graph-mining library. ACM Trans. Intell. Syst. Technol. (TIST) 8(1), 1–20 (2016)
Getoor, L.: Link-based classification. In: Advanced Methods for Knowledge Discovery from Complex Data, pp. 189–207. Springer (2005)
Namata, G., London, B., Getoor, L., Huang, B., EDU, U.: Query-driven active surveying for collective classification. In: 10th International Workshop on Mining and Learning with Graphs, vol. 8, p. 1 (2012)
Acknowledgements
The author would like to thank Wang Hanzhi and Zhang Ruoqi for their selfless and solid technical support. This work is partially supported by the Fundamental Research Funds for the Central Universities (No.2020JS005).
Funding
This work is partially supported by the Fundamental Research Funds for the Central Universities (No.2020JS005).
Author information
Authors and Affiliations
Contributions
Yuan Zhe devised the methods and framework, wrote the whole manuscript text and prepared all materials.
Corresponding author
Ethics declarations
Human and Animal Ethics
Not applicable.
Ethics approval and consent to participate
Not applicable.
Consent for Publication
Not applicable.
Competing interests
The author has no relevant financial or non-financial interests to disclose.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A: additional experiments
Data sources
We obtain the DBLP and Amazon datasets from the Stanford Network Analysis Project (SNAP) [59], and the rest from their original works [60, 61]. Table 4 presents the basic information of the datasets used in our experiments, and Figure 2 gives an overview of the conductances of the ground-truth clusters. The conductances of the labeled clusters are rather large, which causes the information-based metrics to conflict with the structure-based metrics, as we note in the following part.
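For concreteness, the conductance we discuss throughout can be sketched as follows: φ(S) = cut(S) / min(vol(S), vol(V∖S)). This is a minimal illustration on a toy graph; the adjacency-dict representation is ours for exposition, not the paper's actual data pipeline.

```python
def conductance(adj, cluster):
    """Conductance of a node set: cut edges divided by the smaller
    of the set's volume and its complement's volume.
    adj: {node: set of neighbors}; cluster: iterable of nodes."""
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_s = sum(len(adj[u]) for u in cluster)
    vol_total = sum(len(adj[u]) for u in adj)
    denom = min(vol_s, vol_total - vol_s)
    return cut / denom if denom > 0 else 1.0

# Toy graph: two triangles joined by a single bridge edge (2-3).
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
print(conductance(adj, {0, 1, 2}))  # 1 cut edge / volume 7 ≈ 0.1429
```

A well-separated cluster (one triangle) cuts only the bridge edge, giving the low conductance a good local cluster should have.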
Competitor considerations
Since the effectiveness challenge has received little study and few works target the conductance metric as we do, there is no specific prior algorithm that serves as a direct competitor to LearnedNibble. Moreover, the work presented here does not aim to beat any baseline; rather, it reveals the capacity of the GPR measure family and explores how to realize it while remaining compatible with the mainstream approximate algorithms.
Comparisons
For GPR instances, we evaluate them by grid-searching a set of parameters with 2,000 trials each, which is also the training budget for LearnedNibble, and take the best performance as their clustering capacity. Specifically, we search α over [0, 1] with step 0.0005 for PPR, h over [1, 20] with step 0.01 for HKPR, and 𝜃 over [0, 1] with step 0.005 while varying the power of 𝜃, which determines ϕ, over {1, 5, 20, 50, 100} for IPR. For MEAN, we directly compute its exact conductance by the standard sweep operation. For GSO, we set a training budget of 200,000 since it has far more parameters to train.
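The PPR grid search above can be sketched as follows: for each α, compute the personalized PageRank vector from the seed and apply the standard sweep operation to obtain the best-prefix conductance. This is a toy-scale illustration using dense power iteration; the paper's actual evaluation uses the approximate propagation machinery and a far finer grid (step 0.0005).

```python
import numpy as np

def ppr_scores(A, seed, alpha, iters=500):
    """Personalized PageRank by power iteration: p = alpha*e_s + (1-alpha)*P^T p."""
    n = A.shape[0]
    deg = A.sum(axis=1)
    P = A / deg[:, None]                 # row-stochastic transition matrix
    e = np.zeros(n); e[seed] = 1.0
    p = e.copy()
    for _ in range(iters):
        p = alpha * e + (1 - alpha) * P.T @ p
    return p

def sweep(A, scores):
    """Standard sweep: order nodes by degree-normalized score and return
    the minimum conductance over all proper prefixes."""
    deg = A.sum(axis=1)
    order = np.argsort(-scores / deg)
    vol_total = deg.sum()
    in_set = np.zeros(A.shape[0], dtype=bool)
    best, cut, vol_s = 1.0, 0.0, 0.0
    for u in order[:-1]:                 # skip the full set (empty complement)
        in_set[u] = True
        vol_s += deg[u]
        # adding u: gains deg(u)-k boundary edges, removes the k edges into the set
        cut += deg[u] - 2 * A[u, in_set].sum()
        best = min(best, cut / min(vol_s, vol_total - vol_s))
    return best

# Toy graph: two triangles joined by one bridge edge.
A = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    A[u, v] = A[v, u] = 1.0

best_alpha, best_phi = None, 1.0
for alpha in np.arange(0.05, 1.0, 0.05):   # coarse grid for illustration
    phi = sweep(A, ppr_scores(A, seed=0, alpha=alpha))
    if phi < best_phi:
        best_alpha, best_phi = alpha, phi
```

On this toy graph the sweep recovers the seed's triangle with conductance 1/7 for any reasonable α, which is exactly the "take the best performance over the grid" protocol described above.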
1.1 A.1 Training details
We give LearnedNibble full access to the graph adjacency matrix in the training phase but keep the algorithm local in the inference phase, like other computation-based graph local clustering algorithms. The reason the algorithm is not thoroughly local is twofold. 1) First, we should use the whole graph during training, since the topology is an integrated whole and cannot be sampled like data points in Euclidean space. 2) Second, we expect the framework to generalize well to the whole graph, which is the crucial property we rely on to develop the scalability and practicality of LearnedNibble; keeping the training phase local would conflict with this purpose.
For the trainable weighting parameters, we normalize the weight vector w to unit 1-norm, ||w||1 = 1, in the inference phase but leave it unconstrained in the training phase for the sake of numerical stability.
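This inference-time normalization is a simple rescaling, sketched below (a minimal illustration; the function name and example values are ours):

```python
import numpy as np

def normalize_weights(w):
    """Rescale the trained weight vector to unit 1-norm for inference.
    During training, w is left unconstrained for numerical stability."""
    return w / np.abs(w).sum()

w = np.array([0.5, -1.0, 2.5])   # unconstrained weights after training
w_hat = normalize_weights(w)     # now sums to 1 in absolute value
```

Normalizing only at inference keeps the training loss landscape free of the projection step while still making weight vectors comparable across runs.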
1.2 A.2 Clustering capacity details
Comparisons
Table 2 reports the average conductance over the 5 training seed nodes with the final model on each dataset. The first 4 columns are the GPR family instances and the trivial MEAN pooling operation. The GSO column represents the GSO framework [53]. The last column, titled GPR, is our LearnedNibble framework.
Results with approximation in detail
We report the results on the different datasets in turn and list them in Table 5.
1.3 A.3 Generalization ability details
Comparisons
To see more clearly, we report the generalization ability of our LearnedNibble framework against the competitors in two aspects. 1) In-Cluster: we run inference on nodes randomly selected from the same cluster as the training seed nodes, represented by the c columns in Table 3. 2) In-Graph: we run inference on nodes randomly selected from the whole graph, represented by the g columns in Table 3. We report the average conductance over the 50 testing nodes with the final model on each dataset.
Results with approximation
We report the results on the different datasets in turn, for both the in-cluster and in-graph settings not shown in Section 4, in Figure 3.
1.4 A.4 Parameter sensitivity
Initialization comparisons
We test the sensitivity to initialization by training our LearnedNibble framework from different starting weights. Specifically, we use the PPR weighting vector with teleport constant α = 0.1 to challenge our model, and the IPR weighting vector with 𝜃 = 0.99, ϕ = 0.99^10 for the IPR test. The comparison results on the different datasets are listed in Table 6. Training from different initializations achieves similar, though slightly different, performance. The trivial MEAN and RAW initializations perform a little better, and IPR, with its theoretical advantage, also performs well in some cases.
Regradient and locality regularization
We investigate the regradient technique proposed in this work through ablation experiments. At the same time, we test the performance of the locality regularization term popular in Graph Neural Networks (GNNs), which keeps the information diffusion local by minimizing the 2-norm of the difference between the graph signal after propagation and the initial signal, a one-hot vector in our setting, i.e., \(\|\mathrm{gpr} - \vec{1}_{s}\|_{2}\). Table 7 presents the results of both under the exact setting with 𝜖 = 0. The settings with R = 1 outperform their counterparts with R = 0, and the setting R = 1, L = 0, i.e., with the regradient technique and without the commonly used locality regularization, achieves the best performance in all situations.
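The locality regularization term tested above can be sketched as follows (an illustrative computation of the penalty alone; the function name, example vectors, and the weighting coefficient L are ours):

```python
import numpy as np

def locality_penalty(p, seed):
    """||p - e_s||_2: distance between the propagated graph signal p and
    the one-hot seed indicator e_s, i.e., the locality regularization term."""
    e = np.zeros_like(p)
    e[seed] = 1.0
    return float(np.linalg.norm(p - e))

# A fully localized signal incurs zero penalty; a diffused one does not.
p_local = np.array([1.0, 0.0, 0.0])
p_diffuse = np.array([0.4, 0.3, 0.3])

# In the ablation, L = 0 drops this term from the training loss entirely:
# loss = clustering_objective + L * locality_penalty(p, seed)
```

With L = 0 the penalty vanishes from the loss, which is the best-performing configuration reported in Table 7.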
Cite this article
Yuan, Z. Self-supervised end-to-end graph local clustering. World Wide Web 26, 1157–1179 (2023). https://doi.org/10.1007/s11280-022-01081-8