Abstract
Contrastive graph clustering (CGC) has become an active research topic; it aims to leverage the strong representational capability of contrastive learning to improve graph clustering performance. Recent work has shown that CGC benefits from hard sample mining. However, we observe two primary shortcomings of existing CGC methods that limit further gains in clustering performance. First, the widely used contrastive loss treats every element off the cross-view diagonal as a negative, which yields numerous false negatives. Second, without explicit cluster guidance, the learned node embeddings are poorly suited to clustering tasks. To address these issues, we propose a novel CGC method via enhanced hard sample mining and cluster-guiding (CGCEC). The method generates high-confidence pseudo-labels by clustering node embeddings during network training. We design a hard sample debiased mining loss that uses the pseudo-labels to remove false negative samples, repelling hard negatives while attracting hard positives and thus enhancing the discriminability of the learned embeddings. In addition, we use the encoder to transform node embeddings into semantic labels and match these semantic labels with the pseudo-labels, which encourages the network to learn node embeddings better suited to clustering. To validate the effectiveness of CGCEC, we compare it with state-of-the-art graph clustering methods on six benchmark datasets. The experimental results support the efficacy of our method and its superiority over competing approaches.
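The false-negative issue described in the abstract can be illustrated with a small sketch. The code below is not the CGCEC loss itself, whose exact form is not given here; it is a plain InfoNCE-style cross-view loss in which pairs that share a high-confidence pseudo-label are dropped from the set of negatives. The function name `cross_view_loss`, the `confident` flags, the temperature `tau`, and the toy embeddings are all assumptions made for illustration, and the hard-sample weighting and cluster-guiding steps are omitted.

```python
import math


def _normalise(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_view_loss(view1, view2, pseudo_labels=None, confident=None, tau=0.5):
    """InfoNCE-style loss between two views of the same n nodes.

    Without pseudo_labels, every off-diagonal cross-view pair is a negative.
    With pseudo_labels, off-diagonal pairs whose two nodes are both marked
    confident and share a label are treated as suspected false negatives
    and left out of the denominator (an illustrative assumption).
    """
    n = len(view1)
    a = [_normalise(v) for v in view1]
    b = [_normalise(v) for v in view2]
    if confident is None:
        confident = [True] * n
    total = 0.0
    for i in range(n):
        # Positive pair: the same node seen in both views (the diagonal).
        pos = math.exp(sum(x * y for x, y in zip(a[i], b[i])) / tau)
        neg = 0.0
        for j in range(n):
            if j == i:
                continue
            suspected_false_negative = (
                pseudo_labels is not None
                and confident[i]
                and confident[j]
                and pseudo_labels[i] == pseudo_labels[j]
            )
            if suspected_false_negative:
                continue  # same pseudo-cluster: do not push these apart
            neg += math.exp(sum(x * y for x, y in zip(a[i], b[j])) / tau)
        total += -math.log(pos / (pos + neg))
    return total / n


# Toy embeddings (hypothetical): nodes 0-1 and nodes 2-3 form two clusters.
view1 = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
view2 = [[0.9, 0.0], [1.0, 0.3], [0.0, 0.9], [0.3, 1.0]]
labels = [0, 0, 1, 1]

standard = cross_view_loss(view1, view2)
debiased = cross_view_loss(view1, view2, pseudo_labels=labels)
```

In this toy setting the debiased value is lower than the standard one, because same-cluster pairs no longer contribute to the denominator. When no two nodes share a pseudo-label, or no node is marked confident, the two losses coincide.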
Data Availability
Data will be made available upon reasonable request.
Acknowledgements
This work was supported by the Natural Science Basic Research Program of Shaanxi (2024JC-YBMS-473), the Key Scientific Research Program of the Education Department of Shaanxi Provincial Government (22JS019), the Xi’an Major Scientific and Technological Achievements Transformation Industrialization Project (23CGZHCYH0008), and the Humanities and Social Science Foundation of the Ministry of Education of China (24YJA880034).
Author information
Contributions
M.L.: Writing – review and editing, Writing – original draft, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. B.Y. and T.X.: Supervision, Resources, Funding acquisition. Y.Z.: Methodology, Formal analysis, Conceptualization. L.Z.: Methodology, Formal analysis, Conceptualization. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Li, M., Yang, B., Xue, T. et al. Contrastive graph clustering via enhanced hard sample mining and cluster-guiding. Multimedia Systems 30, 366 (2024). https://doi.org/10.1007/s00530-024-01567-7
DOI: https://doi.org/10.1007/s00530-024-01567-7