DOI: 10.1145/3428757.3429095

Fast Clustering of Hypergraphs Based on Bipartite-Edge Restoration and Node Reachability

Published: 27 January 2021

Abstract

In recent years, hypergraphs, which generalize graphs and can represent relationships among two or more nodes, have been actively studied; however, clustering methods for them have not yet been established. In this study, we propose a fast clustering method in which a hypergraph is expanded into a bipartite graph by treating hyperedges as nodes, the relationship between each node and hyperedge is quantified by TFIDF, and that value is used as the weight of the corresponding bipartite edge. Our algorithm efficiently identifies clusters by restoring bipartite edges in descending order of TFIDF weight and merging nodes that are reachable along the restored edges into clusters. Based on bipartite-edge restoration and node reachability, the method achieves efficient and effective clustering of large-scale hypergraphs. Furthermore, by applying our method to hyperedges, one can obtain hard-partitioned hyperedges and, by exploiting them, soft-partitioned nodes. Experimental evaluations using small-scale artificial and large-scale real datasets show that our method outputs more accurate clusters in terms of modularity and the F1 score between the estimated and actual clusters. In addition, the execution time of our method is significantly shorter than that of the existing method used for comparison. As for soft clustering, our method produces clusters of more balanced size.
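
The restoration-and-reachability idea described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the exact TFIDF formula nor the stopping rule, so the following are assumptions made only for illustration: tf is taken as 1/|e| for a node in hyperedge e, idf as log(|E| / deg(v)), and restoration stops once a target number of clusters k remains. The function names `tfidf_weights` and `cluster_hypergraph` are hypothetical, and reachability is tracked with a union-find structure rather than whatever data structure the paper uses.

```python
import math
from collections import defaultdict


def tfidf_weights(hyperedges):
    """Return (weight, hyperedge_index, node) triples for every incidence.

    Assumed weighting (not taken from the paper):
    tf = 1 / |e|, idf = log(|E| / deg(v)).
    """
    m = len(hyperedges)
    degree = defaultdict(int)
    for edge in hyperedges:
        for v in edge:
            degree[v] += 1
    weights = []
    for e, edge in enumerate(hyperedges):
        for v in edge:
            tf = 1.0 / len(edge)
            idf = math.log(m / degree[v])
            weights.append((tf * idf, e, v))
    return weights


def cluster_hypergraph(hyperedges, num_nodes, k):
    """Restore bipartite edges by descending weight until k node clusters remain."""
    # Bipartite vertices: nodes are 0..num_nodes-1, hyperedge e is num_nodes + e.
    parent = list(range(num_nodes + len(hyperedges)))
    # True for a root whose component already contains an original node.
    has_node = [i < num_nodes for i in range(len(parent))]

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    clusters = num_nodes
    # Ties are broken by hyperedge index, then node index (an assumption).
    order = sorted(tfidf_weights(hyperedges), key=lambda t: (-t[0], t[1], t[2]))
    for _, e, v in order:
        if clusters <= k:  # assumed stopping rule
            break
        root_v, root_e = find(v), find(num_nodes + e)
        if root_v == root_e:
            continue  # already reachable from each other
        if has_node[root_e]:
            clusters -= 1  # two groups of nodes become mutually reachable
        parent[root_e] = root_v

    groups = defaultdict(set)
    for v in range(num_nodes):
        groups[find(v)].add(v)
    return sorted(groups.values(), key=min)


if __name__ == "__main__":
    toy = [[0, 1, 2], [1, 2], [3, 4], [3, 4, 5], [2, 3]]
    print(cluster_hypergraph(toy, num_nodes=6, k=2))
```

On the toy hypergraph above, the low-weight incidences of the bridging hyperedge `[2, 3]` are never restored before the two-cluster target is reached, so nodes 0 to 2 and nodes 3 to 5 end up in separate clusters. Under these assumptions, the cost is dominated by sorting the incidences, which is one plausible reading of why such a scheme scales to large hypergraphs.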

Cited By

  • (2022) sGrow: Explaining the Scale-Invariant Strength Assortativity of Streaming Butterflies. ACM Transactions on the Web 17:3 (1-46). DOI: 10.1145/3572408. Online publication date: 14-Dec-2022
  • (2022) High-Speed and Noise-Robust Embedding of Hypergraphs Based on Double-Centered Incidence Matrix. Complex Networks & Their Applications X (536-548). DOI: 10.1007/978-3-030-93413-2_45. Online publication date: 1-Jan-2022

Published In

iiWAS '20: Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services
November 2020
492 pages

In-Cooperation

  • Johannes Kepler University, Linz, Austria

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bipartite Expansion
  2. Clustering
  3. Hypergraph
  4. Reachability
  5. TFIDF

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • JSPS Grant-in-Aid for Scientific Research

Conference

iiWAS '20
