Abstract
Finding dense subgraphs in large (hyper)graphs is a key primitive in a variety of real-world application domains, encompassing social network analytics, event detection, biology, and finance. In most such applications, one typically aims at finding several (possibly overlapping) dense subgraphs, which might correspond to communities in social networks or to interesting events. While a large amount of work is devoted to finding a single densest subgraph, to the best of our knowledge the problem of finding several dense subgraphs with limited overlap in weighted hypergraphs has, perhaps surprisingly, not been studied in a principled way. In this work, we define and study a natural generalization of the densest subgraph problem in weighted hypergraphs, where the goal is to find at most k subgraphs with maximum total aggregate density, subject to an upper bound on the pairwise weighted Jaccard coefficient, i.e., the weight of the intersection divided by the weight of the union of the node sets of two subgraphs. After showing that this problem is NP-hard, we devise an efficient algorithm that comes with provable guarantees in some cases of interest, as well as an efficient practical heuristic. Our extensive evaluation on large real-world hypergraphs confirms the efficiency and effectiveness of our algorithms.
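To make the quantities in the problem statement concrete, the following is a minimal Python sketch, not the paper's algorithm. It rests on assumptions the abstract does not spell out: that weights for the Jaccard coefficient are attached to nodes (defaulting to 1), and that the density of a node set is the total weight of the hyperedges fully contained in it divided by the number of its nodes. The names `weighted_jaccard`, `density`, `is_feasible`, and the threshold `alpha` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Sequence, Set

Node = Hashable


def weighted_jaccard(a: Set[Node], b: Set[Node],
                     node_weight: Dict[Node, float]) -> float:
    """Weight of the intersection divided by weight of the union.

    Assumption: weights live on nodes; missing nodes get weight 1.0.
    """
    w_union = sum(node_weight.get(v, 1.0) for v in a | b)
    if w_union == 0:
        return 0.0
    w_inter = sum(node_weight.get(v, 1.0) for v in a & b)
    return w_inter / w_union


def density(nodes: Set[Node],
            hyperedges: Dict[FrozenSet[Node], float]) -> float:
    """Assumed density: weight of hyperedges inside `nodes` over |nodes|."""
    if not nodes:
        return 0.0
    inside = sum(w for e, w in hyperedges.items() if e <= nodes)
    return inside / len(nodes)


def is_feasible(subgraphs: Sequence[Set[Node]],
                node_weight: Dict[Node, float],
                alpha: float, k: int) -> bool:
    """At most k subgraphs, every pair with weighted Jaccard <= alpha."""
    if len(subgraphs) > k:
        return False
    return all(weighted_jaccard(s, t, node_weight) <= alpha
               for s, t in combinations(subgraphs, 2))


# Toy weighted hypergraph (illustrative data only).
hyperedges = {
    frozenset({1, 2, 3}): 3.0,
    frozenset({1, 2}): 2.0,
    frozenset({3, 4}): 1.0,
}
node_weight = {1: 1.0, 2: 1.0, 3: 2.0, 4: 1.0}
candidates = [{1, 2, 3}, {3, 4}]

total_density = sum(density(s, hyperedges) for s in candidates)
feasible = is_feasible(candidates, node_weight, alpha=0.5, k=2)
```

Under these assumptions, a candidate solution is scored by `total_density` and accepted only if `is_feasible` holds; the optimization problem asks for the feasible collection with the largest score.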