Finding Subgraphs with Maximum Total Density and Limited Overlap in Weighted Hypergraphs

Published: 12 February 2024

Abstract

Finding dense subgraphs in large (hyper)graphs is a key primitive in a variety of real-world application domains, including social network analytics, event detection, biology, and finance. In most such applications, one typically aims to find several (possibly overlapping) dense subgraphs, which might correspond to communities in social networks or to interesting events. While a large body of work is devoted to finding a single densest subgraph, the problem of finding several dense subgraphs with limited overlap in weighted hypergraphs has, perhaps surprisingly, not been studied in a principled way, to the best of our knowledge. In this work, we define and study a natural generalization of the densest subgraph problem in weighted hypergraphs, where the main goal is to find at most k subgraphs with maximum total aggregate density, while satisfying an upper bound on the pairwise weighted Jaccard coefficient, i.e., the ratio of the weight of the intersection to the weight of the union of the node sets of two subgraphs. After showing that this problem is NP-hard, we devise an efficient algorithm that comes with provable guarantees in some cases of interest, as well as an efficient practical heuristic. Our extensive evaluation on large real-world hypergraphs confirms the efficiency and effectiveness of our algorithms.
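
As a rough illustration of the overlap constraint described above, the following Python sketch computes the weighted Jaccard coefficient of two node sets and checks a pairwise upper bound on it. It is not taken from the article: the function names, the `weight` dictionary of per-node weights, and the threshold name `alpha` are assumptions made for this example, and the convention of returning 0 for a zero-weight union is likewise an assumption.

```python
from itertools import combinations
from typing import Dict, Hashable, List, Set


def weighted_jaccard(a: Set[Hashable], b: Set[Hashable],
                     weight: Dict[Hashable, float]) -> float:
    """Weight of the intersection divided by weight of the union.

    `weight` maps each node to a non-negative weight (assumed input format).
    """
    union_weight = sum(weight[v] for v in a | b)
    if union_weight == 0:
        # Assumed convention: empty or zero-weight sets do not overlap.
        return 0.0
    return sum(weight[v] for v in a & b) / union_weight


def overlap_ok(subgraphs: List[Set[Hashable]],
               weight: Dict[Hashable, float], alpha: float) -> bool:
    """True if every pair of node sets has weighted Jaccard at most alpha."""
    return all(weighted_jaccard(s, t, weight) <= alpha
               for s, t in combinations(subgraphs, 2))


# Hypothetical toy data: four weighted nodes and two overlapping node sets.
weight = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
A = {"a", "b", "c"}
B = {"b", "c", "d"}
print(weighted_jaccard(A, B, weight))   # intersection weight 5, union weight 10
print(overlap_ok([A, B], weight, 0.4))  # the pair exceeds a bound of 0.4
```

With unit weights the function reduces to the ordinary Jaccard coefficient, which is one way to sanity-check it on small inputs.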


      • Published in

        ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4
        May 2024
        707 pages
        ISSN: 1556-4681
        EISSN: 1556-472X
        DOI: 10.1145/3613622

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 12 February 2024
        • Online AM: 2 January 2024
        • Accepted: 15 December 2023
        • Revised: 27 October 2023
        • Received: 16 June 2022