SpecGreedy: Unified Dense Subgraph Detection

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12457)

Abstract

How can we effectively detect fake reviews or fraudulent connections on a website? How can we spot communities that suddenly appear based on users’ interaction? And how can we efficiently find the minimum cut in a big graph? All of these are related to the problem of finding dense subgraphs, an important primitive problem in graph data analysis with extensive applications across various domains.

We focus on formulating the problem of detecting the densest subgraph in real-world large graphs, and we theoretically compare and contrast several closely related problems. Moreover, we propose a unified framework for densest subgraph detection (GenDS) and devise a simple and computationally efficient algorithm, SpecGreedy, to solve it by leveraging the graph spectral properties with a greedy approach. We conduct thorough experiments on 40 real-world networks with up to 1.47 billion edges from various domains, and demonstrate that our algorithm yields up to \(58.6 \times \) speedup and achieves better or approximately equal-quality solutions for densest subgraph detection compared to the baselines. Moreover, SpecGreedy scales linearly with the graph size and proves effective in applications, such as finding collaborations that appear suddenly in a big, time-evolving co-authorship network.
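To make the "greedy approach" mentioned in the abstract concrete, the following is a minimal sketch of classic greedy peeling for the densest subgraph, in the spirit of the greedy approximation cited as [7]. It is not the authors' SpecGreedy algorithm and uses no spectral information. The function name `greedy_densest_subgraph`, the edge-list input format, and the density measure \(|E(S)|/|S|\) are assumptions made for illustration only.

```python
import heapq
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Illustrative greedy peeling (hypothetical helper, not SpecGreedy).

    Repeatedly removes a minimum-degree node and returns the intermediate
    node set with the highest density |E(S)| / |S|.
    """
    # Build an undirected adjacency structure; self-loops are ignored here.
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    num_edges = sum(degree.values()) // 2
    remaining = set(adj)

    # Min-heap of (degree, node); outdated entries are skipped lazily.
    heap = [(d, u) for u, d in degree.items()]
    heapq.heapify(heap)

    best_density = num_edges / len(remaining) if remaining else 0.0
    removal_order = []
    best_prefix = 0  # number of removals performed before the best set

    while heap:
        d, u = heapq.heappop(heap)
        if u not in remaining or d != degree[u]:
            continue  # stale heap entry
        remaining.remove(u)
        removal_order.append(u)
        num_edges -= degree[u]  # edges from u to nodes still present
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
                heapq.heappush(heap, (degree[v], v))
        if remaining:
            density = num_edges / len(remaining)
            if density > best_density:
                best_density = density
                best_prefix = len(removal_order)

    best_nodes = set(adj) - set(removal_order[:best_prefix])
    return best_nodes, best_density


# Usage: a 4-clique on nodes 0..3 with a short tail 3-4-5 attached.
clique = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
tail = [(3, 4), (4, 5)]
nodes, density = greedy_densest_subgraph(clique + tail)
print(sorted(nodes), density)
```

In this sketch the node labels must be mutually comparable (for example, all integers), because ties in the heap are broken by comparing labels. On the toy graph, peeling strips the tail first and the 4-clique remains as the densest intermediate set.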

W. Feng, S. Liu, H. Shen and X. Cheng are also with the CAS Key Laboratory of Network Data Science & Technology, CAS, and the University of Chinese Academy of Sciences, Beijing 100049, China.

Notes

  1. In the other setting with \(\mathbf {Q}= \mathbf {D}\), this problem is also equivalent to setting \(\mathbf {P}= -\mathbf {D}^{-1/2} \mathbf {L}\mathbf {D}^{-1/2}\), i.e., the normalized Laplacian matrix of \(\mathcal {G}\), and \(\mathbf {Q}= \mathbf {I}\).

  2. [23, 29] used \(\tilde{\mathbf {A}}\) with different \(\gamma \) to explore the trade-off between density and size of final dense subgraphs with the domain-set based optimization method.

  3. For the proof details of the theorem, refer to [10].

  4. If \(\mathbf {A}_r\) is the symmetric matrix as in Eq. (9), \(|{L}| = |{R}| = n\) and \(\varDelta _{{L}} = \varDelta _{{R}} = \nicefrac {1}{\sqrt{n}}\).

References

  1. Akoglu, L., Tong, H., Koutra, D.: Graph based anomaly detection and description: a survey. Data Min. Knowl. Discov. 29(3), 626–688 (2015). https://doi.org/10.1007/s10618-014-0365-y

  2. Andersen, R., Chellapilla, K.: Finding dense subgraphs with size bounds. In: Avrachenkov, K., Donato, D., Litvak, N. (eds.) WAW 2009. LNCS, vol. 5427, pp. 25–37. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-540-95995-3_3

  3. Andersen, R., Cioaba, S.M.: Spectral densest subgraph and independence number of a graph. J. UCS 13(11), 1501–1513 (2007)

  4. Asahiro, Y., Iwama, K., Tamaki, H., Tokuyama, T.: Greedily finding a dense subgraph. J. Algorithms 34(2), 203–221 (2000)

  5. Boob, D., et al.: Flowless: Extracting densest subgraphs without flow computations. In: WWW 2020 (2020)

  6. Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: a recursive model for graph mining. In: SDM, pp. 442–446. SIAM (2004)

  7. Charikar, M.: Greedy approximation algorithms for finding dense components in a graph. In: Jansen, K., Khuller, S. (eds.) APPROX 2000. LNCS, vol. 1913, pp. 84–95. Springer, Heidelberg (2000). https://doi.org/10.1007/3-540-44436-X_10

  8. Chen, J., Saad, Y.: Dense subgraph extraction with application to community detection. In: IEEE TKDE (2010)

  9. Chu, L., Wang, Z., Pei, J., Wang, J., Zhao, Z., Chen, E.: Finding gangs in war from signed networks. In: KDD, pp. 1505–1514. ACM (2016)

  10. Chung, F.R.K.: Spectral Graph Theory. American Mathematical Society (1996)

  11. Dax, A.: From eigenvalues to singular values: a review. APM 3, 17 (2013)

  12. Eikmeier, N., Gleich, D.F.: Revisiting power-law distributions in spectra of real world networks. In: KDD, pp. 817–826 (2017)

  13. Goldberg, A.V.: Finding a maximum density subgraph. UCB (1984)

  14. Golub, G.H., Van Loan, C.F.: Matrix Computations, vol. 3. JHU Press, Baltimore (2012)

  15. Hooi, B., Song, H.A., Beutel, A., Shah, N., Shin, K., Faloutsos, C.: FRAUDAR: bounding graph fraud in the face of camouflage. In: SIGKDD, pp. 895–904 (2016)

  16. Lee, V.E., Ruan, N., Jin, R., Aggarwal, C.: A survey of algorithms for dense subgraph discovery. In: Aggarwal, C., Wang, H. (eds.) Managing and Mining Graph Data. Advances in Database Systems, vol. 40, pp. 303–336. Springer, Boston (2010). https://doi.org/10.1007/978-1-4419-6045-0_10

  17. Leskovec, J., Chakrabarti, D., Kleinberg, J., Faloutsos, C., Ghahramani, Z.: Kronecker graphs: an approach to modeling networks. JMLR 11, 985–1042 (2010)

  18. Li, Z., Zhang, S., Wang, R.-S., Zhang, X.-S., Chen, L.: Erratum: quantitative function for community detection. Phys. Rev. E 91(1), 019901 (2015)

  19. Liu, S., Hooi, B., Faloutsos, C.: A contrast metric for fraud detection in rich graphs. TKDE 31(12), 2235–2248 (2018)

  20. Liu, Y., Zhu, L., Szekely, P.A., Galstyan, A., Koutra, D.: Coupled clustering of time-series and networks. In: SDM, pp. 531–539. SIAM (2019)

  21. Miyauchi, A., Kakimura, N.: Finding a dense subgraph with sparse cut. In: CIKM (2018)

  22. Papailiopoulos, D., Mitliagkas, I., Dimakis, A., Caramanis, C.: Finding dense subgraphs via low-rank bilinear optimization. In: ICML, pp. 1890–1898 (2014)

  23. Pavan, M., Pelillo, M.: Dominant sets and pairwise clustering. IEEE Trans. Pattern Anal. Mach. Intell. 29(1), 167–172 (2006)

  24. Prakash, B.A., Sridharan, A., Seshadri, M., Machiraju, S., Faloutsos, C.: EigenSpokes: surprising patterns and scalable community chipping in large graphs. In: Zaki, M.J., Yu, J.X., Ravindran, B., Pudi, V. (eds.) PAKDD 2010. LNCS (LNAI), vol. 6119, pp. 435–448. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-13672-6_42

  25. Shen, H.-W., Cheng, X.-Q.: Spectral methods for the detection of network community structure: a comparative analysis. JSTAT 2010(10), P10020 (2010)

  26. Tsourakakis, C.E.: Fast counting of triangles in large real networks without counting: algorithms and laws. In: ICDM, pp. 608–617. IEEE (2008)

  27. Tsourakakis, C.E., Chen, T., Kakimura, N., Pachocki, J.: Novel dense subgraph discovery primitives: risk aversion and exclusion queries. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds.) ECML PKDD 2019. LNCS (LNAI), vol. 11906, pp. 378–394. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-46150-8_23

  28. Wan, H., Zhang, Y., Zhang, J., Tang, J.: AMiner: search and mining of academic social networks. Data Intell. 1(1), 58–76 (2019)

  29. Wang, Z., Chu, L., Pei, J., Al-Barakati, A., Chen, E.: Tradeoffs between density and size in extracting dense subgraphs: a unified framework. In: ASONAM (2016)

  30. Wong, S.W., Pastrello, C., Kotlyar, M., Faloutsos, C., Jurisica, I.: SDREGION: fast spotting of changing communities in biological networks. In: SIGKDD (2018)

  31. Yang, Y., Chu, L., Zhang, Y., Wang, Z., Pei, J., Chen, E.: Mining density contrast subgraphs. In: ICDE, pp. 221–232. IEEE (2018)

  32. Yin, H., Benson, A.R., Leskovec, J., Gleich, D.F.: Local higher-order graph clustering. In: KDD, pp. 555–564 (2017)

Acknowledgments

This work was supported by the Strategic Priority Research Program of Chinese Academy of Sciences, Grant No. XDA19020400, NSF of China No. 61772498, U1911401, 61872206, 91746301, National Science Foundation under Grant No. IIS 1845491, and Army Young Investigator Award No. W911NF1810397.

Author information

Corresponding authors

Correspondence to Wenjie Feng or Shenghua Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Feng, W., Liu, S., Koutra, D., Shen, H., Cheng, X. (2021). SpecGreedy: Unified Dense Subgraph Detection. In: Hutter, F., Kersting, K., Lijffijt, J., Valera, I. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12457. Springer, Cham. https://doi.org/10.1007/978-3-030-67658-2_11

  • DOI: https://doi.org/10.1007/978-3-030-67658-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-67657-5

  • Online ISBN: 978-3-030-67658-2

  • eBook Packages: Computer Science, Computer Science (R0)
