
Efficient (\(\alpha \), \(\beta \))-core computation in bipartite graphs

The VLDB Journal

Abstract

Computing the \((\alpha ,\beta )\)-core of a bipartite graph for given \(\alpha \) and \(\beta \) is a fundamental problem in bipartite graph analysis, with applications such as online group recommendation and fraudster detection. The existing solution for computing the \((\alpha ,\beta )\)-core must traverse the entire bipartite graph once, and it ignores the fact that real-world graphs are often dynamic. Because real bipartite graphs can be very large and dynamically updated, and requests to compute the \((\alpha ,\beta )\)-core can be issued frequently in real applications, the existing solution is too expensive. In this paper, we present an efficient algorithm for \((\alpha ,\beta )\)-core computation based on a novel index; the algorithm runs in time linear in the result size and is therefore optimal, since any algorithm needs at least linear time to output the result. We prove that the index requires only \(O(m)\) space, where \(m\) is the number of edges in the bipartite graph. We also devise an efficient index construction algorithm with time complexity \(O(\delta \cdot m)\), where \(\delta \) is bounded by \(\sqrt{m}\) and is much smaller than \(\sqrt{m}\) in practice. Moreover, we discuss efficient algorithms to maintain the index when the bipartite graph is dynamically updated. We show that whether a node in the index should be updated can be decided by visiting its neighbors. Based on this locality property, we propose an efficient index maintenance algorithm that only needs to visit a local subgraph near the inserted or removed edge. Finally, we show how to implement our index construction and maintenance algorithms in parallel. Experimental results on real and synthetic graphs (with more than 1 billion edges) demonstrate that, compared with existing techniques, our algorithms achieve up to 5 orders of magnitude speedup for computing the \((\alpha ,\beta )\)-core, up to 3 orders of magnitude speedup for index construction, and up to 4 orders of magnitude speedup for index maintenance.
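To make the baseline concrete, the following is a minimal Python sketch of the kind of whole-graph computation the abstract describes as too expensive. It is not the index-based algorithm of the paper. It assumes the usual definition, which the abstract does not state: the \((\alpha ,\beta )\)-core is the maximal subgraph in which every upper-layer vertex has degree at least \(\alpha \) and every lower-layer vertex has degree at least \(\beta \). The function name `alpha_beta_core`, the edge-list input format, and the example graph are hypothetical choices made for illustration.

```python
from collections import deque


def alpha_beta_core(edges, alpha, beta):
    """Peeling sketch: return (upper, lower) vertex sets of the (alpha, beta)-core.

    Assumption: `edges` is an iterable of (u, v) pairs with u in the upper
    layer and v in the lower layer. This input format is hypothetical.
    """
    adj_u, adj_v = {}, {}
    for u, v in set(edges):  # ignore duplicate edges
        adj_u.setdefault(u, set()).add(v)
        adj_v.setdefault(v, set()).add(u)

    deg_u = {u: len(ns) for u, ns in adj_u.items()}
    deg_v = {v: len(ns) for v, ns in adj_v.items()}

    removed_u, removed_v = set(), set()
    queue = deque()

    # Seed the queue with every vertex that already violates its threshold.
    for u, d in deg_u.items():
        if d < alpha:
            removed_u.add(u)
            queue.append(("U", u))
    for v, d in deg_v.items():
        if d < beta:
            removed_v.add(v)
            queue.append(("L", v))

    # Removing a vertex lowers its neighbors' degrees, which may cascade.
    while queue:
        side, x = queue.popleft()
        if side == "U":
            for v in adj_u[x]:
                if v in removed_v:
                    continue
                deg_v[v] -= 1
                if deg_v[v] < beta:
                    removed_v.add(v)
                    queue.append(("L", v))
        else:
            for u in adj_v[x]:
                if u in removed_u:
                    continue
                deg_u[u] -= 1
                if deg_u[u] < alpha:
                    removed_u.add(u)
                    queue.append(("U", u))

    return set(adj_u) - removed_u, set(adj_v) - removed_v


# Hypothetical example: a 2x2 biclique plus one pendant upper vertex u3.
example_edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v1")]
upper, lower = alpha_beta_core(example_edges, alpha=2, beta=2)
print(sorted(upper), sorted(lower))  # ['u1', 'u2'] ['v1', 'v2']
```

Each call touches every edge at least once while building the adjacency lists, regardless of how small the resulting core is. This is the cost that a query time linear in the result size avoids.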





Acknowledgements

Long Yuan is supported by NSFC61902184 and NSF of Jiangsu Province BK20190453. Xuemin Lin is supported by 2018YFB1003504, NSFC61232006, ARC DP180103096 and DP170101628. Lu Qin is supported by ARC DP160101513. Wenjie Zhang is supported by ARC DP180103096.

Author information

Corresponding author

Correspondence to Long Yuan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Liu, B., Yuan, L., Lin, X. et al. Efficient (\(\alpha \), \(\beta \))-core computation in bipartite graphs. The VLDB Journal 29, 1075–1099 (2020). https://doi.org/10.1007/s00778-020-00606-9


  • DOI: https://doi.org/10.1007/s00778-020-00606-9
