Abstract
The problem of computing (\(\alpha , \beta \))-core in a bipartite graph for given \(\alpha \) and \(\beta \) is a fundamental problem in bipartite graph analysis and can be used in many applications such as online group recommendation and fraudsters detection Existing solution to computing (\(\alpha , \beta \))-core needs to traverse the entire bipartite graph once and ignore the fact that real-world graphs are often dynamic. Considering the real bipartite graph can be very large and dynamically updated, and the requests to compute (\(\alpha , \beta \))-core can be issued frequently in real applications, the existing solution is too expensive to compute the \((\alpha ,\beta )\)-core. In this paper, we present an efficient algorithm for (\(\alpha , \beta \))-core computation based on a novel index such that the algorithm runs in linear time regarding the result size (thus, the algorithm is optimal since it needs at least linear time to output the result). We prove that the index only requires O(m) space where m is the number of edges in the bipartite graph. We also devise an efficient algorithm with time complexity \(O (\delta \cdot m)\) for index construction where \(\delta \) is bounded by \(\sqrt{m}\) and is much smaller than \(\sqrt{m}\) in practice. Moreover, we discuss efficient algorithms to maintain the index when the bipartite graph is dynamically updated. We show that we can decide whether a node in the index should be updated or not by visiting its neighbors. Based on this locality property, we propose an efficient index maintenance algorithm which only needs to visit a local subgraph near the inserted or removed edge. Finally, we show how to implement our index construction and maintenance algorithms in parallel. The experimental results on real and synthetic graphs (more than 1 billion edges) demonstrate that our algorithms achieve up to 5 orders of magnitude speedup for computing \((\alpha ,\beta )\)-core, up to 3 orders of magnitude speedup for index construction and up to 4 orders of magnitude speedup for index maintenance, respectively, compared with existing techniques.



















Similar content being viewed by others
References
Abbasi, A., Hossain, L., Leydesdorff, L.: Betweenness centrality as a driver of preferential attachment in the evolution of research collaboration networks. J. Informetr. 6(3), 403–412 (2012)
Ahmed, A., Batagelj, V., Fu, X., Hong, S.-H., Merrick, D., Mrvar, A.: Visualisation and analysis of the internet movie database. In: Proceedings of APVIS, pp 17–24 (2007)
Aksoy, S.G., Kolda, T.G., Pinar, A.: Measuring and modeling bipartite graphs with community structure. J. Complex Netw. 5, 581–603 (2017)
Alvarez-Hamelin, J.I., Dall’Asta, L., Barrat, A., Vespignani, A.: k-core decomposition: a tool for the visualization of large scale networks. arXiv preprint arXiv:cs/0504107 (2005)
Amer-Yahia, S., Roy, S.B., Chawlat, A., Das, G., Yu, C.: Group recommendation: semantics and efficiency. Proc. VLDB 2(1), 754–765 (2009)
Bader, G.D., Hogue, C.W.: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformat. 4(1), 2 (2003)
Barabási, A.-L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), 509–512 (1999)
Batagelj, V., Zaversnik, M.: An o (m) algorithm for cores decomposition of networks. arXiv preprint arXiv:cs/0310049 (2003)
Beutel, A., Xu, W., Guruswami, V., Palow, C., Faloutsos, C.: Copycatch: stopping group attacks by spotting lockstep behavior in social networks. In: Proceedings of WWW, pp 119–130 (2013)
Carvalho, L.A.M.C., Macedo, H.T.: Users’ satisfaction in recommendation systems for groups: an approach based on noncooperative games. In: Proceedings of WWW, pp 951–958 (2013)
Cerinšek, M., Batagelj, V.: Generalized two-mode cores. Soc. Netw. 42, 80–87 (2015)
Dhulipala, L., Blelloch, G.E., Shun, J.: Theoretically efficient parallel graph algorithms can be fast and scalable. In: Proceedings of SPAA, pp 393–404 (2018)
Ding, D., Li, H., Huang, Z., Mamoulis, N.: Efficient fault-tolerant group recommendation using alpha–beta–core. In: Proceedings of CIKM, pp 2047–2050 (2017)
Dormann, C.F., Fründ, J., Blüthgen, N., Gruber, B.: Indices, graphs and null models: analyzing bipartite ecological networks. Open Ecol. J. 2(1) (2009)
Epasto, A., Lattanzi, S., Sozio, M.: Efficient densest subgraph computation in evolving graphs. In: Proceedings of WWW, pp 300–310 (2015)
Fan, W., Li, J., Luo, J., Tan, Z., Wang, X., Wu, Y.: Incremental graph pattern matching. In: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, SIGMOD ’11, ACM, pp 925–936 (2011)
Fang, Y., Cheng, R., Chen, Y., Luo, S., Hu, J.: Effective and efficient attributed community search. VLDB J. 26(6), 803–828 (2017)
Fang, Y., Cheng, R., Li, X., Luo, S., Hu, J.: Effective community search over large spatial graphs. Proc. VLDB Endow. 10(6), 709–720 (2017)
Fang, Y., Cheng, R., Luo, S., Hu, J.: Effective community search for large attributed graphs. Proc. VLDB Endow. 9(12), 1233–1244 (2016)
Fang, Y., Wang, Z., Cheng, R., Li, X., Luo, S., Hu, J., Chen, X.: On spatial-aware community search. IEEE TKDE 31(4), 783–798 (2018)
Fang, Y., Wang, Z., Cheng, R., Wang, H., Hu, J.: Effective and efficient community search over large directed graphs. IEEE TKDE 31(11), 2093–2107 (2018)
Fang, Y., Yu, K., Cheng, R., Lakshmanan, L.V., Lin, X.: Efficient algorithms for densest subgraph discovery. Proc. VLDB Endow. 12(11), 1719–1732 (2019)
Feng, X., Chang, L., Lin, X., Qin, L., Zhang, W., Yuan, L.: Distributed computing connected components with linear communication cost. Distrib. Parallel Databases 36(3), 555–592 (2018)
Gartrell, M., Xing, X., Lv, Q., Beach, A., Han, R., Mishra, S., Seada, K.: Enhancing group recommendation by incorporating social relationship interactions. In: Proceedings of the 16th ACM International Conference on Supporting Group Work, ACM, pp 97–106 (2010)
Giatsidis, C., Thilikos, D.M., Vazirgiannis, M.: D-cores: measuring collaboration of directed graphs based on degeneracy. In: Proceedings of ICDM, pp 201–210 (2011)
Giatsidis, C., Thilikos, D.M., Vazirgiannis, M.: Evaluating cooperation in communities with the k-core structure. In: Proceedings of ASONAM, IEEE, pp 87–93 (2011)
Gorla, J., Lathia, N., Robertson, S., Wang, J.: Probabilistic group recommendation via information matching. In: Proceedings of WWW, pp 495–504 (2013)
Guillaume, J.-L., Latapy, M.: Bipartite structure of all complex networks. Inf. Process. Lett. 90(5), 215–221 (2004)
Guillaume, J.-L., Latapy, M.: Bipartite graphs as models of complex networks. Phys. A Stat. Mech. Appl. 371(2), 795–813 (2006)
Gunnemann, S., Muller, E., Raubach, S., Seidl, T.: Flexible fault tolerant subspace clustering for data with missing values. In: Proceedings of ICDM, pp 231–240 (2011)
Hochbaum, D.S.: Approximating clique and biclique problems. J. Algorithms 29(1), 174–200 (1998)
Kannan, R., Tetali, P., Vempala, S.: Simple Markov–Chain algorithms for generating bipartite graphs and tournaments. In: Proceedings of SODA, pp 193–200 (1997)
Kaytoue, M., Kuznetsov, S.O., Napoli, A., Duplessis, S.: Mining gene expression data with pattern structures in formal concept analysis. Inf. Sci. 181(10), 1989–2001 (2011)
Khaouid, W., Barsky, M., Srinivasan, V., Thomo, A.: K-core decomposition of large networks on a single PC. Proc. VLDB Endow. 9(1), 13–23 (2015)
Kolda, T.G., Pinar, A., Plantenga, T., Seshadhri, C.: A scalable generative graph model with community structure. SIAM J. Sci. Comput. 36(5), C424–C452 (2014)
Kumar, R., Novak, J., Tomkins, A.: Structure and evolution of online social networks, pp. 337–357. Springer, New York (2010)
Ley, M.: The DBLP computer science bibliography: evolution, research issues, perspectives. In: Proc. Int. SPIRE, pp 1–10 (2002)
Li, J., Sim, K., Liu, G., Wong, L.: Maximal quasi-bicliques with balanced noise tolerance: concepts and co-clustering applications. In: Proceedings of ICDM, pp 72–83 (2008)
Linden, G., Smith, B., York, J.: Amazon.com recommendations: Item-to-item collaborative filtering. IEEE Internet Comput. 1, 76–80 (2003)
Liu, B., Yuan, L., Lin, X., Qin, L., Zhang, W., Zhou, J.: Efficient (\(\alpha \), \(\beta \))-core computation: an index-based approach. In: Proceedings of WWW, pp 1130–1141 (2019)
Liu, B., Zhang, F., Zhang, C., Zhang, W., Lin, X.: Corecube: core decomposition in multilayer graphs. In: WISE, Springer, pp 694–710. (2019)
Liu, X., Li, J., Wang, L.: Modeling protein interacting groups by quasi-bicliques: complexity, algorithm, and application. IEEE/ACM Trans. Comput. Biol. Bioinformat. 7(2), 354–364 (2010)
Lumsdaine, A., Gregor, D., Hendrickson, B., Berry, J.: Challenges in parallel graph processing. Parallel Process. Lett. 17(01), 5–20 (2007)
Mohammad, A., Aleksandar, I., Boualem, B., Seyed-Mehdi-Reza, B., Elisa, B., Norman, F.: Collusion detection in online rating systems. In: Web Technologies and Applications, pp 196–207 (2013)
Nacher, J., Ochiai, T., Hayashida, M., Akutsu, T.: A mathematical model for generating bipartite graphs and its application to protein networks. J. Phys. A Math. Theor. 42(48), 485005 (2009)
Ntoutsi, E., Stefanidis, K., Nørvåg, K., Kriegel, H.-P.: Fast group recommendations by applying user clustering. In: International Conference on Conceptual Modeling, Springer, pp 126–140 (2012)
Ntoutsi, E., Stefanidis, K., Rausch, K., Kriegel, H.-P.: “strength lies in differences”: diversifying friends for recommendations through subspace clustering. In: Proceedings of CIKM, pp 729–738 (2014)
Ohsaka, N., Maehara, T., Kawarabayashi, K.: Efficient pagerank tracking in evolving networks. In: Proceedings of SIGKDD, pp 875–884 (2015)
Oliveira, R.V., Zhang, B., Zhang, L.: Observing the evolution of internet as topology. SIGCOMM Comput. Commun. Rev. 37(4), 313–324 (2007)
Peeters, R.: The maximum edge biclique problem is NP-complete. Dis. Appl. Math. 131(3), 651–654 (2003)
Peng, Y., Zhang, Y., Zhang, W., Lin, X., Qin, L.: Efficient probabilistic k-core computation on uncertain graphs. In: Proceedings of ICDE, IEEE, pp 1192–1203 (2018)
Poernomo, A.K., Gopalkrishnan, V.: Towards efficient mining of proportional fault-tolerant frequent itemsets. In: Proceedings of SIGKDD, pp 697–706 (2009)
Saavedra, S., Reed-Tsochas, F., Uzzi, B.: A simple model of bipartite cooperation for ecological and organizational networks. Nature 457(7228), 463–466 (2009)
Sanei-Mehri, S.-V., Sariyuce, A.E., Tirthapura, S.: Butterfly counting in bipartite networks. In: Proceedings of KDD, ACM, pp 2150–2159 (2018)
Saríyüce, A.E., Gedik, B., Jacques-Silva, G., Wu, K.-L., Çatalyürek, Ü.V.: Streaming algorithms for k-core decomposition. Proc. VLDB Endow. 6(6), 433–444 (2013)
Sarıyüce, A.E., Gedik, B., Jacques-Silva, G., Wu, K.-L., Çatalyürek, Ü.V.: Incremental k-core decomposition: algorithms and evaluation. VLDB J. 25(3), 425–447 (2016)
Sarıyüce, A.E., Pinar, A.: Peeling bipartite networks for dense subgraph discovery. In: Proceedings of WSDM, pp 504–512 (2018)
Seidman, S.B.: Network structure and minimum degree. Soc. Netw. 5(3), 269–287 (1983)
Shun, J., Blelloch, G.E.: Ligra: a lightweight graph processing framework for shared memory. In: Proceedings of PPoPP, pp 135–146 (2013)
Sim, K., Li, J., Gopalkrishnan, V., Liu, G.: Mining maximal quasi-bicliques to co-cluster stocks and financial ratios for value investment. In: Proceedings of ICDM, pp 1059–1063 (2006)
Slota, G.M., Rajamanickam, S., Madduri, K.: BFS and coloring-based parallel algorithms for strongly connected components and related problems. In: Proceedings of IPDPS, pp 550–559 (2014)
Wang, J., De Vries, A.P., Reinders, M.J.: Unifying user-based and item-based collaborative filtering approaches by similarity fusion. In: Proceedings of SIGIR, pp 501–508 (2006)
Wang, J., Huang, P., Zhao, H., Zhang, Z., Zhao, B., Lee, D.L.: Billion-scale commodity embedding for e-commerce recommendation in alibaba. In: Proceedings of SIGKDD, pp 839–848 (2018)
Wang, K., Cao, X., Lin, X., Zhang, W., Qin, L.: Efficient computing of radius-bounded k-cores. In: Proceedings of ICDE, pp 233–244 (2018)
Wu, X., Yuan, L., Lin, X., Yang, S., Zhang, W.: Towards efficient k-tripeak decomposition on large graphs. In: Proceedings of DASFAA, pp 604–621 (2019)
Wuchty, S., Almaas, E.: Peeling the yeast protein network. Proteomics 5(2), 444–449 (2005)
Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: Diversified top-k clique search. In: Proceedings of ICDE, pp 387–398 (2015)
Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: Diversified top-k clique search. VLDB J. 25(2), 171–196 (2016)
Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: I/O efficient ECC graph decomposition via graph reduction. PVLDB 9(7), 516–527 (2016)
Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: Effective and efficient dynamic graph coloring. PVLDB 11(3), 338–351 (2017)
Yuan, L., Qin, L., Lin, X., Chang, L., Zhang, W.: I/O efficient ECC graph decomposition via graph reduction. VLDB J. 26(2), 275–300 (2017)
Yuan, L., Qin, L., Zhang, W., Chang, L., Yang, J.: Index-based densest clique percolation community search in networks. IEEE TKDE 30(5), 922–935 (2018)
Yuan, Q., Cong, G., Lin, C.-Y.: Com: a generative model for group recommendation. In: Proceedings of KDD, pp 163–172 (2014)
Zhang, F., Yuan, L., Zhang, Y., Qin, L., Lin, X., Zhou, A.: Discovering strong communities with user engagement and tie strength. In: Proceedings of DASFAA, pp 425–441 (2018)
Zhang, Y., Parthasarathy, S.: Extracting analyzing and visualizing triangle k-core motifs within networks. In: Proceedings of ICDE, pp 1049–1060 (2012)
Zhang, Y., Phillips, C.A., Rogers, G.L., Baker, E.J., Chesler, E.J., Langston, M.A.: On finding bicliques in bipartite graphs: a novel algorithm and its application to the integration of diverse biological data types. BMC Bioinformat. 15(1), 110 (2014)
Zhang, Y., Yu, J.X., Zhang, Y., Qin, L.: A fast order-based approach for core maintenance. In: Proceedings of ICDE, pp 337–348 (2017)
Zhu, A.D., Lin, W., Wang, S., Xiao, X.: Reachability queries on large dynamic graphs: a total order approach. In: Proceedings of SIGMOD, pp 1323–1334 (2014)
Acknowledgements
Long Yuan is supported by NSFC61902184 and NSF of Jiangsu Province BK20190453. Xuemin Lin is supported by 2018YFB1003504, NSFC61232006, ARC DP180103096 and DP170101628. Lu Qin is supported by ARC DP160101513. Wenjie Zhang is supported by ARC DP180103096.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Liu, B., Yuan, L., Lin, X. et al. Efficient (\(\alpha \), \(\beta \))-core computation in bipartite graphs. The VLDB Journal 29, 1075–1099 (2020). https://doi.org/10.1007/s00778-020-00606-9
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00778-020-00606-9