Accelerating maximum biplex search over large bipartite graphs

  • Regular Paper
  • The VLDB Journal

Abstract

As a typical most-to-most connected quasi-biclique model, a k-biplex allows each node on either side of a fully connected subgraph to miss at most k connections to the other side. In this paper, we investigate the maximum k-biplex search problem, which asks for a k-biplex with the maximum number of edges, and prove that it is NP-hard and inapproximable. To solve this problem, we first define a new dense subgraph over a given bipartite graph, named the (x, y)-core, on which we build a core-based maximum k-biplex search (CMBS) framework by introducing a core-based graph reduction technique. In addition, we design a bidirectional positioning strategy and propose a CMBS\(^+\) framework. We then develop two exact algorithms, a maximum k-biplex search (MBPS) algorithm and a core-based symmetric search (CSS) algorithm, to compute the maximum k-biplex in (x, y)-cores. In particular, MBPS integrates degree-based and 2-hop pruning strategies, and CSS explores symmetric BK branching and early termination strategies. To process large bipartite graphs more effectively, we further develop a heuristic fast search (HFS) algorithm and an FPGA-based parallel HFS (FP-HFS) algorithm, in which a two-level parallel architecture, at the processing element (PE) level and inside each PE, is introduced to improve the pipeline. Moreover, a double buffering technique is used to overcome the resource limitation of FP-HFS and improve scalability. Extensive experiments on 12 real datasets and two synthetic datasets demonstrate the efficiency and effectiveness of the proposed algorithms.
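The k-biplex condition stated in the abstract can be illustrated with a minimal Python sketch. It is not one of the paper's algorithms (CMBS, MBPS, CSS, or HFS); it only checks the definition and counts edges, which is the objective the maximum k-biplex search maximizes. The adjacency representation and the names `is_k_biplex` and `edge_count` are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, Set

# Assumed representation: each left-side node maps to its set of right-side neighbors.
Adjacency = Dict[Hashable, Set[Hashable]]


def is_k_biplex(adj: Adjacency, left: Iterable[Hashable],
                right: Iterable[Hashable], k: int) -> bool:
    """Return True if (left, right) induces a k-biplex in the bipartite graph."""
    left_set, right_set = set(left), set(right)
    # Every left node may miss at most k of the chosen right nodes.
    for u in left_set:
        if len(right_set - adj.get(u, set())) > k:
            return False
    # Every right node may miss at most k of the chosen left nodes.
    for v in right_set:
        missing = sum(1 for u in left_set if v not in adj.get(u, set()))
        if missing > k:
            return False
    return True


def edge_count(adj: Adjacency, left: Iterable[Hashable],
               right: Iterable[Hashable]) -> int:
    """Number of edges between the chosen left and right nodes."""
    right_set = set(right)
    return sum(len(adj.get(u, set()) & right_set) for u in set(left))


# Toy graph (hypothetical data): u2 is not connected to v2, u3 has no edges.
toy = {"u1": {"v1", "v2"}, "u2": {"v1"}, "u3": set()}
```

On this toy graph, the pair ({u1, u2}, {v1, v2}) misses exactly one edge (u2, v2), so it satisfies the condition for k = 1 but not for k = 0, where a k-biplex reduces to a biclique.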

Notes

  1. http://konect.cc/networks/.

  2. https://gitee.com/hnu_pardon/for-personal-use-only.

Acknowledgements

We thank all anonymous reviewers for their constructive feedback, which helped us refine our ideas and arguments. This research is partially funded by the Natural Science Foundation of China (Grant Nos. U23A20317, 62172146), the Natural Science Foundation of Hunan Province (Grant Nos. 2023JJ10016, 2023JJ30083), the Key R&D Program of Hunan Province (Grant Nos. 2023GK2002, 2024AQ2025), and a project supported by the Scientific Research Fund of Hunan Provincial Education Department (22A0592).

Author information

Corresponding author

Correspondence to Wensheng Luo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pan, D., Zhou, X., Luo, W. et al. Accelerating maximum biplex search over large bipartite graphs. The VLDB Journal 34, 1 (2025). https://doi.org/10.1007/s00778-024-00882-9

  • DOI: https://doi.org/10.1007/s00778-024-00882-9
