DOI: 10.1145/3308558.3313522
research-article

Efficient (α, β)-core Computation: an Index-based Approach

Published: 13 May 2019

Abstract

Computing the (α, β)-core of a bipartite graph for given α and β is a fundamental problem in bipartite graph analysis, with applications such as online group recommendation and fraudster detection. The existing solution for computing the (α, β)-core needs to traverse the entire bipartite graph once. Because real bipartite graphs can be very large and requests to compute the (α, β)-core can be issued frequently in real applications, the existing solution is too expensive. In this paper, we present an efficient algorithm based on a novel index such that the algorithm runs in time linear in the result size (thus, the algorithm is optimal, since at least linear time is needed to output the result). We prove that the index requires only O(m) space, where m is the number of edges in the bipartite graph. Moreover, we devise an efficient index construction algorithm with time complexity O(δ·m), where δ is bounded by √m and is much smaller than √m in practice. We also discuss efficient algorithms to maintain the index when the bipartite graph is dynamically updated, as well as a parallel implementation of the index construction algorithm. Experimental results on real and synthetic graphs (with more than 1 billion edges) demonstrate that our algorithms achieve up to 5 orders of magnitude speedup for computing the (α, β)-core and up to 3 orders of magnitude speedup for index construction, compared with existing techniques.
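To make the cost of the index-free approach concrete, the following is a minimal Python sketch of how an (α, β)-core can be computed by traversing the whole graph once. It is not the paper's index-based algorithm. It assumes the usual definition (not spelled out in the abstract) that the (α, β)-core is the maximal subgraph in which every upper-layer vertex has degree at least α and every lower-layer vertex has degree at least β. The function names build_adjacency and alpha_beta_core and the edge-list input format are illustrative assumptions.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build adjacency sets for both layers from (upper, lower) edge pairs."""
    upper_adj, lower_adj = defaultdict(set), defaultdict(set)
    for u, v in edges:
        upper_adj[u].add(v)
        lower_adj[v].add(u)
    return upper_adj, lower_adj


def alpha_beta_core(edges, alpha, beta):
    """Return (upper_vertices, lower_vertices) of the (alpha, beta)-core.

    Assumed definition: the maximal subgraph in which every upper vertex has
    degree >= alpha and every lower vertex has degree >= beta.
    Every edge is inspected a constant number of times, so the cost grows
    with the size of the whole graph, not with the size of the result.
    """
    upper_adj, lower_adj = build_adjacency(edges)
    deg_u = {u: len(nbrs) for u, nbrs in upper_adj.items()}
    deg_v = {v: len(nbrs) for v, nbrs in lower_adj.items()}
    removed_u, removed_v = set(), set()
    queue = deque()

    # Seed the queue with every vertex that already violates its threshold.
    for u, d in deg_u.items():
        if d < alpha:
            removed_u.add(u)
            queue.append(("upper", u))
    for v, d in deg_v.items():
        if d < beta:
            removed_v.add(v)
            queue.append(("lower", v))

    # Peel: removing a vertex lowers its neighbours' degrees, which may
    # push them below their own threshold in turn.
    while queue:
        side, x = queue.popleft()
        if side == "upper":
            for v in upper_adj[x]:
                if v not in removed_v:
                    deg_v[v] -= 1
                    if deg_v[v] < beta:
                        removed_v.add(v)
                        queue.append(("lower", v))
        else:
            for u in lower_adj[x]:
                if u not in removed_u:
                    deg_u[u] -= 1
                    if deg_u[u] < alpha:
                        removed_u.add(u)
                        queue.append(("upper", u))

    return set(deg_u) - removed_u, set(deg_v) - removed_v


# Small hypothetical graph: u3 and v3 are only loosely attached.
example_edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"),
                 ("u2", "v2"), ("u3", "v2"), ("u3", "v3")]
core_upper, core_lower = alpha_beta_core(example_edges, 2, 2)
```

Even when the resulting core is small, as in the (2, 2) query above, this procedure touches the entire edge set. That is the overhead an index whose query time is linear in the result size avoids.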

Information

Published In

WWW '19: The World Wide Web Conference
May 2019
3620 pages
ISBN: 9781450366748
DOI: 10.1145/3308558

In-Cooperation

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Graph processing
  2. indexing
  3. optimization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '19: The Web Conference
May 13 - 17, 2019
San Francisco, CA, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Cited By

  • (2025) Efficient Maximum Vertex (k,ℓ)-Biplex Computation on Bipartite Graphs. Tsinghua Science and Technology 30:2 (569-584). DOI: 10.26599/TST.2024.9010009. Online publication date: Apr-2025
  • (2025) Density Decomposition of Bipartite Graphs. Proceedings of the ACM on Management of Data 3:1 (1-25). DOI: 10.1145/3709680. Online publication date: 11-Feb-2025
  • (2025) Efficient Projection-Based Algorithms for Tip Decomposition on Dynamic Bipartite Graphs. IEEE Transactions on Knowledge and Data Engineering 37:2 (626-640). DOI: 10.1109/TKDE.2024.3486310. Online publication date: Feb-2025
  • (2025) Temporal Insights for Group-Based Fraud Detection on e-Commerce Platforms. IEEE Transactions on Knowledge and Data Engineering 37:2 (951-965). DOI: 10.1109/TKDE.2024.3485127. Online publication date: Feb-2025
  • (2024) Efficient Maximal Frequent Group Enumeration in Temporal Bipartite Graphs. Proceedings of the VLDB Endowment 17:11 (3243-3255). DOI: 10.14778/3681954.3681997. Online publication date: 30-Aug-2024
  • (2024) BIRD: Efficient Approximation of Bidirectional Hidden Personalized PageRank. Proceedings of the VLDB Endowment 17:9 (2255-2268). DOI: 10.14778/3665844.3665855. Online publication date: 1-May-2024
  • (2024) Efficient Temporal Butterfly Counting and Enumeration on Temporal Bipartite Graphs. Proceedings of the VLDB Endowment 17:4 (657-670). DOI: 10.14778/3636218.3636223. Online publication date: 5-Mar-2024
  • (2024) MCR-Tree: An Efficient Index for Multi-dimensional Core Search. Proceedings of the ACM on Management of Data 2:3 (1-25). DOI: 10.1145/3654956. Online publication date: 30-May-2024
  • (2024) Efficient Maximal Biclique Enumeration on Large Signed Bipartite Graphs. IEEE Transactions on Knowledge and Data Engineering 36:9 (4618-4631). DOI: 10.1109/TKDE.2024.3373654. Online publication date: Sep-2024
  • (2024) A Unified and Scalable Algorithm Framework of User-Defined Temporal $(k,\mathcal{X})$-Core Query. IEEE Transactions on Knowledge and Data Engineering (1-15). DOI: 10.1109/TKDE.2023.3349310. Online publication date: 2024
