Abstract
Set intersection is one of the most primitive operators in graph algorithms such as triangle counting, maximal clique enumeration, and subgraph listing: it returns the vertices common to two given vertex sets in a data graph. Accelerating set intersection therefore benefits the many tasks that use it as a building block. Existing work usually follows the merge intersection or galloping-search framework, and most optimization research has focused on exploiting SIMD hardware instructions. In this paper, we propose a novel multi-level set intersection framework, hierarchical set partitioning and join (HERO), built on our set intersection bitmap tree (SIB-tree) index. HERO is independent of SIMD instructions and completely orthogonal to the merge intersection framework. We recursively decompose the set intersection task into small subtasks and solve each subtask using bitmaps and Boolean AND operations. To fully realize the acceleration offered by this approach, we formulate a graph reordering problem, prove its NP-hardness, and develop a heuristic algorithm to tackle it. Extensive experiments on real-world graphs confirm the efficiency and effectiveness of HERO. The speedup over classic merge intersection reaches up to 188x for triangle counting and 176x for maximal clique enumeration.
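The idea of decomposing an intersection into small bitmap subtasks can be illustrated with a minimal two-level sketch. This is not the SIB-tree index itself, whose structure the abstract does not specify; the block width `BLOCK_BITS`, the `(summary, leaves)` layout, and the function names `build`, `intersect`, and `merge_intersect` are all assumptions made for illustration. A summary bitmap records which blocks of vertex IDs are non-empty, so one AND on the summaries prunes every block that cannot contribute, and one AND per surviving block yields the common vertices.

```python
from typing import Dict, Iterable, List, Tuple

BLOCK_BITS = 64  # assumed leaf width, not taken from the paper

# (summary bitmap of non-empty blocks, block id -> leaf bitmap)
TwoLevelSet = Tuple[int, Dict[int, int]]


def build(vertices: Iterable[int]) -> TwoLevelSet:
    """Partition vertex IDs into fixed-width blocks and encode each as a bitmap."""
    summary = 0
    leaves: Dict[int, int] = {}
    for v in vertices:
        block, offset = divmod(v, BLOCK_BITS)
        summary |= 1 << block
        leaves[block] = leaves.get(block, 0) | (1 << offset)
    return summary, leaves


def intersect(a: TwoLevelSet, b: TwoLevelSet) -> List[int]:
    """Intersect two sets: AND the summaries, then AND only the shared blocks."""
    common_blocks = a[0] & b[0]
    result: List[int] = []
    while common_blocks:
        low = common_blocks & -common_blocks      # isolate lowest set bit
        block = low.bit_length() - 1
        common_blocks ^= low
        bits = a[1][block] & b[1][block]          # one AND per surviving block
        while bits:
            low_bit = bits & -bits
            result.append(block * BLOCK_BITS + low_bit.bit_length() - 1)
            bits ^= low_bit
    return result


def merge_intersect(xs: List[int], ys: List[int]) -> List[int]:
    """Classic merge intersection over two sorted lists, for comparison."""
    i = j = 0
    out: List[int] = []
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            out.append(xs[i])
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return out
```

Both functions return the common vertices in ascending order, so `merge_intersect` can serve as a correctness check for the bitmap version. In this sketch the bitmap approach pays off only when the vertices of each set cluster into few blocks, which is one plausible reading of why the abstract pairs the framework with a graph reordering step.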
Index Terms
- HERO: A Hierarchical Set Partitioning and Join Framework for Speeding up the Set Intersection Over Graphs