Abstract
Subgraph searching is the problem of determining whether a given query graph occurs in a single data graph or in a collection of data graphs. Because graphs are widely used to represent datasets in many domains, graph database management has been studied extensively over the last decades. The subgraph containment query has applications in multiple disciplines, particularly for biological datasets that support molecular searching. The classical solution performs an expensive one-to-one mapping of the vertices between the query graph and the data graph. In the multiple data graph setting, also called a transactional graph database, the filter-then-verify (FTV) framework adopts specific index structures to represent graph features and to reduce the run-time overhead associated with the one-to-one mapping of the vertices. However, state-of-the-art approaches mainly suffer from large index sizes.
In this paper, we study the problem of subgraph searching in a transactional graph database. We present a new compact representation and a faster algorithm that reduce the search space by using (1) a compact data structure for indexing the subgraph patterns, and (2) state-of-the-art compressed inverted and bitmap indexes for maintaining the graph occurrence information. Finally, the candidate graph set generated by the intersection operation is verified using a subgraph isomorphism algorithm. Extensive experiments with real datasets show that our compressed inverted and bitmap-based indexes outperform the state-of-the-art algorithm in terms of memory usage and filtering time.
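The filtering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: plain sorted Python lists and Python integers stand in for the compressed inverted and bitmap indexes, the feature strings (such as "C-O") and all function names are hypothetical, and the final subgraph isomorphism verification is omitted.

```python
from functools import reduce


def build_inverted_index(graph_features):
    """Map each feature to the sorted list of graph ids that contain it.

    graph_features: dict {graph_id: iterable of hashable features}.
    The features are assumed to be extracted beforehand (hypothetical input).
    """
    index = {}
    for graph_id in sorted(graph_features):
        for feature in set(graph_features[graph_id]):
            index.setdefault(feature, []).append(graph_id)
    return index


def intersect_sorted(a, b):
    """Merge-style intersection of two sorted id lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def filter_candidates(index, query_features):
    """Return ids of graphs that contain every query feature."""
    postings = [index.get(f, []) for f in query_features]
    if not postings:
        raise ValueError("query has no features")
    postings.sort(key=len)  # start with the shortest list to shrink work early
    return reduce(intersect_sorted, postings)


def to_bitmap(ids):
    """Uncompressed bitmap: bit i is set when graph i contains the feature."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits


# Toy database of four graphs with hypothetical edge-label features.
graph_features = {
    0: {"C-C", "C-O"},
    1: {"C-C", "C-N"},
    2: {"C-C", "C-O", "C-N"},
    3: {"C-O"},
}
index = build_inverted_index(graph_features)
candidates = filter_candidates(index, {"C-C", "C-O"})

# The same filter with bitmaps: intersection becomes a bitwise AND.
bitmap_result = to_bitmap(index["C-C"]) & to_bitmap(index["C-O"])
```

In this toy example only graphs 0 and 2 survive the filter, so only those two would be passed to the more expensive verification step. A feature that appears in no graph yields an empty candidate set without any verification at all.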
Acknowledgements
The authors would like to thank the Deutscher Akademischer Austauschdienst (DAAD) for funding the research on this project. Extensive calculations for this research were conducted on the Lichtenberg high-performance computer of the TU Darmstadt under project ID P0020213. Furthermore, the authors would like to thank Prof. Dr. Daniel Lemire for the insightful discussion during the preparation of this work.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wangmo, C., Wiese, L. (2023). SubTempora: A Hybrid Approach for Optimising Subgraph Searching. In: Cuzzocrea, A., Gusikhin, O., Hammoudi, S., Quix, C. (eds) Data Management Technologies and Applications. DATA 2022, DATA 2021. Communications in Computer and Information Science, vol 1860. Springer, Cham. https://doi.org/10.1007/978-3-031-37890-4_4
DOI: https://doi.org/10.1007/978-3-031-37890-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37889-8
Online ISBN: 978-3-031-37890-4
eBook Packages: Computer Science, Computer Science (R0)