SubTempora: A Hybrid Approach for Optimising Subgraph Searching

  • Conference paper
Data Management Technologies and Applications (DATA 2022, DATA 2021)

Abstract

Subgraph searching is the problem of determining whether a given query graph is present in a single data graph or in multiple data graphs. Due to the wide adoption of graphs for dataset representation in various domains, studies related to graph database management have evolved over the last decades. The subgraph containment query has many applications across disciplines, particularly for biological datasets that support molecular searching. The classical solution performs an expensive one-to-one mapping of the vertices between the query graph and the data graph. With multiple data graphs, a setting also called a transactional graph database, the filter-then-verify (FTV) framework adopts specific index structures to represent graph features and to reduce the run-time overhead associated with the one-to-one mapping of the vertices. However, state-of-the-art approaches mainly suffer from large index sizes.
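To make the cost of the classical one-to-one vertex mapping concrete, here is a minimal backtracking sketch in Python. It is not the verification algorithm used in the paper; the function name `is_subgraph` and the dictionary-based graph representation (vertex labels plus adjacency sets) are assumptions chosen for illustration. It tests non-induced containment: every query edge must be present between the mapped data vertices.

```python
def is_subgraph(q_labels, q_adj, g_labels, g_adj):
    """Return True if the query graph occurs in the data graph.

    Hypothetical representation: *_labels maps vertex -> label,
    *_adj maps vertex -> set of neighbouring vertices (undirected).
    """
    q_vertices = list(q_labels)

    def extend(mapping):
        # All query vertices mapped: a full embedding has been found.
        if len(mapping) == len(q_vertices):
            return True
        u = q_vertices[len(mapping)]          # next query vertex to map
        used = set(mapping.values())          # data vertices already taken
        for v in g_labels:
            if v in used or g_labels[v] != q_labels[u]:
                continue
            # Every already-mapped neighbour of u must map to a neighbour of v.
            if all(mapping[w] in g_adj[v] for w in q_adj[u] if w in mapping):
                mapping[u] = v
                if extend(mapping):
                    return True
                del mapping[u]                # backtrack
        return False

    return extend({})


# Data graph: the labelled path C - C - O.
g_labels = {0: "C", 1: "C", 2: "O"}
g_adj = {0: {1}, 1: {0, 2}, 2: {1}}

# Query 1: a single C - O edge (contained in the path).
edge_labels = {"a": "C", "b": "O"}
edge_adj = {"a": {"b"}, "b": {"a"}}

# Query 2: a C - C - O triangle (not contained in the path).
tri_labels = {"a": "C", "b": "C", "c": "O"}
tri_adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}

found_edge = is_subgraph(edge_labels, edge_adj, g_labels, g_adj)
found_triangle = is_subgraph(tri_labels, tri_adj, g_labels, g_adj)
```

In the worst case the search tries exponentially many partial mappings, which is why running it against every graph in a database is expensive and why a filtering step is attractive.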

In this paper, we study the problem of subgraph searching in a transactional graph database. We present a new compact representation and a faster algorithm that reduce the search space by using (1) a compact data structure for indexing the subgraph patterns, and (2) state-of-the-art compressed inverted and bitmap indexes for maintaining graph occurrence information. The candidate graph set generated by the intersection operation is then verified using a subgraph isomorphism algorithm. Extensive experiments with real datasets show that our compressed inverted and bitmap-based indexes outperform the state-of-the-art algorithm in memory usage and filtering time.
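The filtering stage described above can be sketched as follows. This is an illustrative simplification, not the paper's index: labelled edges stand in for the indexed subgraph patterns, and plain Python sets stand in for the compressed inverted and bitmap indexes. The helper names `edge_features`, `build_index` and `filter_candidates` and the toy database are assumptions.

```python
from collections import defaultdict


def edge_features(labels, edges):
    """Features of a graph: the set of unordered label pairs on its edges."""
    return {tuple(sorted((labels[u], labels[v]))) for u, v in edges}


def build_index(database):
    """Inverted index: feature -> set of IDs of the graphs containing it."""
    index = defaultdict(set)
    for gid, (labels, edges) in database.items():
        for feature in edge_features(labels, edges):
            index[feature].add(gid)
    return index


def filter_candidates(index, all_ids, q_labels, q_edges):
    """Intersect the posting sets of all query features.

    A graph missing any query feature cannot contain the query, so the
    result has no false negatives; it may still hold false positives,
    which the verification step must remove.
    """
    candidates = set(all_ids)
    features = edge_features(q_labels, q_edges)
    # Start with the rarest feature so the candidate set shrinks quickly.
    for feature in sorted(features, key=lambda f: len(index.get(f, ()))):
        candidates &= index.get(feature, set())
        if not candidates:
            break
    return candidates


# Toy database: graph ID -> (vertex labels, edge list).
database = {
    1: ({0: "C", 1: "C", 2: "O"}, [(0, 1), (1, 2)]),
    2: ({0: "C", 1: "N"}, [(0, 1)]),
    3: ({0: "C", 1: "O", 2: "N"}, [(0, 1), (0, 2)]),
}
index = build_index(database)

# Query with one C - O edge and one C - N edge.
q_labels = {0: "O", 1: "C", 2: "N"}
q_edges = [(0, 1), (1, 2)]
candidates = filter_candidates(index, database.keys(), q_labels, q_edges)
```

Only the graphs left in `candidates` need to be passed to a subgraph isomorphism test. With compressed posting lists or bitmaps the same intersection would operate on the compressed representation, which is where the memory and filtering-time savings reported in the abstract would come from.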

Acknowledgements

The authors would like to thank the Deutscher Akademischer Austauschdienst (DAAD) for funding the research on this project. Extensive calculations for this research were conducted on the Lichtenberg high-performance computer of the TU Darmstadt under project ID P0020213. Furthermore, the authors would like to thank Prof. Dr. Daniel Lemire for the insightful discussion during the preparation of this work.

Correspondence to Chimi Wangmo.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wangmo, C., Wiese, L. (2023). SubTempora: A Hybrid Approach for Optimising Subgraph Searching. In: Cuzzocrea, A., Gusikhin, O., Hammoudi, S., Quix, C. (eds) Data Management Technologies and Applications. DATA 2022, DATA 2021. Communications in Computer and Information Science, vol 1860. Springer, Cham. https://doi.org/10.1007/978-3-031-37890-4_4

  • DOI: https://doi.org/10.1007/978-3-031-37890-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-37889-8

  • Online ISBN: 978-3-031-37890-4

  • eBook Packages: Computer Science, Computer Science (R0)
