Efficient Compilation of Regular Path Queries

  • Technical contribution (Fachbeitrag)
Datenbank-Spektrum

Abstract

Ad hoc code generation is a state-of-the-art processing paradigm for database execution engines. It minimizes resource consumption by generating specialized code that is tailored and streamlined for the single query at hand. In this work, we apply ad hoc code generation to regular path queries (RPQs), an advanced query type in declarative graph query languages, and investigate it from multiple angles. We propose COAT, an embedded domain-specific language (EDSL) in C++ that makes code generation more accessible by simplifying the interaction with compiler APIs. Furthermore, we analyze and compare two back ends for COAT that provide the just-in-time (JIT) compilation functionality: LLVM, a compiler framework widely used in databases for code generation, and AsmJit, a JIT assembler with very low compilation latency. We evaluate various compilation techniques for RPQs on different synthetic graph workloads.

Notes

  1. https://asmjit.com/.

  2. https://asmjit.com/.

  3. https://github.com/tetzank/sigmod18contest.

  4. http://www.ldbcouncil.org.

  5. https://github.com/tetzank/coat.

Author information

Corresponding author

Correspondence to Frank Tetzel.

About this article

Cite this article

Tetzel, F., Lehner, W. & Kasperovics, R. Efficient Compilation of Regular Path Queries. Datenbank Spektrum 20, 243–259 (2020). https://doi.org/10.1007/s13222-020-00353-9

  • DOI: https://doi.org/10.1007/s13222-020-00353-9
