
Evaluation of a Representative Selection of SPARQL Query Engines Using Wikidata

  • Conference paper
The Semantic Web (ESWC 2023)

Abstract

In this paper, we present an evaluation of the performance of five representative RDF triplestores (GraphDB, Jena Fuseki, Neptune, RDFox, and Stardog) and one experimental SPARQL query engine, QLever. We compare import, loading, and export times using a complete version of the Wikidata knowledge graph, and we evaluate query performance using 328 queries defined by Wikidata users. To put this evaluation into context with respect to previous evaluations, we also analyze the query performance of these systems on a prominent synthetic benchmark: SP\(^2\)Bench. We observed that most of the systems considered were able to complete almost all of the user-defined queries within the timeout we established. We noticed, however, that the time most systems need to import and export Wikidata may be longer than is acceptable in some industrial and academic projects, where information is represented, enriched, and stored using different representation means.
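The query-performance methodology summarized above (run each query, record its wall-clock execution time, and count it as unfinished if it does not complete before a fixed timeout) can be sketched as a small harness. This is an illustrative sketch only, not the authors' actual tooling: the `run_query` callback, the query names, and the timeout value are placeholders standing in for a call to a specific engine's SPARQL endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def benchmark(queries, run_query, timeout_s=60.0):
    """Execute each named query via `run_query` with a wall-clock timeout.

    Returns a dict mapping query name to elapsed seconds, or to None if
    the query did not finish before the timeout (a "timed out" result).
    `run_query` is a hypothetical callback that submits one query string
    to the engine under test and blocks until the result is consumed.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=1) as pool:
        for name, query in queries.items():
            start = time.perf_counter()
            future = pool.submit(run_query, query)
            try:
                future.result(timeout=timeout_s)
                results[name] = time.perf_counter() - start
            except FutureTimeout:
                results[name] = None  # did not complete before the timeout
    return results
```

In the paper's setting, `run_query` would send the query over HTTP to each triplestore's SPARQL endpoint and read the full result set, so that result serialization is included in the measured time.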


References

  1. Aluç, G., Hartig, O., Özsu, M.T., Daudjee, K.: Diversified stress testing of RDF data management systems. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 197–212. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_13


  2. Amazon AWS: Amazon Neptune Official Website. https://aws.amazon.com/neptune/

  3. Amazon Web Services: Amazon EC2 Instance Types - Memory Optimized. https://aws.amazon.com/ec2/instance-types/#Memory_Optimized. Accessed 12 Dec 2022

  4. Amazon Web Services: Amazon Neptune Pricing. https://aws.amazon.com/neptune/pricing/. Accessed 12 Dec 2022

  5. Angles, R., Aranda, C.B., Hogan, A., Rojas, C., Vrgoč, D.: WDBench: a Wikidata graph query benchmark. In: Sattler, U., et al. (eds.) The Semantic Web – ISWC 2022. LNCS, vol. 13489, pp. 714–731. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19433-7_41

  6. Apache Jena: Apache Jena Fuseki Documentation. https://jena.apache.org/documentation/fuseki2/

  7. Apache Jena: Apache Jena TDB xloader. https://jena.apache.org/documentation/tdb/tdb-xloader.html. Accessed 12 Dec 2022

  8. Bail, S., et al.: FishMark: a linked data application benchmark. CEUR-WS.org (2012)


  9. Bast, H., Buchhold, B.: QLever GitHub repository. https://github.com/ad-freiburg/qlever

  10. Bizer, C., Schultz, A.: The Berlin SPARQL benchmark. Int. J. Seman. Web Inf. Syst. (IJSWIS) 5(2), 1–24 (2009)


  11. Blazegraph: Blazegraph Official Website. https://blazegraph.com/

  12. Demartini, G., Enchev, I., Wylot, M., Gapany, J., Cudré-Mauroux, P.: BowlognaBench—benchmarking RDF analytics. In: Aberer, K., Damiani, E., Dillon, T. (eds.) SIMPDA 2011. LNBIP, vol. 116, pp. 82–102. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34044-4_5


  13. Erling, O., et al.: The LDBC social network benchmark: interactive workload. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 619–630 (2015)


  14. Fahl, W., Holzheim, T., Westerinen, A., Lange, C., Decker, S.: Getting and hosting your own copy of Wikidata. In: Proceedings of the 3rd Wikidata Workshop 2022. CEUR-WS.org (2022). https://ceur-ws.org/Vol-3262/paper9.pdf

  15. GitHub: Analysis and supplementary information for the paper, including queries, execution logs, query results and scripts. https://github.com/SINTEF-9012/rdf-triplestore-benchmark. Accessed 13 Mar 2023

  16. Guo, Y., Pan, Z., Heflin, J.: LUBM: a benchmark for OWL knowledge base systems. J. Web Seman. 3(2–3), 158–182 (2005)


  17. Hogan, A., Riveros, C., Rojas, C., Soto, A.: A worst-case optimal join algorithm for SPARQL. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11778, pp. 258–275. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30793-6_15


  18. Ma, L., Yang, Y., Qiu, Z., Xie, G., Pan, Y., Liu, S.: Towards a complete OWL ontology benchmark. In: Sure, Y., Domingue, J. (eds.) ESWC 2006. LNCS, vol. 4011, pp. 125–139. Springer, Heidelberg (2006). https://doi.org/10.1007/11762256_12


  19. Morsey, M., Lehmann, J., Auer, S., Ngonga Ngomo, A.-C.: DBpedia SPARQL benchmark – performance assessment with real queries on real data. In: Aroyo, L., et al. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 454–469. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25073-6_29


  20. Ontotext: GraphDB Official Website. https://graphdb.ontotext.com/

  21. Ontotext: GraphDB Requirements. https://graphdb.ontotext.com/documentation/enterprise/requirements.html. Accessed 12 Dec 2022

  22. OST: RDFox Documentation: Managing Data Stores. https://docs.oxfordsemantic.tech/5.4/data-stores.html#. Accessed 12 Dec 2022

  23. OST: RDFox Documentation: Operations on Data Stores, persist-ds. https://docs.oxfordsemantic.tech/5.4/data-stores.html#persist-ds. Accessed 12 Dec 2022

  24. Oxford Semantic Technologies: RDFox Official Website. https://www.oxfordsemantic.tech/product

  25. Saleem, M., Ali, M.I., Hogan, A., Mehmood, Q., Ngomo, A.-C.N.: LSQ: the linked SPARQL queries dataset. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9367, pp. 261–269. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25010-6_15


  26. Saleem, M., Mehmood, Q., Ngonga Ngomo, A.-C.: FEASIBLE: a feature-based SPARQL benchmark generation framework. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9366, pp. 52–69. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25007-6_4


  27. Saleem, M., Szárnyas, G., Conrads, F., Bukhari, S.A.C., Mehmood, Q., Ngonga Ngomo, A.C.: How representative is a SPARQL benchmark? An analysis of RDF triplestore benchmarks. In: The World Wide Web Conference, pp. 1623–1633 (2019)


  28. Schmidt, M., Hornung, T., Lausen, G., Pinkel, C.: SP\(^2\)Bench: a SPARQL performance benchmark. In: 2009 IEEE 25th International Conference on Data Engineering, pp. 222–233. IEEE (2009)


  29. Singh, G., Bhatia, S., Mutharaju, R.: OWL2Bench: a benchmark for OWL 2 reasoners. In: Pan, J.Z., et al. (eds.) ISWC 2020. LNCS, vol. 12507, pp. 81–96. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-62466-8_6


  30. Stardog: Stardog Capacity Planning. https://docs.stardog.com/operating-stardog/server-administration/capacity-planning. Accessed 12 Dec 2022

  31. Stardog: Stardog Official Website. https://www.stardog.com/

  32. Stardog: 7 Steps to Fast SPARQL Queries. https://www.stardog.com/blog/7-steps-to-fast-sparql-queries/ (2017). Accessed 12 Dec 2022

  33. Szárnyas, G., Izsó, B., Ráth, I., Varró, D.: The train benchmark: cross-technology performance evaluation of continuous model queries. Softw. Syst. Model. 17(4), 1365–1393 (2018)


  34. Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Commun. ACM 57(10), 78–85 (2014)


  35. W3C: RDF 1.1 Concepts and Abstract Syntax, W3C Recommendation (2014). https://www.w3.org/TR/rdf11-concepts/. Accessed 12 Dec 2022

  36. W3C: SPARQL 1.1 Query Language, W3C Recommendation (2013). https://www.w3.org/TR/sparql11-query/. Accessed 12 Dec 2022

  37. WDQS Search Team: WDQS Backend Alternatives: The process, details and result. Technical report, Wikimedia Foundation (2022). https://www.wikidata.org/wiki/File:WDQS_Backend_Alternatives_working_paper.pdf

  38. Wikidata: SPARQL query service/queries/examples. https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples. Accessed 12 Dec 2022

  39. Wikidata: SPARQL query service/WDQS backend update. https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update. Accessed 12 Dec 2022

  40. Wu, H., Fujiwara, T., Yamamoto, Y., Bolleman, J., Yamaguchi, A.: BioBenchmark toyama 2012: an evaluation of the performance of triple stores on biological data. J. Biomed. Seman. 5(1), 1–11 (2014)



Acknowledgment

The authors would like to thank the anonymous reviewers for their valuable feedback and the companies Ontotext and Oxford Semantic Technologies (OST) for their support during the evaluation. This work has been funded by The Research Council of Norway projects SkyTrack (No 309714), DataBench Norway (No 310134) and SIRIUS Centre (No 237898), and the European Commission projects DataBench (No 780966), VesselAI (No 957237), Iliad (No 101037643), enRichMyData (No 101070284) and Graph-Massivizer (No 101093202).

Author information


Corresponding author

Correspondence to An Ngoc Lam.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Lam, A.N., Elvesæter, B., Martin-Recuerda, F. (2023). Evaluation of a Representative Selection of SPARQL Query Engines Using Wikidata. In: Pesquita, C., et al. The Semantic Web. ESWC 2023. Lecture Notes in Computer Science, vol 13870. Springer, Cham. https://doi.org/10.1007/978-3-031-33455-9_40


  • DOI: https://doi.org/10.1007/978-3-031-33455-9_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-33454-2

  • Online ISBN: 978-3-031-33455-9

  • eBook Packages: Computer Science; Computer Science (R0)
