Abstract
In this paper, we present a performance evaluation of five representative RDF triplestores (GraphDB, Jena Fuseki, Neptune, RDFox, and Stardog) and one experimental SPARQL query engine, QLever. We compare import, loading, and export times using a complete version of the Wikidata knowledge graph, and we evaluate query performance using 328 queries defined by Wikidata users. To put this evaluation into context with previous evaluations, we also analyze the query performance of these systems on a prominent synthetic benchmark: SP\(^2\)Bench. We observed that most of the systems considered in the evaluation completed almost all of the user-defined queries within the timeout we established. We noticed, however, that the time most systems need to import and export Wikidata may exceed what is acceptable in some industrial and academic projects, where information is represented, enriched, and stored using several different representations.
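For illustration, the user-defined workloads mentioned above consist of SPARQL queries over Wikidata. The following minimal sketch (not taken from the paper's query set) uses the standard Wikidata prefixes `wd:` and `wdt:` to retrieve instances of human (`wd:Q5`) together with their date of birth (`wdt:P569`):

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Humans and their dates of birth; LIMIT keeps the result set small
SELECT ?person ?birthDate WHERE {
  ?person wdt:P31  wd:Q5 .        # instance of: human
  ?person wdt:P569 ?birthDate .   # date of birth
}
LIMIT 10
```

Real user queries from the Wikidata examples page are often considerably more complex, combining property paths, OPTIONAL clauses, and aggregation, which is what makes them a demanding workload for the evaluated engines.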
References
Aluç, G., Hartig, O., Özsu, M.T., Daudjee, K.: Diversified stress testing of RDF data management systems. In: Mika, P., et al. (eds.) ISWC 2014. LNCS, vol. 8796, pp. 197–212. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-11964-9_13
Amazon AWS: Amazon Neptune Official Website. https://aws.amazon.com/neptune/
Amazon Web Services: Amazon EC2 Instance Types - Memory Optimized. https://aws.amazon.com/ec2/instance-types/#Memory_Optimized. Accessed 12 Dec 2022
Amazon Web Services: Amazon Neptune Pricing. https://aws.amazon.com/neptune/pricing/. Accessed 12 Dec 2022
Angles, R., Aranda, C.B., Hogan, A., Rojas, C., Vrgoč, D.: WDBench: a Wikidata graph query benchmark. In: The Semantic Web – ISWC 2022. LNCS, vol. 13489, pp. 714–731. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19433-7_41
Apache Jena: Apache Jena Fuseki Documentation. https://jena.apache.org/documentation/fuseki2/
Apache Jena: Apache Jena TDB xloader. https://jena.apache.org/documentation/tdb/tdb-xloader.html. Accessed 12 Dec 2022
Bail, S., et al.: FishMark: a linked data application benchmark. CEUR (2012)
Bast, H., Buchhold, B.: QLever GitHub repository. https://github.com/ad-freiburg/qlever
Bizer, C., Schultz, A.: The Berlin SPARQL benchmark. Int. J. Seman. Web Inf. Syst. (IJSWIS) 5(2), 1–24 (2009)
Blazegraph: Blazegraph Official Website. https://blazegraph.com/
Demartini, G., Enchev, I., Wylot, M., Gapany, J., Cudré-Mauroux, P.: BowlognaBench—benchmarking RDF analytics. In: Aberer, K., Damiani, E., Dillon, T. (eds.) SIMPDA 2011. LNBIP, vol. 116, pp. 82–102. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-34044-4_5
Erling, O., et al.: The LDBC social network benchmark: interactive workload. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 619–630 (2015)
Fahl, W., Holzheim, T., Westerinen, A., Lange, C., Decker, S.: Getting and hosting your own copy of Wikidata. In: Proceedings of the 3rd Wikidata Workshop 2022. CEUR-WS.org (2022). https://ceur-ws.org/Vol-3262/paper9.pdf
GitHub: Analysis and supplementary information for the paper, including queries, execution logs, query results and scripts. https://github.com/SINTEF-9012/rdf-triplestore-benchmark. Accessed 13 Mar 2023
Guo, Y., Pan, Z., Heflin, J.: LUBM: a benchmark for OWL knowledge base systems. J. Web Seman. 3(2–3), 158–182 (2005)
Hogan, A., Riveros, C., Rojas, C., Soto, A.: A worst-case optimal join algorithm for SPARQL. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11778, pp. 258–275. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30793-6_15
Ma, L., Yang, Y., Qiu, Z., Xie, G., Pan, Y., Liu, S.: Towards a complete OWL ontology benchmark. In: Sure, Y., Domingue, J. (eds.) ESWC 2006. LNCS, vol. 4011, pp. 125–139. Springer, Heidelberg (2006). https://doi.org/10.1007/11762256_12
Morsey, M., Lehmann, J., Auer, S., Ngonga Ngomo, A.-C.: DBpedia SPARQL benchmark – performance assessment with real queries on real data. In: Aroyo, L., et al. (eds.) ISWC 2011. LNCS, vol. 7031, pp. 454–469. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-25073-6_29
Ontotext: GraphDB Official Website. https://graphdb.ontotext.com/
Ontotext: GraphDB Requirements. https://graphdb.ontotext.com/documentation/enterprise/requirements.html. Accessed 12 Dec 2022
Oxford Semantic Technologies: RDFox Documentation: Managing Data Stores. https://docs.oxfordsemantic.tech/5.4/data-stores.html#. Accessed 12 Dec 2022
Oxford Semantic Technologies: RDFox Documentation: Operations on Data Stores, persist-ds. https://docs.oxfordsemantic.tech/5.4/data-stores.html#persist-ds. Accessed 12 Dec 2022
Oxford Semantic Technologies: RDFox Official Website. https://www.oxfordsemantic.tech/product
Saleem, M., Ali, M.I., Hogan, A., Mehmood, Q., Ngomo, A.-C.N.: LSQ: the linked SPARQL queries dataset. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9367, pp. 261–269. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25010-6_15
Saleem, M., Mehmood, Q., Ngonga Ngomo, A.-C.: FEASIBLE: a feature-based SPARQL benchmark generation framework. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9366, pp. 52–69. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-25007-6_4
Saleem, M., Szárnyas, G., Conrads, F., Bukhari, S.A.C., Mehmood, Q., Ngonga Ngomo, A.-C.: How representative is a SPARQL benchmark? An analysis of RDF triplestore benchmarks. In: The World Wide Web Conference, pp. 1623–1633 (2019)
Schmidt, M., Hornung, T., Lausen, G., Pinkel, C.: SP\(^2\)Bench: A SPARQL performance benchmark. In: 2009 IEEE 25th International Conference on Data Engineering, pp. 222–233. IEEE (2009)
Singh, G., Bhatia, S., Mutharaju, R.: OWL2Bench: a benchmark for OWL 2 reasoners. In: Pan, J.Z., et al. (eds.) ISWC 2020. LNCS, vol. 12507, pp. 81–96. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-62466-8_6
Stardog: Stardog Capacity Planning. https://docs.stardog.com/operating-stardog/server-administration/capacity-planning. Accessed 12 Dec 2022
Stardog: Stardog Official Website. https://www.stardog.com/
Stardog: 7 Steps to Fast SPARQL Queries. https://www.stardog.com/blog/7-steps-to-fast-sparql-queries/ (2017). Accessed 12 Dec 2022
Szárnyas, G., Izsó, B., Ráth, I., Varró, D.: The train benchmark: cross-technology performance evaluation of continuous model queries. Softw. Syst. Model. 17(4), 1365–1393 (2018)
Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Commun. ACM 57(10), 78–85 (2014)
W3C: RDF 1.1 Concepts and Abstract Syntax, W3C Recommendation (2014). https://www.w3.org/TR/rdf11-concepts/. Accessed 12 Dec 2022
W3C: SPARQL 1.1 Query Language, W3C Recommendation (2013). https://www.w3.org/TR/sparql11-query/. Accessed 12 Dec 2022
WDQS Search Team: WDQS Backend Alternatives: The process, details and result. Technical report, Wikimedia Foundation (2022). https://www.wikidata.org/wiki/File:WDQS_Backend_Alternatives_working_paper.pdf
Wikidata: SPARQL query service/queries/examples. https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples. Accessed 12 Dec 2022
Wikidata: SPARQL query service/WDQS backend update. https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update. Accessed 12 Dec 2022
Wu, H., Fujiwara, T., Yamamoto, Y., Bolleman, J., Yamaguchi, A.: BioBenchmark Toyama 2012: an evaluation of the performance of triple stores on biological data. J. Biomed. Seman. 5(1), 1–11 (2014)
Acknowledgment
The authors would like to thank the anonymous reviewers for their valuable feedback, and the companies Ontotext and Oxford Semantic Technologies (OST) for their support during the evaluation. This work has been funded by the Research Council of Norway projects SkyTrack (no. 309714), DataBench Norway (no. 310134), and the SIRIUS Centre (no. 237898), and by the European Commission projects DataBench (no. 780966), VesselAI (no. 957237), Iliad (no. 101037643), enRichMyData (no. 101070284), and Graph-Massivizer (no. 101093202).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Lam, A.N., Elvesæter, B., Martin-Recuerda, F. (2023). Evaluation of a Representative Selection of SPARQL Query Engines Using Wikidata. In: Pesquita, C., et al. The Semantic Web. ESWC 2023. Lecture Notes in Computer Science, vol 13870. Springer, Cham. https://doi.org/10.1007/978-3-031-33455-9_40
Print ISBN: 978-3-031-33454-2
Online ISBN: 978-3-031-33455-9