Abstract
A meaningful analysis and comparison of both existing storage schemes for RDF data and evaluation approaches for SPARQL queries requires a comprehensive and universal benchmark platform. We present SP2Bench, a publicly available, language-specific performance benchmark for the SPARQL query language. SP2Bench is set in the DBLP scenario and comprises a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries. The generated documents mirror key characteristics and social-world distributions encountered in the original DBLP data set, while the queries implement meaningful requests on top of this data, covering a variety of SPARQL operator constellations and RDF access patterns. In this chapter, we discuss requirements and desiderata for SPARQL benchmarks and present the SP2Bench framework, including its data generator, benchmark queries, and performance metrics.
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Schmidt, M., Hornung, T., Meier, M., Pinkel, C., Lausen, G. (2010). SP2Bench: A SPARQL Performance Benchmark. In: de Virgilio, R., Giunchiglia, F., Tanca, L. (eds) Semantic Web Information Management. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04329-1_16
DOI: https://doi.org/10.1007/978-3-642-04329-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04328-4
Online ISBN: 978-3-642-04329-1
eBook Packages: Computer Science, Computer Science (R0)