RDF Data Storage and Query Processing Schemes: A Survey

Published: 06 September 2018

Abstract

The Resource Description Framework (RDF) is a core data representation format for Linked Data and the Semantic Web. It supports a generic graph-based data model for describing things, including their relationships with other things. As RDF datasets grow rapidly in size, RDF data management systems must be able to cope with ever larger amounts of data. Although RDF data can be physically stored in a single relational table, querying a giant triple table becomes very expensive because of the multiple nested joins required to answer graph queries. In addition, the heterogeneity of RDF data poses entirely new challenges to database systems. This article provides a comprehensive study of the state of the art in handling and querying RDF data. In particular, we focus on data storage techniques, indexing strategies, and query execution mechanisms. Moreover, we provide a classification of existing systems and approaches. We also provide an overview of the various benchmarking efforts in this context and discuss some of the open problems in this domain.
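
To make the join cost mentioned in the abstract concrete, the following is a minimal sketch, not taken from the article, of a single triple table held in SQLite. The table name `triples`, the column names `s`, `p`, `o`, the sample data, and the helper `people_in_city` are all hypothetical. It shows that a graph pattern with two triple patterns already needs one self-join of the table; each additional pattern would add another.

```python
import sqlite3

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("alice", "worksFor", "acme"),
    ("bob", "worksFor", "acme"),
    ("acme", "locatedIn", "berlin"),
    ("alice", "knows", "bob"),
]


def people_in_city(city):
    """Match the pattern: ?person worksFor ?org . ?org locatedIn <city>.

    Both triple patterns read the same table, so the query is a self-join.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", TRIPLES)
    rows = conn.execute(
        """
        SELECT t1.s
        FROM triples AS t1
        JOIN triples AS t2 ON t1.o = t2.s  -- join on the shared variable ?org
        WHERE t1.p = 'worksFor'
          AND t2.p = 'locatedIn'
          AND t2.o = ?
        ORDER BY t1.s
        """,
        (city,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(people_in_city("berlin"))  # ['alice', 'bob']
```

A pattern with n triple patterns would translate into n - 1 self-joins in this layout, which is the kind of cost the storage and indexing schemes covered by the survey aim to reduce.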

Published In

ACM Computing Surveys, Volume 51, Issue 4
July 2019
765 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3236632
Editor: Sartaj Sahni
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 September 2018
Accepted: 01 December 2017
Revised: 01 December 2017
Received: 01 November 2016
Published in CSUR Volume 51, Issue 4

Author Tags

  1. RDF
  2. SPARQL
  3. semi-structured data

Qualifiers

  • Survey
  • Research
  • Refereed

Funding Sources

  • Estonian Research Council
