RDF Data Storage and Query Processing Schemes: A Survey

Authors:

Manfred Hauswirth,

Philippe Cudré-Mauroux,

Sherif Sakr

ACM Computing Surveys (CSUR), Volume 51, Issue 4

Article No.: 84, Pages 1 - 36

https://doi.org/10.1145/3177850

Published: 06 September 2018

Abstract

The Resource Description Framework (RDF) is a core building block of Linked Data and the Semantic Web and serves as their data representation format. It provides a generic graph-based data model for describing things, including their relationships with other things. As RDF datasets grow rapidly in size, RDF data management systems must be able to cope with ever-larger amounts of data. Although RDF data can be stored physically in a single relational table, querying a giant triple table becomes very expensive because of the multiple nested joins required to answer graph queries. In addition, the heterogeneity of RDF data poses entirely new challenges to database systems. This article provides a comprehensive study of the state of the art in handling and querying RDF data. In particular, we focus on data storage techniques, indexing strategies, and query execution mechanisms. We also classify existing systems and approaches, give an overview of the various benchmarking efforts in this context, and discuss some of the open problems in this domain.
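
To make the join cost mentioned in the abstract concrete, the following is a minimal sketch of a single triple table queried with a three-step graph pattern. It is an illustration under stated assumptions, not a method taken from the survey: the table name `triples`, the columns `s`, `p`, `o`, the toy data, and the helper functions are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical toy data (subject, predicate, object); not taken from the survey.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "worksAt", "acme"),
    ("acme", "locatedIn", "berlin"),
    ("alice", "knows", "carol"),
    ("carol", "worksAt", "initech"),
    ("initech", "locatedIn", "austin"),
]


def build_triple_table(triples):
    """Load all triples into one three-column relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    return conn


# Graph pattern: ?x knows ?y . ?y worksAt ?z . ?z locatedIn ?city
# Every additional triple pattern adds one more self-join on the same table.
PATH_QUERY = """
SELECT t1.s, t3.o
FROM triples AS t1
JOIN triples AS t2 ON t2.s = t1.o
JOIN triples AS t3 ON t3.s = t2.o
WHERE t1.p = 'knows' AND t2.p = 'worksAt' AND t3.p = 'locatedIn'
ORDER BY t3.o
"""


def cities_of_acquaintances(conn):
    """Return (person, city) pairs reachable through the three-step pattern."""
    return conn.execute(PATH_QUERY).fetchall()


conn = build_triple_table(TRIPLES)
result = cities_of_acquaintances(conn)
print(result)
```

A pattern with three triple patterns already needs two self-joins here; longer patterns add one join each, which is the effect the abstract points to when the table is very large.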

Supplementary Material

a84-wylot-apndx.pdf (wylot.zip)

Supplemental movie, appendix, image and software files for RDF Data Storage and Query Processing Schemes: A Survey



Index Terms

1. Information systems
  1. Data management systems
    1. Database design and models
      1. Data model extensions
        Semi-structured data
      2. Graph-based database models
    2. Query languages


Published In

ACM Computing Surveys Volume 51, Issue 4

July 2019

765 pages

ISSN:0360-0300

EISSN:1557-7341

DOI:10.1145/3236632

Editor:
Sartaj Sahni
Department of Computer and Information Science and Engineering / University of Florida / Gainesville, FL 32611

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 September 2018

Accepted: 01 December 2017

Revised: 01 December 2017

Received: 01 November 2016

Published in CSUR Volume 51, Issue 4


Qualifiers

Survey
Research
Refereed

Funding Sources

Estonian Research Council
