Abstract
With the rapid development of the Semantic Web and related areas, the amount of Resource Description Framework (RDF) data has increased significantly. Managing this mass of RDF data efficiently has become a challenging task and has attracted considerable research attention. This paper surveys state-of-the-art RDF storage and query technologies, organized according to several classification criteria. In addition, several prevailing benchmark datasets are introduced and compared. Finally, future research challenges and opportunities are discussed.







Notes
GraphDB is the newest version of OWLIM.
Blazegraph is the newest version of Bigdata.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Nos. 61471035, 61601129) and the Double First-Class Construction Program of USC (No. 2017SYL16).
About this article
Cite this article
Pan, Z., Zhu, T., Liu, H. et al. A survey of RDF management technologies and benchmark datasets. J Ambient Intell Human Comput 9, 1693–1704 (2018). https://doi.org/10.1007/s12652-018-0876-2
DOI: https://doi.org/10.1007/s12652-018-0876-2