Abstract
The Resource Description Framework (RDF) and its de facto query language, SPARQL, are well suited to managing and querying online social network (OSN) data, given the scalability and semantic requirements of such data. Although existing work has introduced distributed frameworks for querying large-scale data, improving online query performance remains a challenging task. To address this problem, this paper proposes a scalable RDF data framework that uses a key-value store for offline RDF storage and a pipelined, in-memory query strategy. The proposed framework efficiently supports SPARQL Basic Graph Pattern (BGP) queries on large-scale datasets. Experiments on a benchmark dataset demonstrate that the framework delivers better online SPARQL query performance than existing distributed RDF solutions.
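For readers unfamiliar with the term, a Basic Graph Pattern is a set of triple patterns whose shared variables must bind consistently across all patterns. The following is a minimal single-machine sketch of that idea in Python. It is not the framework proposed in the paper: the triples, the `?`-prefixed variable convention, and the function names `match_pattern` and `evaluate_bgp` are illustrative assumptions, and the naive nested-loop join stands in for whatever distributed, pipelined strategy the paper actually uses.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Binding = Dict[str, str]        # variable name -> bound value


def is_var(term: str) -> bool:
    # Assumed convention: variables start with "?", as in SPARQL syntax.
    return term.startswith("?")


def match_pattern(pattern: Triple, triple: Triple,
                  binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            # A constant must match the triple's term exactly.
            return None
    return extended


def evaluate_bgp(patterns: List[Triple],
                 triples: Iterable[Triple]) -> List[Binding]:
    """Join the triple patterns one at a time with a naive nested loop."""
    data = list(triples)
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for triple in data:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# Hypothetical social-network triples, loosely in the style of FOAF.
TRIPLES: List[Triple] = [
    ("alice", "foaf:knows", "bob"),
    ("bob", "foaf:knows", "carol"),
    ("bob", "foaf:name", "Bob"),
]

# BGP: whom does ?x know, and what is that person's name?
PATTERNS: List[Triple] = [
    ("?x", "foaf:knows", "?y"),
    ("?y", "foaf:name", "?n"),
]

if __name__ == "__main__":
    for row in evaluate_bgp(PATTERNS, TRIPLES):
        print(row)
```

Each pattern after the first acts as a join on the variables it shares with earlier patterns (here `?y`), which is why join cost dominates BGP evaluation as datasets grow.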
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, Z., Chen, W., Gai, L., Wang, T. (2015). SparkRDF: In-Memory Distributed RDF Management Framework for Large-Scale Social Data. In: Dong, X., Yu, X., Li, J., Sun, Y. (eds) Web-Age Information Management. WAIM 2015. Lecture Notes in Computer Science, vol 9098. Springer, Cham. https://doi.org/10.1007/978-3-319-21042-1_27
DOI: https://doi.org/10.1007/978-3-319-21042-1_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21041-4
Online ISBN: 978-3-319-21042-1
eBook Packages: Computer Science, Computer Science (R0)