Towards Big Linked Data: A Large-Scale, Distributed Semantic Data Storage

Bo Hu, Nuno Carvalho, Takahide Matsutsuka
Copyright: © 2013 | Volume: 9 | Issue: 4 | Pages: 25
ISSN: 1548-3924 | EISSN: 1548-3932 | EISBN13: 9781466635883 | DOI: 10.4018/ijdwm.2013100102
Cite Article

MLA

Hu, Bo, et al. "Towards Big Linked Data: A Large-Scale, Distributed Semantic Data Storage." IJDWM, vol. 9, no. 4, 2013, pp. 19-43. http://doi.org/10.4018/ijdwm.2013100102

APA

Hu, B., Carvalho, N., & Matsutsuka, T. (2013). Towards Big Linked Data: A Large-Scale, Distributed Semantic Data Storage. International Journal of Data Warehousing and Mining (IJDWM), 9(4), 19-43. http://doi.org/10.4018/ijdwm.2013100102

Chicago

Hu, Bo, Nuno Carvalho, and Takahide Matsutsuka. "Towards Big Linked Data: A Large-Scale, Distributed Semantic Data Storage." International Journal of Data Warehousing and Mining (IJDWM) 9, no. 4 (2013): 19-43. http://doi.org/10.4018/ijdwm.2013100102

Abstract

In light of the challenges of effectively managing Big Data, the authors are witnessing a gradual shift towards the increasingly popular Linked Open Data (LOD) paradigm. LOD aims to impose a machine-readable semantic layer over structured as well as unstructured data and hence automate some data analysis tasks that were not originally designed for computers. The convergence of Big Data and LOD is, however, not straightforward: the semantic layer of LOD does not fit easily onto the large-scale storage used for Big Data. Meanwhile, the sheer data size envisioned by Big Data rules out certain computationally expensive semantic technologies, which become far less efficient than they are on relatively small data sets. In this paper, the authors propose a mechanism that allows LOD to take advantage of existing large-scale data stores while retaining its “semantic” nature. The authors demonstrate how RDF-based semantic models can be distributed across multiple storage servers and examine how a fundamental semantic operation can be tuned to meet the requirements of distributed and parallel data processing. Future work will focus on stress testing the platform at the scale of tens of billions of triples, as well as on comparative studies of usability and performance against similar offerings.
