Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce

Farhan Husain, Mohammad; Doshi, Pankil; Khan, Latifur; Thuraisingham, Bhavani

doi:10.1007/978-3-642-10665-1_72

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 5931)

Included in the following conference series:

IEEE International Conference on Cloud Computing

Abstract

Handling huge amounts of data scalably has long been a matter of concern, and the same is true for semantic web data: current semantic web frameworks lack this ability. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples. We describe our schema for storing RDF data in the Hadoop Distributed File System and present our algorithms for answering a SPARQL query, which use Hadoop’s MapReduce framework. Our results show that huge amounts of semantic web data can be stored in Hadoop clusters built mostly from cheap commodity-class hardware while still answering queries fast enough. We conclude that ours is a scalable framework, able to handle large amounts of RDF data efficiently.
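
The abstract does not give the storage schema or the query algorithms, so the following is only a minimal sketch of the general idea of answering a two-pattern SPARQL query with a map, shuffle, and reduce step. It runs in plain Python rather than on Hadoop, and the triples, the predicate names, and the functions map_phase, shuffle, reduce_phase, and run_query are all hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict

# Toy RDF triples (subject, predicate, object). All names are hypothetical.
TRIPLES = [
    ("alice", "rdf:type", "ub:Student"),
    ("bob", "rdf:type", "ub:Student"),
    ("carol", "rdf:type", "ub:Professor"),
    ("alice", "ub:takesCourse", "course1"),
    ("bob", "ub:takesCourse", "course2"),
    ("carol", "ub:teacherOf", "course1"),
]

# Query being simulated (assumed for illustration):
#   SELECT ?x ?c WHERE { ?x rdf:type ub:Student . ?x ub:takesCourse ?c }


def map_phase(triple):
    """Emit (join key, tagged value) for triples matching either pattern."""
    s, p, o = triple
    if p == "rdf:type" and o == "ub:Student":
        yield s, ("type", None)
    elif p == "ub:takesCourse":
        yield s, ("course", o)


def shuffle(pairs):
    """Group mapper output by key, as the MapReduce shuffle step would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Join on ?x: emit (?x, ?c) only when ?x also matched the type pattern."""
    is_student = any(tag == "type" for tag, _ in values)
    if is_student:
        for tag, obj in values:
            if tag == "course":
                yield key, obj


def run_query(triples):
    """Run map, shuffle, and reduce in-process and return sorted bindings."""
    pairs = [kv for t in triples for kv in map_phase(t)]
    groups = shuffle(pairs)
    results = []
    for key in sorted(groups):
        results.extend(reduce_phase(key, groups[key]))
    return results


if __name__ == "__main__":
    print(run_query(TRIPLES))
```

Keying the mapper output on the shared variable ?x is what lets each reducer see every triple that could bind that variable, so the join happens locally inside one reduce call.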

Author information

Authors and Affiliations

University of Texas at Dallas, Dallas, TX, 75080, USA
Mohammad Farhan Husain, Pankil Doshi, Latifur Khan & Bhavani Thuraisingham

Editor information

Editors and Affiliations

SINTEF ICT, NO-7465, Trondheim, Norway
Martin Gilje Jaatun
School of Computer Science, South China Normal University, Guangzhou, China
Gansen Zhao
Department of Electrical Engineering and Computer Science, University of Stavanger, NO-4036, Stavanger, Norway
Chunming Rong

About this paper

Cite this paper

Farhan Husain, M., Doshi, P., Khan, L., Thuraisingham, B. (2009). Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce. In: Jaatun, M.G., Zhao, G., Rong, C. (eds) Cloud Computing. CloudCom 2009. Lecture Notes in Computer Science, vol 5931. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10665-1_72

DOI: https://doi.org/10.1007/978-3-642-10665-1_72
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10664-4
Online ISBN: 978-3-642-10665-1
eBook Packages: Computer Science, Computer Science (R0)
