Abstract
Over the past decade, advances in RDF data management have posed many challenges for researchers. Processing large volumes of RDF data in the cloud is a difficult task. RDF data contains complex graphs along with a large number of schemas. Distributing RDF data with traditional approaches, or partitioning it with conventional mechanisms, leads to faulty distribution and generates a large number of join operations. To address these issues, this paper develops an architecture for distributed query processing that uses an adaptive hash partitioning approach together with a hash join operation. It also develops an algorithm that executes a query while minimizing the number of joins. The proposed model is evaluated against other standard models. The experimental results show that the proposed method has a faster response time than the other standard models.
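To make the two building blocks named in the abstract concrete, the sketch below shows plain hash partitioning of RDF triples by subject, followed by a textbook hash join between two triple patterns. It is only an illustration under stated assumptions: the adaptive part of the paper's partitioning scheme and its join-minimizing algorithm are not described in the abstract and are not reproduced here. The function names `partition_by_subject` and `hash_join`, the use of CRC32 as the hash, and the sample triples are all hypothetical.

```python
import zlib
from collections import defaultdict


def partition_by_subject(triples, num_nodes):
    """Assign each (subject, predicate, object) triple to a node.

    Assumption: plain (non-adaptive) hashing on the subject. CRC32 is used
    because it is stable across processes, unlike Python's built-in hash().
    """
    partitions = defaultdict(list)
    for s, p, o in triples:
        node = zlib.crc32(s.encode("utf-8")) % num_nodes
        partitions[node].append((s, p, o))
    return partitions


def hash_join(left, right, left_key, right_key):
    """Textbook hash join: build a hash table on `left`, probe with `right`.

    `left_key` and `right_key` are tuple positions (0 = subject,
    1 = predicate, 2 = object) whose values must match.
    """
    table = defaultdict(list)
    for row in left:
        table[row[left_key]].append(row)
    return [(l, r) for r in right for l in table.get(r[right_key], [])]


# Hypothetical sample data, not taken from the paper.
triples = [
    ("alice", "worksAt", "acme"),
    ("alice", "knows", "bob"),
    ("bob", "worksAt", "acme"),
]

partitions = partition_by_subject(triples, num_nodes=4)

# Pattern: ?x knows ?y . ?y worksAt ?z  (join on object = subject)
knows = [t for t in triples if t[1] == "knows"]
works_at = [t for t in triples if t[1] == "worksAt"]
matches = hash_join(knows, works_at, left_key=2, right_key=0)
```

Because every triple with the same subject lands on the same node, a star-shaped pattern around one subject can be answered without moving data between nodes. A pattern such as the one above, which joins an object to a subject, may still span nodes, and that is the kind of join a partitioning scheme tries to keep rare.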
Cite this article
Ranichandra, C., Tripathy, B.K. Architecture for distributed query processing using the RDF data in cloud environment. Evol. Intel. 14, 567–575 (2021). https://doi.org/10.1007/s12065-019-00315-5
DOI: https://doi.org/10.1007/s12065-019-00315-5