Architecture for distributed query processing using the RDF data in cloud environment

  • Special Issue
  • Journal: Evolutionary Intelligence

Abstract

Over the past decade, advances in RDF data management have posed many challenges to researchers. Processing large volumes of RDF data in the cloud is a very difficult task. RDF data consists of complex graphs along with a large number of schemas. Distributing RDF data with traditional approaches, or partitioning it with conventional mechanisms, leads to faulty distribution and generates a large number of join operations. To address these issues, this paper develops an architecture for distributed query processing that uses an adaptive hash partitioning approach together with hash join operations. It also develops an algorithm for executing queries while minimizing the number of joins. The proposed model is evaluated against other standard models, and the experimental results show that it achieves a faster response time than those models.
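As a rough illustration of the two building blocks named in the abstract, the following is a minimal sketch of plain subject-based hash partitioning combined with a build/probe hash join. It is not the paper's adaptive algorithm, whose details are not given here. The function names, the use of CRC32 as the hash, and the sample triples are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict


def partition_by_subject(triples, num_nodes):
    """Assign each (s, p, o) triple to a node by a stable hash of its subject.

    Assumption: CRC32 is used only because it is deterministic across runs.
    """
    nodes = [[] for _ in range(num_nodes)]
    for s, p, o in triples:
        nodes[zlib.crc32(s.encode("utf-8")) % num_nodes].append((s, p, o))
    return nodes


def hash_join_on_subject(left, right):
    """Build/probe hash join of two triple lists on their subject."""
    table = defaultdict(list)
    for s, _p, o in left:              # build phase: index left side by subject
        table[s].append(o)
    joined = []
    for s, _p, o in right:             # probe phase: look up matching subjects
        for left_obj in table.get(s, []):
            joined.append((s, left_obj, o))
    return joined


def star_query(nodes, pred_a, pred_b):
    """Evaluate the pattern { ?s pred_a ?x . ?s pred_b ?y } node by node.

    All triples sharing a subject live on the same node, so this
    subject-subject join needs no data exchange between nodes.
    """
    results = []
    for local in nodes:
        left = [t for t in local if t[1] == pred_a]
        right = [t for t in local if t[1] == pred_b]
        results.extend(hash_join_on_subject(left, right))
    return results


# Hypothetical sample data, for illustration only.
triples = [
    ("alice", "worksFor", "uni"),
    ("alice", "name", "Alice"),
    ("bob", "name", "Bob"),
]
nodes = partition_by_subject(triples, 3)
matches = star_query(nodes, "worksFor", "name")
```

With this layout, star-shaped patterns that join on a shared subject are answered locally on each node. Patterns that join a subject to an object can still span nodes, which is the kind of cross-node join the abstract says the proposed approach aims to reduce.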



Author information

Corresponding author

Correspondence to C. Ranichandra.

About this article

Cite this article

Ranichandra, C., Tripathy, B.K. Architecture for distributed query processing using the RDF data in cloud environment. Evol. Intel. 14, 567–575 (2021). https://doi.org/10.1007/s12065-019-00315-5

  • DOI: https://doi.org/10.1007/s12065-019-00315-5
