Collaborative SPARQL Query Processing for Decentralized Semantic Data

  • Conference paper
Database and Expert Systems Applications (DEXA 2020)

Abstract

Decentralization allows users to regain freedom and control over their digital lives. As a global shared data space, Linked Data already supports decentralization: data providers are free to publish their data on their own web domains, and users can execute decentralized SPARQL queries over multiple data sources. However, decentralization makes query processing challenging, raising the well-known problems of source discovery, answer completeness, and performance. Existing approaches to decentralized SPARQL query processing raise issues related to autonomy and answer completeness. In this paper, we propose Qasino, an original approach for querying decentralized RDF data that targets both answer completeness and source autonomy. Qasino is based on a decentralized random service that allows for discovering all relevant data sources. To speed up query processing, sources executing similar queries cooperate by sharing their intermediate results. Our experimental results demonstrate that collaborative query processing can significantly speed up query processing in a decentralized setup.

Notes

  1. https://github.com/folkvir/qasino-simulation

  2. https://jena.apache.org

  3. https://old.datahub.io/dataset/fu-berlin-diseasome

Acknowledgements

This work was partially funded by the French ANR projects O’Browser (ANR-16-CE25-0005-01) and DeKaloG (ANR-19-CE23-0014-01). Mr. Grall is funded by the GFI company, Nantes, France.

Author information

Corresponding author

Correspondence to Hala Skaf-Molli.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Grall, A., Skaf-Molli, H., Molli, P., Perrin, M. (2020). Collaborative SPARQL Query Processing for Decentralized Semantic Data. In: Hartmann, S., Küng, J., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2020. Lecture Notes in Computer Science, vol 12391. Springer, Cham. https://doi.org/10.1007/978-3-030-59003-1_21

  • DOI: https://doi.org/10.1007/978-3-030-59003-1_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59002-4

  • Online ISBN: 978-3-030-59003-1

  • eBook Packages: Computer Science, Computer Science (R0)
