Federated SPARQL Queries Processing with Replicated Fragments

Montoya, Gabriela; Skaf-Molli, Hala; Molli, Pascal; Vidal, Maria-Esther

doi:10.1007/978-3-319-25007-6_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9366)

Included in the following conference series:

International Semantic Web Conference

Abstract

Federated query engines provide a unified query interface to federations of SPARQL endpoints. Replicating data fragments from different Linked Data sources facilitates data re-organization to better fit federated query processing needs of data consumers. However, existing federated query engines are not designed to support replication and replicated data can negatively impact their performance. In this paper, we formulate the source selection problem with fragment replication (SSP-FR). For a given set of endpoints with replicated fragments and a SPARQL query, the problem is to select the endpoints that minimize the number of tuples to be transferred. We devise the Fedra source selection algorithm that approximates SSP-FR. We implement Fedra in the state-of-the-art federated query engines FedX and ANAPSID, and empirically evaluate their performance. Experimental results suggest that Fedra efficiently solves SSP-FR, reducing the number of selected SPARQL endpoints as well as the size of query intermediate results.
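
The selection step described in the abstract can be illustrated with a small sketch. It is not the Fedra algorithm itself: it assumes a simplified setting in which each triple pattern of a query is mapped to the set of endpoints holding a relevant replicated fragment, and it applies the standard greedy set-cover heuristic to pick few endpoints. It does not model the number of transferred tuples directly. The function name `greedy_source_selection`, the pattern labels `tp1`..`tp3`, and the endpoint names `E1`..`E3` are hypothetical.

```python
from typing import Dict, Set


def greedy_source_selection(candidates: Dict[str, Set[str]]) -> Dict[str, str]:
    """Assign each triple pattern to one endpoint with a greedy set-cover heuristic.

    `candidates` maps a triple pattern label to the endpoints that hold a
    relevant replicated fragment (an assumed, simplified input format).
    Patterns with no candidate endpoint are left unassigned.
    """
    uncovered = {tp for tp, eps in candidates.items() if eps}
    assignment: Dict[str, str] = {}
    while uncovered:
        # For every endpoint, collect the still-uncovered patterns it can answer.
        coverage: Dict[str, Set[str]] = {}
        for tp in uncovered:
            for ep in candidates[tp]:
                coverage.setdefault(ep, set()).add(tp)
        # Pick the endpoint covering the most patterns; break ties by name
        # so the result is deterministic.
        best = min(coverage, key=lambda ep: (-len(coverage[ep]), ep))
        for tp in coverage[best]:
            assignment[tp] = best
        uncovered -= coverage[best]
    return assignment


# Hypothetical federation: three triple patterns, three endpoints.
candidates = {
    "tp1": {"E1", "E2"},
    "tp2": {"E2", "E3"},
    "tp3": {"E3"},
}
assignment = greedy_source_selection(candidates)
selected_endpoints = set(assignment.values())
print(assignment)
print(sorted(selected_endpoints))
```

In this toy input, contacting every endpoint that holds a relevant fragment would involve three endpoints, while the greedy choice covers all three patterns with two. Assigning several patterns to the same endpoint also lets their join be evaluated at that endpoint, which is one way smaller intermediate results can arise.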

Author information

Authors and Affiliations

LINA – Nantes University, Nantes, France
Gabriela Montoya, Hala Skaf-Molli & Pascal Molli
Unit UMR6241 CNRS, Nantes, France
Gabriela Montoya
Universidad Simón Bolívar, Caracas, Venezuela
Maria-Esther Vidal

Corresponding author

Correspondence to Gabriela Montoya.

Editor information

Editors and Affiliations

Pontificia Universidad Católica de Chile, Santiago de Chile, Chile
Marcelo Arenas
Universidad Politecnica de Madrid, Boadilla del Monte, Spain
Oscar Corcho
University of Southampton, Southampton, United Kingdom
Elena Simperl
Department of Computational Social Science, GESIS Leibniz-Institut, Köln, Nordrhein-Westfalen, Germany
Markus Strohmaier
The Open University, Milton Keynes, United Kingdom
Mathieu d'Aquin
IBM Research, Yorktown Heights, New York, USA
Kavitha Srinivas
Elsevier Labs., Amsterdam, The Netherlands
Paul Groth
School of Medicine, Stanford University, Stanford, California, USA
Michel Dumontier
Lehigh University, Bethlehem, Pennsylvania, USA
Jeff Heflin
Wright State University, Dayton, Ohio, USA
Krishnaprasad Thirunarayan
University of Koblenz-Landau, Koblenz, Rheinland-Pfalz, Germany
Steffen Staab

About this paper

Cite this paper

Montoya, G., Skaf-Molli, H., Molli, P., Vidal, ME. (2015). Federated SPARQL Queries Processing with Replicated Fragments. In: Arenas, M., et al. The Semantic Web - ISWC 2015. ISWC 2015. Lecture Notes in Computer Science, vol 9366. Springer, Cham. https://doi.org/10.1007/978-3-319-25007-6_3

DOI: https://doi.org/10.1007/978-3-319-25007-6_3
Published: 30 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25006-9
Online ISBN: 978-3-319-25007-6
eBook Packages: Computer Science, Computer Science (R0)
