DOI: 10.1145/3587259.3627542
Tunable Query Optimizer for Web APIs and User Preferences

Published: 05 December 2023

Abstract

To answer queries, many SPARQL query processors draw on multiple sources, e.g., various knowledge bases (KBs) or endpoints. RESTful Web APIs are rarely the focus of these systems because they come with many limitations, such as being unable to process SPARQL queries. Moreover, most existing approaches optimize their query plans only for performance, even though users often have additional preferences, e.g., coverage, reliability, or currency. In addition, sources provide data at different levels of quality, so not all of them should be trusted equally. In this paper, we therefore present TunA, a query engine that combines RESTful Web APIs and local RDF KBs (in the form of triple stores) while tuning its query plans towards user preferences. Erroneous information from Web APIs is detected using hierarchical agglomerative clustering. Our evaluation shows that TunA outperforms current state-of-the-art systems and is less vulnerable to erroneous information, even in settings where only unreliable sources are available.
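
The abstract does not say how the clustering is configured, so the following is only a minimal sketch of the general idea, not TunA's actual implementation. It assumes that several sources return numeric values for the same fact, that single linkage with a fixed distance `threshold` is used, and that the largest cluster is the one to trust. The function names `agglomerative_clusters` and `split_outliers` are hypothetical.

```python
from typing import List, Tuple


def agglomerative_clusters(values: List[float], threshold: float) -> List[List[float]]:
    """Bottom-up (agglomerative) clustering of 1-D values with single linkage.

    Assumption: merging stops once the two closest clusters are farther
    apart than `threshold`; the source does not specify a stopping rule.
    """
    clusters = [[v] for v in values]  # start with one cluster per value

    def linkage(a: List[float], b: List[float]) -> float:
        # Single linkage: distance between the closest pair of members.
        return min(abs(x - y) for x in a for y in b)

    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def split_outliers(values: List[float], threshold: float) -> Tuple[List[float], List[float]]:
    """Treat the largest cluster as trusted and everything else as suspect.

    Assumption: majority agreement indicates correctness.
    """
    clusters = agglomerative_clusters(values, threshold)
    majority = max(clusters, key=len)
    suspects = [v for c in clusters if c is not majority for v in c]
    return majority, suspects


# Hypothetical publication years reported by five sources for one entity.
trusted, suspects = split_outliers([1999, 1999, 2000, 1999, 1899], threshold=2)
print(trusted, suspects)
```

In this toy input, four sources agree to within one year and form the majority cluster, while the value 1899 ends up in a cluster of its own and is flagged as suspect. A real engine would need a distance measure suited to strings and other non-numeric values.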

Published In

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023
December 2023
270 pages
ISBN: 9798400701412
DOI: 10.1145/3587259
Editors: Brent Venable, Daniel Garijo, Brian Jalaian
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Quality Estimation
  2. Query Rewriting
  3. RESTful Web APIs

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Independent Research Fund Denmark
  • Poul Due Jensen Foundation
  • IFI Programme of the German Academic Exchange Service

Conference

K-CAP '23: Knowledge Capture Conference 2023
December 5 - 7, 2023
Pensacola, FL, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%
