research-article

Tunable Query Optimizer for Web APIs and User Preferences

Authors:

Tobias Zeimetz,

Ralf Schenkel

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023

Pages 92 - 100

https://doi.org/10.1145/3587259.3627542

Published: 05 December 2023

Abstract

To answer queries, many SPARQL query processors draw on several sources, e.g., various knowledge bases (KBs) or endpoints. RESTful Web APIs are rarely the focus of those systems because they come with many limitations, such as being unable to process SPARQL queries. Moreover, most existing approaches optimize their query plans only for performance, even though users often have additional preferences, e.g., coverage, reliability, or currency. Additionally, data is often provided at different levels of quality, so not all sources should be trusted equally. In this paper, we therefore present TunA, a query engine that combines RESTful Web APIs and local RDF KBs in the form of triple stores while tuning its query plans towards user preferences. Erroneous information from Web APIs is detected using hierarchical agglomerative clustering. Our evaluation shows that TunA outperforms current state-of-the-art systems and is less vulnerable to erroneous information, even in settings where only unreliable sources are available.
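The abstract does not say how the clustering is configured, so the following is only an illustrative sketch of the general idea, not TunA's implementation. It assumes that several sources return a numeric value for the same fact (for example, a publication year), groups the values with single-linkage agglomerative clustering, and treats everything outside the largest cluster as suspect. The function names, the `max_gap` threshold, and the "largest cluster wins" rule are all assumptions made for this example.

```python
from typing import List, Tuple


def agglomerative_clusters(values: List[float], max_gap: float) -> List[List[float]]:
    """Single-linkage agglomerative clustering of 1-D values.

    Starts with one cluster per value and repeatedly merges the two
    closest clusters until the closest pair is farther apart than max_gap.
    """
    clusters = [[v] for v in values]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest cluster pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_gap:
            break  # remaining clusters are too far apart to merge
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


def split_consensus(values: List[float], max_gap: float) -> Tuple[List[float], List[float]]:
    """Return (consensus values, suspect values).

    Assumption for this sketch: the largest cluster is the consensus and
    every value outside it is flagged as possibly erroneous.
    """
    clusters = agglomerative_clusters(values, max_gap)
    consensus = max(clusters, key=len)
    suspects = [v for c in clusters if c is not consensus for v in c]
    return sorted(consensus), sorted(suspects)


# Hypothetical answers from five sources for the same publication year.
consensus, suspects = split_consensus([2019, 2019, 2020, 1919, 2019], max_gap=1)
print(consensus, suspects)
```

A threshold-based stop keeps the sketch free of a fixed cluster count, which seems a natural fit when the number of disagreeing sources is not known in advance; a real system would also need a distance measure for non-numeric values.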


Index Terms

Tunable Query Optimizer for Web APIs and User Preferences
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
        Query planning
  2. World Wide Web
    1. Web applications
      1. Crowdsourcing

Published In

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023

December 2023

270 pages

ISBN:9798400701412

DOI:10.1145/3587259

Editors:
Brent Venable (University of West Florida and Institute for Human and Machine Cognition, Pensacola, FL, USA),
Daniel Garijo (Ontology Engineering Group, Universidad Politécnica de Madrid, Spain),
Brian Jalaian (University of West Florida and Institute for Human & Machine Cognition, Pensacola, FL, USA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Independent Research Fund Denmark
Poul Due Jensen Foundation
IFI Programme of the German Academic Exchange Service

Conference

K-CAP '23

Sponsor:

SIGAI

K-CAP '23: Knowledge Capture Conference 2023

December 5 - 7, 2023

Pensacola, FL, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%
