A Parallel Processing Architecture to Optimize Runtime in Aggregated SPARQL Queries

Authors:

Rachida Fissoune,

Hassan Badir

MEDES '22: Proceedings of the 14th International Conference on Management of Digital EcoSystems

Pages 9 - 15

https://doi.org/10.1145/3508397.3564836

Published: 08 December 2022

Abstract

Searching for information has become an essential need, yet the information sought often cannot be found in a single data source: answering a query may require collecting parts of the answer from several distributed data sources. Our work aims to build an aggregated search engine that answers a query by collecting data from independent data sources through a single user interface. Query processing in our system goes through several steps before the final answers are returned. Processing speed is one of the main qualities of any search engine, and it can degrade when the engine interacts with several data sources, as ours does. In this paper, we propose a solution to optimize runtime in our aggregated search system. First, we evaluate the runtime of each processing step to identify the costliest step in terms of execution time. Then, we propose a parallel processing architecture that optimizes runtime without any data loss. The experimental results confirm the efficiency of the proposed architecture.
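The abstract does not describe the architecture itself, so the following is only a minimal sketch of the general idea, assuming that the costly step is contacting each data source in turn. The source names, the `ex:` identifiers, and the functions `query_source`, `aggregate_sequential`, and `aggregate_parallel` are hypothetical and stand in for real SPARQL endpoints; the `time.sleep` call simulates network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for independent data sources.
# Each maps a subject to a list of (predicate, object) pairs.
SOURCES = {
    "source_a": {"ex:s": [("ex:p1", "ex:o1")]},
    "source_b": {"ex:s": [("ex:p2", "ex:o2")]},
    "source_c": {"ex:s": [("ex:p1", "ex:o1"), ("ex:p3", "ex:o3")]},
}


def query_source(name, subject, delay=0.05):
    """Simulate one remote lookup; the sleep stands in for network latency."""
    time.sleep(delay)
    return [(subject, p, o) for p, o in SOURCES[name].get(subject, [])]


def aggregate_sequential(subject):
    """Baseline: contact the sources one after another."""
    results = []
    for name in SOURCES:
        results.extend(query_source(name, subject))
    # Drop duplicates returned by overlapping sources; no distinct triple is lost.
    return sorted(set(results))


def aggregate_parallel(subject):
    """Contact all sources concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        partials = pool.map(lambda name: query_source(name, subject), SOURCES)
        results = [triple for part in partials for triple in part]
    return sorted(set(results))


if __name__ == "__main__":
    for fn in (aggregate_sequential, aggregate_parallel):
        start = time.perf_counter()
        triples = fn("ex:s")
        elapsed = time.perf_counter() - start
        print(f"{fn.__name__}: {len(triples)} triples in {elapsed:.3f}s")
```

Threads fit this sketch because the simulated work is I/O-bound waiting. Both functions return the same set of triples, which is one way to check that the parallel variant loses no data relative to the sequential baseline.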

Index Terms

A Parallel Processing Architecture to Optimize Runtime in Aggregated SPARQL Queries
1. Computer systems organization
  1. Architectures
    1. Parallel architectures

Published In

MEDES '22: Proceedings of the 14th International Conference on Management of Digital EcoSystems

October 2022

172 pages

ISBN:9781450392198

DOI:10.1145/3508397

General Chairs:
Ernesto Damiani (Khalifa University, UAE),
Claudio Silvestri (Università Ca' Foscari di Venezia, Italy),
Mirjana Ivanovic (University of Novi Sad, Serbia),
Richard Chbeir (University of Pau and the Adour Region, France),
Yannis Manolopoulos (Open University of Cyprus, Cyprus)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MEDES '22

MEDES '22: International Conference on Management of Digital EcoSystems

October 19 - 21, 2022

Venice, Italy

Acceptance Rates

Overall Acceptance Rate 267 of 682 submissions, 39%
