DOI: 10.1145/3508397.3564836
research-article

A Parallel Processing Architecture to Optimize Runtime in Aggregated SPARQL Queries

Published: 08 December 2022

Abstract

Searching for information has become an essential need, and the information sought often cannot be found in a single data source: answering a query may require collecting parts of the answer from several distributed data sources. Our work aims to build an aggregated search engine that answers a query by collecting data from independent data sources through a single user interface. Query processing in our system goes through several steps before the final answers are returned. Processing speed is one of the main qualities of any search engine, and it can degrade when the engine interacts with several data sources, as ours does. In this paper, we propose a solution to optimize runtime in our aggregated search system. First, we evaluate the runtime of each processing step to identify the costliest one in terms of execution time. Then, we propose a parallel processing architecture that optimizes runtime without any data loss. The experimental results confirm the efficiency of the proposed architecture.
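
The abstract does not detail the internals of the architecture, so the following is only a minimal sketch of the general idea, under stated assumptions: sub-queries to independent sources are network-bound, so they can run concurrently in a thread pool and be merged afterwards without dropping any partial result. The names `SOURCES`, `fetch`, `aggregated_search`, and `sequential_search` are hypothetical, and the in-memory lambdas stand in for real SPARQL endpoints.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for independent data sources: each maps a query
# string to a list of partial answers. A real system would send SPARQL
# requests to remote endpoints here instead.
SOURCES = {
    "source_a": lambda query: [("alice", "name")],
    "source_b": lambda query: [("alice", "email")],
    "source_c": lambda query: [("bob", "name")],
}


def fetch(source_name, query, delay=0.05):
    """Simulate one network-bound sub-query against a single source."""
    time.sleep(delay)  # placeholder for network latency (assumption)
    return source_name, SOURCES[source_name](query)


def aggregated_search(query, max_workers=None):
    """Run all sub-queries concurrently, then merge every partial result."""
    names = list(SOURCES)
    with ThreadPoolExecutor(max_workers=max_workers or len(names)) as pool:
        # map() yields results in input order, so the merge is deterministic.
        partials = list(pool.map(lambda name: fetch(name, query), names))
    merged = []
    for _, rows in partials:
        merged.extend(rows)  # keep every row: no partial result is dropped
    return merged


def sequential_search(query):
    """Baseline: query the sources one after another."""
    merged = []
    for name in SOURCES:
        merged.extend(fetch(name, query)[1])
    return merged
```

In this sketch, the parallel and sequential versions return the same merged rows; only the waiting time differs, since the simulated latencies overlap in the thread pool instead of adding up.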


    Published In

    MEDES '22: Proceedings of the 14th International Conference on Management of Digital EcoSystems
    October 2022
    172 pages
    ISBN: 9781450392198
    DOI: 10.1145/3508397

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SPARQL
    2. aggregated search
    3. multithreading
    4. parallel processing
    5. runtime optimization

    Qualifiers

    • Research-article

    Conference

    MEDES '22

    Acceptance Rates

    Overall Acceptance Rate 267 of 682 submissions, 39%
