SPECTRA: Continuous Query Processing for RDF Graph Streams Over Sliding Windows

Authors:
Syed Gillani
Univ Lyon, UJM Saint-Étienne, CNRS, Laboratoire Hubert Curien, Saint-Étienne, France

Gauthier Picard
Univ Lyon, MINES Saint-Étienne, CNRS, Laboratoire Hubert Curien, Saint-Étienne, France

Frédérique Laforest
Univ Lyon, UJM Saint-Étienne, CNRS, Laboratoire Hubert Curien, Saint-Étienne, France
SSDBM '16: Proceedings of the 28th International Conference on Scientific and Statistical Database Management, July 2016, Article No. 17, Pages 1–12, https://doi.org/10.1145/2949689.2949701

Published: 18 July 2016

ABSTRACT

This paper proposes a new approach for the incremental evaluation of RDF graph streams over sliding windows. Our system, called "SPECTRA", combines a novel form of RDF graph summarisation, a new incremental evaluation method and adaptive indexing techniques. We materialise the summarised graph from each event using vertically partitioned views to facilitate fast hash joins for all types of queries. Our incremental and adaptive indexing is a byproduct of query processing, and thus provides considerable advantages over offline and online indexing. Furthermore, contrary to existing approaches, we evaluate the triples within a window incrementally. This considerably reduces response time and avoids the unnecessary cost that recomputation models incur for each triple insertion and eviction within a defined window. We show that the resulting system copes with complex queries and datasets with clear benefits. Our experimental results on both synthetic and real-world datasets show performance improvements of up to an order of magnitude over state-of-the-art systems.
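
The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of incremental evaluation over a sliding window, not SPECTRA's actual method. It assumes a count-based window, a fixed two-pattern query `?x P1 ?y . ?y P2 ?z`, and a stream with no duplicate triples inside one window. The class name `IncrementalPathJoin` and all of its methods are hypothetical. Graph summarisation and adaptive indexing are left out. Each predicate gets its own hash view keyed on the join variable, which loosely mirrors vertical partitioning, and every insertion or eviction only touches the bindings that involve the affected triple.

```python
from collections import defaultdict, deque


class IncrementalPathJoin:
    """Hypothetical sketch: maintains results of  ?x P1 ?y . ?y P2 ?z
    over a count-based sliding window without recomputing the join."""

    def __init__(self, p1, p2, window_size):
        self.p1, self.p2 = p1, p2
        self.window_size = window_size
        self.window = deque()                   # triples in arrival order
        self.p1_by_object = defaultdict(set)    # view for P1, hashed on ?y
        self.p2_by_subject = defaultdict(set)   # view for P2, hashed on ?y
        self.results = set()                    # current (x, y, z) bindings

    def _add(self, s, p, o):
        if p == self.p1:
            self.p1_by_object[o].add(s)
        if p == self.p2:
            self.p2_by_subject[s].add(o)

    def _remove(self, s, p, o):
        if p == self.p1:
            self.p1_by_object[o].discard(s)
        if p == self.p2:
            self.p2_by_subject[s].discard(o)

    def _delta(self, s, p, o):
        """Bindings that involve the single triple (s, p, o)."""
        delta = set()
        if p == self.p1:   # triple plays the role ?x P1 ?y, probe the P2 view
            delta |= {(s, o, z) for z in self.p2_by_subject.get(o, ())}
        if p == self.p2:   # triple plays the role ?y P2 ?z, probe the P1 view
            delta |= {(x, s, o) for x in self.p1_by_object.get(s, ())}
        return delta

    def insert(self, s, p, o):
        """Slide the window by one triple; return (added, removed) bindings.
        Assumes the window never holds the same triple twice."""
        removed = set()
        if len(self.window) == self.window_size:
            old = self.window.popleft()
            removed = self._delta(*old)   # computed while `old` is still indexed
            self.results -= removed
            self._remove(*old)
        self.window.append((s, p, o))
        self._add(s, p, o)
        added = self._delta(s, p, o)
        self.results |= added
        return added, removed


join = IncrementalPathJoin("knows", "worksAt", window_size=3)
join.insert("alice", "knows", "bob")
join.insert("bob", "worksAt", "acme")
join.insert("carol", "knows", "bob")
added, removed = join.insert("dave", "knows", "erin")  # evicts the oldest triple
print(sorted(join.results), sorted(removed))
```

In this sketch the cost of one window slide depends on the number of bindings that involve the inserted or evicted triple, not on the size of the window, which is the contrast with recomputation that the abstract draws.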

Index Terms
1. Information systems
  1. Data management systems
    1. Database management system engines
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory

Published in
SSDBM '16: Proceedings of the 28th International Conference on Scientific and Statistical Database Management
July 2016
290 pages
ISBN: 9781450342155
DOI: 10.1145/2949689
Editors: Peter Baumann, Ioana Manolescu-Goujot, Luca Trani, Yannis Ioannidis, Gergely Gábor Barnaföldi, László Dobos, Evelin Bányai
Copyright © 2016 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Incremental Stream Processing
RDF Graphs
Sliding Windows
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 56 of 146 submissions, 38%