ChronicleDB: A High-Performance Event Store

Published: 15 October 2019

Abstract

Reactive security monitoring, self-driving cars, the Internet of Things (IoT), and many other novel applications require systems that can both write events arriving at very high and fluctuating rates to persistent storage and support analytical ad hoc queries. Because standard database systems cannot deliver the required write performance, log-based systems, key-value stores, and other write-optimized data stores have emerged recently. However, these systems offer only fair query performance and lack suitable instant recovery mechanisms in case of system failures.
In this article, we present ChronicleDB, a novel database system with a storage layout tailored for high write performance under fluctuating data rates and with powerful indexing capabilities to support a variety of queries. In addition, ChronicleDB offers low-cost fault tolerance and instant recovery within milliseconds. Unlike previous work, ChronicleDB can be used either as a serverless library tightly integrated into an application or as a standalone database server. An experimental evaluation with real and synthetic data shows that ChronicleDB clearly outperforms competing systems in both write and query performance.

Published In

ACM Transactions on Database Systems, Volume 44, Issue 4
Best of EDBT 2017, Best of EDBT 2018, Best of ICDT 2018 and Regular Papers
December 2019
249 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/3366712

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 October 2019
Accepted: 01 May 2019
Revised: 01 March 2019
Received: 01 December 2017
Published in TODS Volume 44, Issue 4

Author Tags

  1. Event processing
  2. aggregation queries
  3. indexing
  4. recovery
  5. storage layout
  6. time travel queries

Qualifiers

  • Research-article
  • Research
  • Refereed
