Abstract
To balance the requirements for data consistency and availability, organisations increasingly migrate towards hybrid data persistence architectures (called polystores throughout this paper) comprising both relational and NoSQL databases. The EC-funded H2020 TYPHON project offers facilities for designing and deploying such polystores, which is otherwise a complex, technically challenging and error-prone task. In addition, it is increasingly important for organisations to be able to extract business intelligence by monitoring data stored in polystores. In this paper, we propose a novel approach that facilitates the extraction of analytics in a distributed manner by monitoring polystore queries as they arrive for execution. Beyond the analytics architecture, we present a pre-execution authorisation mechanism. We also report on preliminary scalability evaluation experiments, which demonstrate the linear scalability of the proposed architecture.
Notes
- 1. An example TyphonQL “select” query: from User u select u.age where u.id == 1.
- 2.
- 3.
- 4. AMD Opteron(tm) Processor 4226, 6 cores @ 2.7 GHz, \(4 \times 16\) GB DDR3 1066 MHz RAM.
Acknowledgements
This work is funded by the European Union Horizon 2020 TYPHON project (#780251).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zolotas, A., Barmpis, K., Medhat, F., Neubauer, P., Kolovos, D., Paige, R.F. (2021). An Architecture for the Development of Distributed Analytics Based on Polystore Events. In: Gadepally, V., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2020, Poly 2020. Lecture Notes in Computer Science, vol 12633. Springer, Cham. https://doi.org/10.1007/978-3-030-71055-2_5
DOI: https://doi.org/10.1007/978-3-030-71055-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71054-5
Online ISBN: 978-3-030-71055-2
eBook Packages: Computer Science, Computer Science (R0)