An Architecture for the Development of Distributed Analytics Based on Polystore Events

  • Conference paper
Heterogeneous Data Management, Polystores, and Analytics for Healthcare (DMAH 2020, Poly 2020)

Abstract

To balance the requirements for data consistency and availability, organisations increasingly migrate towards hybrid data persistence architectures (called polystores throughout this paper) comprising both relational and NoSQL databases. The EC-funded H2020 TYPHON project offers facilities for designing and deploying such polystores, a task that is otherwise complex, technically challenging and error-prone. It is also increasingly important for organisations to be able to extract business intelligence by monitoring the data stored in polystores. In this paper, we propose a novel approach that facilitates the extraction of analytics in a distributed manner by monitoring polystore queries as they arrive for execution. Beyond the analytics architecture, we present a pre-execution authorisation mechanism. We also report on preliminary scalability evaluation experiments, which demonstrate the linear scalability of the proposed architecture.
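The idea of inspecting queries as they arrive, and approving or rejecting them before execution, can be illustrated with a minimal sketch. It is not the TYPHON implementation: the names `QueryEvent`, `authorise`, `process` and `no_deletes` are hypothetical, the "delete" query string is illustrative only, and a plain in-memory list stands in for the event stream.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class QueryEvent:
    """Hypothetical event emitted when a query arrives at the polystore."""
    user: str
    query: str


# A rule inspects an event and returns True if the query may run.
Rule = Callable[[QueryEvent], bool]


def authorise(event: QueryEvent, rules: List[Rule]) -> bool:
    """Approve the event only if every rule accepts it."""
    return all(rule(event) for rule in rules)


def process(
    events: List[QueryEvent], rules: List[Rule]
) -> Tuple[List[QueryEvent], List[QueryEvent], Counter]:
    """Split events into approved/rejected and keep simple counts as analytics."""
    approved: List[QueryEvent] = []
    rejected: List[QueryEvent] = []
    stats: Counter = Counter()
    for event in events:
        if authorise(event, rules):
            approved.append(event)
            stats["approved"] += 1
        else:
            rejected.append(event)
            stats["rejected"] += 1
    return approved, rejected, stats


def no_deletes(event: QueryEvent) -> bool:
    """Assumed example policy: reject any query that starts with 'delete'."""
    return not event.query.strip().lower().startswith("delete")


events = [
    QueryEvent("alice", "from User u select u.age where u.id == 1"),
    QueryEvent("bob", "delete User u where u.id == 1"),  # illustrative query text
]
approved, rejected, stats = process(events, [no_deletes])
```

In this sketch only approved events would be forwarded for execution, while the counters show how analytics could be derived from the same event stream.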

Notes

  1. An example TyphonQL “select” query: from User u select u.age where u.id == 1.

  2. https://hub.docker.com/r/wurstmeister/zookeeper/.

  3. https://hub.docker.com/r/wurstmeister/kafka/.

  4. AMD Opteron(tm) Processor 4226 (6 cores @ 2.7 GHz), \(4 \times 16\) GB DDR3 1066 MHz RAM.

Acknowledgements

This work is funded by the European Union Horizon 2020 TYPHON project (#780251).

Author information

Corresponding author

Correspondence to Athanasios Zolotas.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zolotas, A., Barmpis, K., Medhat, F., Neubauer, P., Kolovos, D., Paige, R.F. (2021). An Architecture for the Development of Distributed Analytics Based on Polystore Events. In: Gadepally, V., et al. Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2020, Poly 2020. Lecture Notes in Computer Science, vol 12633. Springer, Cham. https://doi.org/10.1007/978-3-030-71055-2_5

  • DOI: https://doi.org/10.1007/978-3-030-71055-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-71054-5

  • Online ISBN: 978-3-030-71055-2

  • eBook Packages: Computer Science, Computer Science (R0)
