Journals & Magazines >IEEE Transactions on Parallel... >Volume: 31 Issue: 8

Evaluation of Stream Processing Frameworks

Download PDF
Download References
Request Permissions
Save to
Alerts

Abstract:

The increasing need for real-time insights in data sparked the development of multiple stream processing frameworks. Several benchmarking studies were conducted in an eff...Show More

Metadata

Abstract:

The increasing need for real-time insights in data sparked the development of multiple stream processing frameworks. Several benchmarking studies were conducted in an effort to form guidelines for identifying the most appropriate framework for a use case. In this article, we extend this research and present the results gathered. In addition to Spark Streaming and Flink, we also include the emerging frameworks Structured Streaming and Kafka Streams. We define four workloads with custom parameter tuning. Each of these is optimized for a certain metric or for measuring performance under specific scenarios such as bursty workloads. We analyze the relationship between latency, throughput, and resource consumption and we measure the performance impact of adding different common operations to the pipeline. To ensure correct latency measurements, we use a single Kafka broker. Our results show that the latency disadvantages of using a micro-batch system are most apparent for stateless operations. With more complex pipelines, customized implementations can give event-driven frameworks a large latency advantage. Due to its micro-batch architecture, Structured Streaming can handle very high throughput at the cost of high latency. Under tight latency SLAs, Flink sustains the highest throughput. Additionally, Flink shows the least performance degradation when confronted with periodic bursts of data. When a burst of data needs to be processed right after startup, however, micro-batch systems catch up faster while event-driven systems output the first events sooner.

Published in: IEEE Transactions on Parallel and Distributed Systems ( Volume: 31, Issue: 8, 01 August 2020)

Page(s): 1845 - 1858

Date of Publication: 05 March 2020

ISSN Information:

DOI: 10.1109/TPDS.2020.2978480

Contents

References is not available for this document.

Evaluation of Stream Processing Frameworks

Abstract:

Metadata

Abstract:

ISSN Information:

References

IEEE Account

Purchase Details

Profile Information

Need Help?

Evaluation of Stream Processing Frameworks

Alerts

Abstract:

Metadata

Abstract:

ISSN Information:

References

IEEE Account

Purchase Details

Profile Information

Need Help?