Stream Benchmarks

Encyclopedia of Big Data Technologies

Synonyms

Stream performance evaluation

Definitions

Stream benchmarks provide performance evaluation techniques and define the related metrics for stream data processing systems.
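
As a rough illustration of the metrics such benchmarks define, the following Python sketch measures per-event end-to-end latency and overall throughput against a placeholder processing function. It is not drawn from any particular benchmark; the driver, event rates, and function names are illustrative assumptions standing in for a real load generator and system under test.

    import time
    import random

    def generate_events(n, rate_per_sec):
        """Yield (event_time, payload) tuples at roughly the requested input rate."""
        for i in range(n):
            yield (time.time(), {"id": i, "value": random.random()})
            time.sleep(1.0 / rate_per_sec)

    def run_benchmark(events, process):
        """Feed events to `process` and report latency and throughput metrics."""
        latencies = []
        start = time.time()
        count = 0
        for event_time, payload in events:
            process(payload)                            # stand-in for the system under test
            latencies.append(time.time() - event_time)  # end-to-end latency for this event
            count += 1
        elapsed = time.time() - start
        return {
            "throughput_events_per_sec": count / elapsed,
            "avg_latency_sec": sum(latencies) / len(latencies),
            "max_latency_sec": max(latencies),
        }

    if __name__ == "__main__":
        # No-op processing function; a real harness would submit events to a streaming engine.
        print(run_benchmark(generate_events(n=1000, rate_per_sec=500),
                            process=lambda payload: None))

Note that this simple closed-loop driver couples event generation to processing speed; production-grade stream benchmarks generally favor an open system model with an independent load generator to avoid coordinated-omission effects (cf. Schroeder et al. 2006; Friedrich et al. 2017).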

Overview

We first discuss background on database benchmarking and the foundations, main metrics, and main features of stream data benchmarking. We then survey related stream benchmarks and categorize them by application area.

Historical Background

Lee et al. (1997) carried out one of the first works in this area with MediaBench, a tool for evaluating and synthesizing multimedia and communications systems. Abadi et al. (2003) and Motwani et al. (2003) pioneered two of the first stream data processing systems, Aurora and STREAM, respectively. The need to compare the performance characteristics of streaming systems with one another and with alternative systems (e.g., relational databases) motivated the development of the Linear Road benchmark (Arasu et al. 2004).

Foundations

Stream data processing...

References

  • Abadi DJ, Carney D, Çetintemel U, Cherniack M, Convey C, Lee S, Stonebraker M, Tatbul N, Zdonik S (2003) Aurora: a new model and architecture for data stream management. VLDB J Int J Very Large Data Bases 12(2):120–139

  • Alevizos E, Artikis A (2014) Being logical or going with the flow? A comparison of complex event processing systems. In: SETN. Springer, pp 460–474

  • Ali MI, Gao F, Mileo A (2015) Citybench: a configurable benchmark to evaluate RSP engines using smart city datasets. In: International semantic web conference. Springer, pp 374–389

  • Arasu A, Cherniack M, Galvez E, Maier D, Maskey AS, Ryvkina E, Stonebraker M, Tibbetts R (2004) Linear road: a stream data management benchmark. In: Proceedings of the thirtieth international conference on very large data bases, vol 30. VLDB Endowment, pp 480–491

  • data Artisans (2017) Extending the Yahoo! streaming benchmark. https://data-artisans.com/blog/extending-the-yahoo-streaming-benchmark. Online: Accessed 1 Nov 2017

  • Berlin Buzzwords (2017) Nexmark: using apache beam to create a unified benchmarking suite. https://berlin-buzzwords.de/sites/berlinbuzzwords.de/files/media/documents/nexmark_suite_for_apache_beam_berlin_buzzwords_1.pdf. Online: Accessed 1 Nov 2017

  • Chauhan J, Chowdhury SA, Makaroff D (2012) Performance evaluation of yahoo! s4: a first look. In: 2012 seventh international conference on P2P, parallel, grid, cloud and internet computing (3PGCIC). IEEE, pp 58–65

  • Chintapalli S, Dagit D, Evans B, Farivar R, Graves T, Holderbaugh M, Liu Z, Nusbaum K, Patil K, Peng BJ et al (2016a) Benchmarking streaming computation engines: storm, flink and spark streaming. In: 2016 IEEE international parallel and distributed processing symposium workshops. IEEE, pp 1789–1792

  • Chintapalli S, Dagit D, Evans R, Farivar R, Liu Z, Nusbaum K, Patil K, Peng B (2016b) Pacemaker: when zookeeper arteries get clogged in storm clusters. In: 2016 IEEE 9th international conference on cloud computing (CLOUD). IEEE, pp 448–455

  • Dayarathna M, Suzumura T (2013) A performance analysis of system s, s4, and esper via two level benchmarking. In: International conference on quantitative evaluation of systems. Springer, pp 225–240

  • Dayarathna M, Li Y, Wen Y, Fan R (2017) Energy consumption analysis of data stream processing: a benchmarking approach. Softw Pract Exp 47(10): 1443–1462

  • DellAglio D, Calbimonte JP, Balduini M, Corcho O, Della Valle E (2013) On correctness in RDF stream processor benchmarking. In: International semantic web conference. Springer, pp 326–342

  • Friedrich S, Wingerath W, Ritter N (2017) Coordinated omission in NoSQL database benchmarking. In: BTW (Workshops), pp 215–225

  • Gama J, Sebastião R, Rodrigues PP (2009) Issues in evaluation of stream learning algorithms. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 329–338

  • Gama J, Sebastião R, Rodrigues PP (2013) On evaluating stream learning algorithms. Mach Learn 90(3): 317–346

  • Gradvohl ALS, Senger H, Arantes L, Sens P (2014) Comparing distributed online stream processing systems considering fault tolerance issues. J Emerg Technol Web Intell 6(2):174–179

  • Hochreiner C (2017) Visp testbed-a toolkit for modeling and evaluating resource provisioning algorithms for stream processing applications. Strategies 1(6):9–17

  • Imai S, Patterson S, Varela CA (2017) Maximum sustainable throughput prediction for data stream processing over public clouds. In: Proceedings of the 17th IEEE/ACM international symposium on cluster, cloud and grid computing. IEEE Press, pp 504–513

  • Jain N, Amini L, Andrade H, King R, Park Y, Selo P, Venkatramani C (2006) Design, implementation, and evaluation of the linear road benchmark on the stream processing core. In: Proceedings of the 2006 ACM SIGMOD international conference on Management of data. ACM, pp 431–442

  • Karimov J, Rabl T, Katsifodimos A, Samarev R, Heiskanen H, Markl V (2018) Benchmarking distributed stream data processing systems. In: ICDE

  • Kipf A, Pandey V, Böttcher J, Braun L, Neumann T, Kemper A (2017) Analytics on fast data: main-memory database systems versus modern streaming systems. In: EDBT, pp 49–60

  • Le-Phuoc D, Dao-Tran M, Pham MD, Boncz P, Eiter T, Fink M (2012) Linked stream data processing engines: facts and figures. In: The semantic Web–ISWC 2012, pp 300–312

  • Lee C, Potkonjak M, Mangione-Smith WH (1997) Mediabench: a tool for evaluating and synthesizing multimedia and communications systems. In: Proceedings of the 30th annual ACM/IEEE international symposium on microarchitecture. IEEE Computer Society, pp 330–335

  • Lopez MA, Lobato AGP, Duarte OCM (2016a) A performance comparison of open-source stream processing platforms. In: 2016 IEEE global communications conference (GLOBECOM). IEEE, pp 1–6

  • Lopez MA, Lobato AGP, Duarte OCM, Pujolle G (2016b) Design and performance evaluation of a virtualized network function for real-time threat detection using stream processing

  • Lu R, Wu G, Xie B, Hu J (2014) Stream bench: towards benchmarking modern distributed stream computing frameworks. In: 2014 IEEE/ACM 7th international conference on utility and cloud computing (UCC). IEEE, pp 69–78

  • Mendes MR, Bizarro P, Marques P (2009) A performance study of event processing systems. In: Technology conference on performance evaluation and benchmarking. Springer, pp 221–236

  • Mendes M, Bizarro P, Marques P (2013) Towards a standard event processing benchmark. In: Proceedings of the 4th ACM/SPEC international conference on performance engineering. ACM, pp 307–310

  • Mohamed S, Forshaw M, Thomas N, Dinn A (2017) Performance and dependability evaluation of distributed event-based systems: a dynamic code-injection approach. In: Proceedings of the 8th ACM/SPEC on international conference on performance engineering. ACM, pp 349–352

  • Motwani R, Widom J, Arasu A, Babcock B, Babu S, Datar M, Manku G, Olston C, Rosenstein J, Varma R (2003) Query processing, resource management, and approximation in a data stream management system. In: CIDR

  • Nazhandali L, Minuth M, Austin T (2005) Sensebench: toward an accurate evaluation of sensor network processors. In: Proceedings of the IEEE international workload characterization symposium, 2005. IEEE, pp 197–203

  • Perera S, Perera A, Hakimzadeh K (2016) Reproducible experiments for comparing apache flink and apache spark on public clouds. arXiv preprint arXiv:1610.04493

  • Qian S, Wu G, Huang J, Das T (2016) Benchmarking modern distributed streaming platforms. In: 2016 IEEE international conference on industrial technology (ICIT). IEEE, pp 592–598

  • Samosir J, Indrawan-Santiago M, Haghighi PD (2016) An evaluation of data stream processing systems for data driven applications. Proc Comput Sci 80:439–449

  • Schmidt AR, Waas F, Kersten ML, Florescu D, Manolescu I, Carey MJ, Busse R (2001) The XML benchmark project

  • Schroeder B, Wierman A, Harchol-Balter M (2006) Open versus closed: a cautionary tale. In: NSDI, vol 6, pp 18–18

  • Shukla A, Simmhan Y (2016) Benchmarking distributed stream processing platforms for IoT applications. In: Technology conference on performance evaluation and benchmarking. Springer, pp 90–106

  • Shukla A, Chaturvedi S, Simmhan Y (2017) Riotbench: a real-time IoT benchmark for distributed stream processing platforms. arXiv preprint arXiv:1701.08530

  • Teng M, Sun Q, Deng B, Sun L, Qin X (2017) A tool of benchmarking realtime analysis for massive behavior data. In: Asia-Pacific web (APWeb) and web-age information management (WAIM) joint conference on web and big data. Springer, pp 345–348

  • Trivedi A, Stuedi P, Pfefferle J, Stoica R, Metzler B, Koltsidas I, Ioannou N (2016) On the [ir]relevance of network performance for data processing. Network 40:60

  • Tucker P, Tufte K, Papadimos V, Maier D (2008) Nexmark–a benchmark for queries over data streams (draft). Technical report, OGI School of Science & Engineering at OHSU

  • Wolf T, Franklin M (2000) Commbench-a telecommunications benchmark for network processors. In: 2000 IEEE international symposium on performance analysis of systems and software (ISPASS). IEEE, pp 154–162

  • Zhang Y, Duc PM, Corcho O, Calbimonte JP (2012) Srbench: a streaming RDF/SPARQL benchmark. In: International semantic web conference. Springer, pp 641–657

  • Zhang S, He B, Dahlmeier D, Zhou AC, Heinze T (2017) Revisiting the design of data stream processing systems on multi-core processors. In: 2017 IEEE 33rd international conference on data engineering (ICDE). IEEE, pp 659–670

Author information

Correspondence to Jeyhun Karimov.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this entry

Cite this entry

Karimov, J. (2018). Stream Benchmarks. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_299-1

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_299-1

  • Published: 27 March 2018

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering

Chapter history

  1. Latest: Stream Benchmarks. Published: 24 May 2022. DOI: https://doi.org/10.1007/978-3-319-63962-8_299-2

  2. Original: Stream Benchmarks. Published: 27 March 2018. DOI: https://doi.org/10.1007/978-3-319-63962-8_299-1