Sliding-Window Aggregation Algorithms

Encyclopedia of Big Data Technologies

Abstract

Sliding-window aggregation summarizes a collection of recent streaming data, capturing the most recent happenings as well as some history. Algorithms for this problem are required to maintain an aggregate value as new data items are inserted into the window when they arrive, and old data items are evicted from the window when they expire. Supporting this efficiently poses algorithmic challenges, especially for non-invertible aggregation functions such as max, for which there is no way to “subtract off” expiring items. This chapter provides a brief overview of this area of research and explores a number of sliding-window aggregation algorithms, including both simple and sophisticated algorithms. Real-world use cases are also given to showcase problem scenarios where sliding-window aggregation can be applicable.
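To make the insert/evict/query interface and the non-invertibility problem concrete, the following is a minimal sketch of the well-known two-stacks technique applied to max over a first-in, first-out window. It is an illustration under stated assumptions, not necessarily one of the algorithms the chapter surveys; the names SlidingMax and windowed_max are hypothetical. Each stack entry caches the max of itself and everything beneath it, so no expired item ever needs to be "subtracted off".

```python
class SlidingMax:
    """FIFO window reporting max without subtracting expired items.

    Hypothetical illustration of the two-stacks technique.
    """

    def __init__(self):
        # Each entry is (value, max of this entry and everything below it).
        self._front = []  # oldest item on top; evictions pop from here
        self._back = []   # newest item on top; insertions push here

    @staticmethod
    def _push(stack, value):
        running = value if not stack else max(value, stack[-1][1])
        stack.append((value, running))

    def insert(self, value):
        self._push(self._back, value)

    def evict(self):
        if not self._front:
            # Flip the back stack onto the front stack, recomputing the
            # cached maxima; this costs O(n) once per n evictions.
            while self._back:
                value, _ = self._back.pop()
                self._push(self._front, value)
        if not self._front:
            raise IndexError("evict from empty window")
        return self._front.pop()[0]

    def query(self):
        tops = [s[-1][1] for s in (self._front, self._back) if s]
        if not tops:
            raise ValueError("query on empty window")
        return max(tops)

    def __len__(self):
        return len(self._front) + len(self._back)


def windowed_max(stream, size):
    """Return the max of the last `size` items after each arrival."""
    window = SlidingMax()
    out = []
    for item in stream:
        window.insert(item)
        if len(window) > size:
            window.evict()
        out.append(window.query())
    return out
```

In this sketch, insert and query take constant time and evict takes amortized constant time, although a single evict can take time linear in the window size when the stacks are flipped.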

Corresponding author

Correspondence to Kanat Tangwongsan.

Copyright information

© 2018 Springer International Publishing AG

About this entry

Cite this entry

Tangwongsan, K., Hirzel, M., Schneider, S. (2018). Sliding-Window Aggregation Algorithms. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_157-1

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_157-1

  • Published: 05 February 2018

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    Sliding-Window Aggregation Algorithms
    Published: 17 March 2022

    DOI: https://doi.org/10.1007/978-3-319-63962-8_157-2

  2. Original

    Sliding-Window Aggregation Algorithms
    Published: 05 February 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_157-1