DOI: 10.1145/2612669.2612701

Concurrent data structures for efficient streaming aggregation

Published: 21 June 2014

Abstract

We briefly describe our study of streaming multiway aggregation, in which large data volumes are received from multiple input streams. Multiway aggregation is a fundamental computational component of data stream management systems and requires low-latency, high-throughput solutions. We focus on designing concurrent data structures that enable low-latency, high-throughput multiway aggregation, an issue that has been overlooked in the literature. We propose two new concurrent data structures and their lock-free, linearizable implementations, which support both order-sensitive and order-insensitive aggregate functions. Results from an extensive evaluation show significant improvements in aggregation performance, in terms of both processing throughput and latency, over the commonly used queue-based techniques.
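
As an illustration of the problem setting only, the following is a minimal single-threaded sketch of a queue-based multiway merge of the kind the abstract uses as its baseline. It is not the authors' data structure and involves no lock-free synchronization. The function name `merge_ready`, the `(timestamp, value)` tuple layout, and the assumption that each input stream is already sorted by timestamp are all hypothetical choices made for this example.

```python
from collections import deque


def merge_ready(queues):
    """Return the tuples that are safe to process, in timestamp order.

    Assumption: each queue holds (timestamp, value) pairs sorted by
    timestamp. A tuple is released only while every queue is non-empty,
    because an empty queue might still receive an earlier timestamp.
    """
    ready = []
    while all(queues):  # an empty deque is falsy
        # Pick the queue whose head has the smallest timestamp;
        # ties are broken by stream index to keep the order deterministic.
        i = min(range(len(queues)), key=lambda k: (queues[k][0][0], k))
        ready.append(queues[i].popleft())
    return ready


# Two hypothetical input streams of (timestamp, value) pairs.
inputs = [deque([(1, 10), (4, 40)]), deque([(2, 20), (3, 30)])]
ready = merge_ready(inputs)      # [(1, 10), (2, 20), (3, 30)]

total = sum(v for _, v in ready)  # order-insensitive aggregate: 60
last = ready[-1][1]               # order-sensitive aggregate: 30
```

The tuple `(4, 40)` stays queued because the second stream has run empty and could still deliver a tuple with a smaller timestamp. A sum gives the same result for any interleaving of the ready tuples, whereas "last value" depends on the merged order, which is the distinction between order-insensitive and order-sensitive aggregate functions.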

Published In

SPAA '14: Proceedings of the 26th ACM symposium on Parallelism in algorithms and architectures
June 2014
356 pages
ISBN:9781450328210
DOI:10.1145/2612669
  • General Chair: Guy Blelloch
  • Program Chair: Peter Sanders
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. data streaming
  2. data structures
  3. lock-free synchronization

Qualifiers

  • Abstract

Conference

SPAA '14

Acceptance Rates

SPAA '14 Paper Acceptance Rate: 30 of 122 submissions, 25%
Overall Acceptance Rate: 447 of 1,461 submissions, 31%

Cited By

  • (2021) ScaleJoin: A Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join. IEEE Transactions on Big Data 7:2 (299-312). DOI: 10.1109/TBDATA.2016.2624274. Online publication date: 1-Jun-2021
  • (2021) Shared-Memory Parallel Hash-Based Stream Join in Continuous Data Streams. Database and Expert Systems Applications (313-318). DOI: 10.1007/978-3-030-86475-0_30. Online publication date: 1-Sep-2021
  • (2018) Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing. Euro-Par 2017: Parallel Processing Workshops (129-140). DOI: 10.1007/978-3-319-75178-8_11. Online publication date: 8-Feb-2018
  • (2018) Power-aware pipelining with automatic concurrency control. Concurrency and Computation: Practice and Experience 31:5. DOI: 10.1002/cpe.4652. Online publication date: 14-Aug-2018
  • (2017) Maximizing Determinism in Stream Processing Under Latency Constraints. Proceedings of the 11th ACM International Conference on Distributed and Event-based Systems (112-123). DOI: 10.1145/3093742.3093921. Online publication date: 8-Jun-2017
  • (2016) New techniques to curtail the tail latency in stream processing systems. Proceedings of the 4th Workshop on Distributed Cloud Computing (1-6). DOI: 10.1145/2955193.2955206. Online publication date: 25-Jul-2016
  • (2016) Highly Concurrent Stream Synchronization in Many-core Embedded Systems. Proceedings of the Third ACM International Workshop on Many-core Embedded Systems (2-9). DOI: 10.1145/2934495.2934496. Online publication date: 19-Jun-2016
  • (2016) BES. Proceedings of the 2nd ACM International Workshop on Cyber-Physical System Security - CPSS '16 (59-69). DOI: 10.1145/2899015.2899021. Online publication date: 2016
  • (2016) A Systematic Methodology for Optimization of Applications Utilizing Concurrent Data Structures. IEEE Transactions on Computers 65:7 (2019-2031). DOI: 10.1109/TC.2015.2479604. Online publication date: 1-Jul-2016
  • (2016) Understanding the data-processing challenges in Intelligent Vehicular Systems. 2016 IEEE Intelligent Vehicles Symposium (IV) (611-618). DOI: 10.1109/IVS.2016.7535450. Online publication date: Jun-2016