Abstract
Keeping input tasks and output results in order is often a challenge in parallel computations over data streams, particularly when stateless task operators are replicated to increase parallelism in the presence of irregular tasks. Maintaining input/output order requires additional coding effort and may significantly reduce the application’s actual throughput. We therefore propose a new implementation technique designed to be easily integrated with any existing C++ parallel programming framework that supports stream parallelism. In this paper, the technique is first implemented and studied using SPar, our high-level domain-specific language for stream parallelism. We discuss the results of a set of experiments with real-world applications, which reveal that significant performance improvements may be achieved when the proposed solution is integrated within SPar, especially for data compression applications. We also show the results of experiments performed after integrating our solution within FastFlow and TBB, which reveal no significant overheads.
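To make the ordering problem concrete, the following is a minimal sketch of the conventional approach: each stream item carries a sequence number, a stateless operator is replicated across worker threads, and a reorder buffer releases results only when the next expected sequence number has arrived. This is not the technique proposed in the paper, and it does not use the SPar, FastFlow, or TBB APIs. The names `ReorderBuffer` and `process_in_order`, the round-robin item assignment, and the squaring operator are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper (not part of SPar, FastFlow, or TBB): holds results
// that arrive early and releases them strictly in sequence-number order.
template <typename T>
class ReorderBuffer {
public:
    // Stores the result of item `seq`, then appends to `out` every result
    // that has become deliverable in order. `out` is only written while the
    // internal mutex is held, so concurrent callers may share it.
    void push(std::uint64_t seq, T value, std::vector<T>& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.emplace(seq, std::move(value));
        // Drain for as long as the next expected item is available.
        for (auto it = pending_.find(next_); it != pending_.end();
             it = pending_.find(next_)) {
            out.push_back(std::move(it->second));
            pending_.erase(it);
            ++next_;
        }
    }

private:
    std::mutex mutex_;
    std::map<std::uint64_t, T> pending_;  // results waiting for predecessors
    std::uint64_t next_ = 0;              // next sequence number to release
};

// Replicates a stateless operator (here: squaring, an arbitrary stand-in)
// over `workers` threads. Worker w handles items w, w + workers, ...
// Results may complete in any order; the buffer restores the input order.
std::vector<int> process_in_order(const std::vector<int>& input,
                                  unsigned workers) {
    ReorderBuffer<int> buffer;
    std::vector<int> output;
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            for (std::size_t i = w; i < input.size(); i += workers) {
                int result = input[i] * input[i];  // stateless operator
                buffer.push(i, result, output);
            }
        });
    }
    for (auto& t : pool) t.join();
    return output;
}
```

Calling `process_in_order(items, 3)` returns the squared items in input order regardless of how the worker threads interleave. The lock and the buffered results in this sketch illustrate the kind of additional coding effort and potential throughput cost that the abstract refers to.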

Notes
The SPar home page: https://gmap.pucrs.br/spar.
Used in continuous data stream processing to mark substream or stream items.
The K-slack algorithm deals with out-of-order data arrivals by delaying an event for at most K time units (K must be known in advance).
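As an illustration of the K-slack idea described in the note above, the following is a minimal sketch, assuming integer timestamps and a fixed K. An event is held back until the newest timestamp seen so far is at least K time units ahead of it, and buffered events are released in timestamp order. The names `Event` and `KSlackBuffer` are hypothetical and do not come from the paper or from any library.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <queue>
#include <vector>

// Hypothetical event type: a timestamp plus an arbitrary payload.
struct Event {
    std::int64_t ts;
    int payload;
};

// Minimal K-slack sketch: delays each event by at most K time units
// (measured against the newest timestamp seen) so that events arriving
// up to K units late can still be emitted in timestamp order.
class KSlackBuffer {
public:
    explicit KSlackBuffer(std::int64_t k) : k_(k) {}

    // Buffers `e` and returns, in timestamp order, all buffered events
    // whose timestamp is at least K older than the newest timestamp seen.
    // An event that arrives more than K units late is still returned,
    // but it may then be out of order with respect to earlier output.
    std::vector<Event> push(const Event& e) {
        heap_.push(e);
        max_ts_ = std::max(max_ts_, e.ts);
        std::vector<Event> ready;
        while (!heap_.empty() && heap_.top().ts <= max_ts_ - k_) {
            ready.push_back(heap_.top());
            heap_.pop();
        }
        return ready;
    }

    // Releases everything still buffered, e.g. at end of stream.
    std::vector<Event> flush() {
        std::vector<Event> rest;
        while (!heap_.empty()) {
            rest.push_back(heap_.top());
            heap_.pop();
        }
        return rest;
    }

private:
    // Orders the priority queue as a min-heap on the timestamp.
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            return a.ts > b.ts;
        }
    };

    std::int64_t k_;
    std::int64_t max_ts_ = std::numeric_limits<std::int64_t>::min();
    std::priority_queue<Event, std::vector<Event>, Later> heap_;
};
```

With K = 2, pushing events with timestamps 1, 3, 2, 5 and then calling `flush()` yields them in the order 1, 2, 3, 5. The requirement that K be known in advance shows up here as the constructor argument.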
Acknowledgements
The authors thank the Brazilian research institutions CAPES and FAPERGS for their partial financial support. This work also received partial financial support from the EU H2020-ICT-2014-1 project RePhrase (No. 644235).
Cite this article
Griebler, D., Hoffmann, R.B., Danelutto, M. et al. Stream parallelism with ordered data constraints on multi-core systems. J Supercomput 75, 4042–4061 (2019). https://doi.org/10.1007/s11227-018-2482-7