Data stream processing via code annotations

The Journal of Supercomputing

Abstract

Time-to-solution is an important metric when parallelizing existing code. The REPARA approach provides a systematic way to instantiate stream and data parallel patterns by annotating the sequential source code with C++11 attributes. Annotations are automatically transformed into target parallel code that uses existing libraries for parallel programming (e.g., FastFlow). In this paper, we apply this approach to the parallelization of a data stream processing application. The description shows the effectiveness of the approach in easily and quickly prototyping several parallel variants of the sequential code while obtaining good overall performance in terms of both throughput and latency.
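
To make the idea of annotating sequential code concrete, the following is a minimal sketch, not code taken from the article. It uses the standard C++11 attribute syntax on a loop over a stream of values. The attribute names `rpr::kernel` and `rpr::farm`, the function names, and the windowed-mean computation are illustrative assumptions; the exact attribute set is defined by the REPARA specification, which this page does not reproduce. Mainstream compilers typically ignore the unknown attributes (possibly with a warning), so the code still builds and runs sequentially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-item computation: mean of the last `window` values
// ending at position `end` (a shorter window is used at the start).
double window_mean(const std::vector<double>& prices,
                   std::size_t end, std::size_t window) {
    assert(window >= 1 && end < prices.size());
    const std::size_t begin = (end + 1 >= window) ? end + 1 - window : 0;
    double sum = 0.0;
    for (std::size_t i = begin; i <= end; ++i) {
        sum += prices[i];
    }
    return sum / static_cast<double>(end - begin + 1);
}

// Sequential stream-processing loop carrying an annotation.
// ASSUMPTION: `rpr::kernel` and `rpr::farm` are placeholder attribute
// names used for illustration only. Without a source-to-source tool that
// understands them, this loop runs sequentially, exactly as written.
std::vector<double> process_stream(const std::vector<double>& prices,
                                   std::size_t window) {
    std::vector<double> out(prices.size());
    [[rpr::kernel, rpr::farm]]
    for (std::size_t i = 0; i < prices.size(); ++i) {
        // Each iteration reads shared input and writes only out[i],
        // so iterations are independent of one another.
        out[i] = window_mean(prices, i, window);
    }
    return out;
}
```

Calling `process_stream({1.0, 2.0, 3.0, 4.0}, 2)` returns one windowed mean per input item. The point of the sketch is that the annotation leaves the sequential semantics untouched: removing the attribute line yields the same program, which is what makes it cheap to try several parallel variants.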

Notes

  1. At the time of writing, this phase is performed by hand and is not fully automated.

  2. REPARA imposes restrictions on the source code when targeting specific hardware.

  3. http://apps.jcns.fz-juelich.de/lmfit.

  4. The trades and quotes NASDAQ tracefile of 30 Oct 2014, downloadable at http://www.nyxdata.com.

Acknowledgments

This work was partially supported by the EU FP7 project REPARA (ICT-609666).

Author information

Corresponding author

Correspondence to Massimo Torquati.

About this article

Cite this article

Danelutto, M., De Matteis, T., Mencagli, G. et al. Data stream processing via code annotations. J Supercomput 74, 5659–5673 (2018). https://doi.org/10.1007/s11227-016-1793-9

  • DOI: https://doi.org/10.1007/s11227-016-1793-9
