DSParLib: A C++ Template Library for Distributed Stream Parallelism

International Journal of Parallel Programming

Abstract

Stream processing applications deal with millions of data items continuously generated over time. These items must often be processed in real time, and the applications must scale in performance, which requires distributed parallel computing resources. In C/C++, the state of the art for distributed architectures and High-Performance Computing is the Message Passing Interface (MPI). However, exploiting stream parallelism with MPI is complex and error-prone because it exposes many low-level details to the programmer. In this work, we introduce DSParLib, a new parallel programming abstraction for implementing distributed stream parallelism. Built as an abstraction over MPI, it simplifies parallel programming by providing pattern-based, building-block-oriented development to model, interconnect, and parallelize the data streams found in modern applications. Experiments with five different stream processing applications and the representative PARSEC Ferret benchmark showed that DSParLib is efficient and flexible. Compared with handwritten MPI programs, DSParLib achieved similar or better performance, required less code, and provided simpler abstractions for expressing parallelism.

Notes

  1. Available at: https://github.com/GMAP/DSParLib.

Acknowledgements

We would like to acknowledge the support of LAD-PUCRS, the GMAP research group, and PUCRS university. This research is partially funded by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, FAPERGS 05/2019-PQG project ParAS (No 19/2551-0001895-9), FAPERGS 10/2020-ARD project SPar4.0 (No 21/2551-0000725-7), Universal MCTIC/CNPq call 28/2018 project SParCloud (No 437693/2018-0), and MCTIC/CNPq call 25/2020 (No 130484/2021-0).

Author information

Corresponding author

Correspondence to Dalvan Griebler.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Löff, J., Hoffmann, R.B., Pieper, R. et al. DSParLib: A C++ Template Library for Distributed Stream Parallelism. Int J Parallel Prog 50, 454–485 (2022). https://doi.org/10.1007/s10766-022-00737-2

  • DOI: https://doi.org/10.1007/s10766-022-00737-2
