Abstract
Dataflow programming consists of developing a program by describing its sequential stages and the interactions between them. The runtime systems that support this kind of programming are responsible for exploiting the parallelism, executing the different stages concurrently as soon as their dependencies are met. In this paper we introduce a new parallel programming model and framework based on the dataflow paradigm. It presents a new combination of features that allows programs to be easily mapped onto shared- or distributed-memory systems, exploiting data locality and affinity to obtain the same performance as optimized coarse-grain MPI programs. These features include: it is a single one-tier model that supports hybrid shared- and distributed-memory systems with the same abstractions; it can express arbitrarily linked activities, including non-nested cycles; it internally uses a distributed work-stealing mechanism to support Multiple-Producer/Multiple-Consumer configurations; and it provides a runtime mechanism for reconfiguring the dependences and communication channels, which also allows task-to-task data affinities to be created. We present an evaluation using examples from different classes of applications. Experimental results show that programs developed with this framework deliver good performance in hybrid distributed- and shared-memory environments, with a development effort similar to that of other dataflow programming models oriented to shared memory.
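To make the paradigm concrete, the following self-contained C++ sketch mimics the kind of program structure described above: independent stages connected by token channels, each stage firing as soon as a token arrives on its input. It is only an illustrative shared-memory analogue built with std::thread and a minimal blocking queue; the names used here (Channel, producer, square, consumer) are ours and do not reflect the actual HitFlow API.

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <optional>
    #include <queue>
    #include <thread>
    #include <utility>

    // Minimal blocking channel carrying tokens of type T.
    // An empty optional marks end-of-stream.
    template <typename T>
    class Channel {
    public:
        void put(std::optional<T> token) {
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(token)); }
            cv_.notify_one();
        }
        std::optional<T> get() {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return !q_.empty(); });
            auto token = std::move(q_.front());
            q_.pop();
            return token;
        }
    private:
        std::mutex m_;
        std::condition_variable cv_;
        std::queue<std::optional<T>> q_;
    };

    int main() {
        Channel<int> raw, squared;

        // Producer stage: emits tokens 1..10, then an end-of-stream marker.
        std::thread producer([&] {
            for (int i = 1; i <= 10; ++i) raw.put(i);
            raw.put(std::nullopt);
        });

        // Transformer stage: fires as soon as a token is available on its input channel.
        std::thread square([&] {
            while (auto t = raw.get()) squared.put(*t * *t);
            squared.put(std::nullopt);
        });

        // Consumer stage: accumulates the results.
        std::thread consumer([&] {
            long sum = 0;
            while (auto t = squared.get()) sum += *t;
            std::cout << "sum of squares = " << sum << "\n";
        });

        producer.join(); square.join(); consumer.join();
        return 0;
    }

In the hybrid model evaluated in the paper, such stages can be mapped onto different MPI processes or threads without changing their description, which is the point of the one-tier abstraction.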













Notes
In the current implementation, MPI communications are always used to move tokens from queue to queue, even when the queue objects are mapped into the same MPI process. Although this simplifies the implementation, and MPI communications are highly optimized in shared memory, this decision clearly leaves room for further optimization.
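As an illustration of this point, the following simplified MPI sketch always pushes a token through MPI calls, even when the producing and consuming queues are mapped to the same rank (a self-send); the shared-memory shortcut mentioned in the comments is the kind of further optimization the note refers to. The token layout and the structure of the example are ours, not the internals of the actual implementation.

    #include <mpi.h>
    #include <cstdio>
    #include <vector>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        // Token produced by a stage mapped onto this same process.
        std::vector<double> token = {1.0, 2.0, 3.0};

        // The transfer still goes through MPI, even though the source and
        // destination queues live in the same rank. A nonblocking send
        // avoids deadlocking on the self-send.
        MPI_Request req;
        MPI_Isend(token.data(), (int)token.size(), MPI_DOUBLE, rank, /*tag=*/0,
                  MPI_COMM_WORLD, &req);

        std::vector<double> received(token.size());
        MPI_Recv(received.data(), (int)received.size(), MPI_DOUBLE, rank, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Wait(&req, MPI_STATUS_IGNORE);

        // A shared-memory shortcut (the optimization hinted at above) would
        // detect the self-send case and copy the token directly into the
        // destination queue, bypassing the MPI calls entirely.
        if (rank == 0) std::printf("received %zu doubles via MPI self-send\n", received.size());

        MPI_Finalize();
        return 0;
    }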
Acknowledgements
This research has been partially supported by MICINN (Spain) and the ERDF program of the European Union, through the HomProg-HetSys project (TIN2014-58876-P), the PCAS project (TIN2017-88614-R), the CAPAP-H6 network (TIN2016-81840-REDT), and COST Program Action IC1305: Network for Sustainable Ultrascale Computing (NESUS); by the Junta de Castilla y León through project PROPHET (VA082P17); and by the computing facilities of the Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT), funded by the European Regional Development Fund (ERDF). CETA-CIEMAT belongs to CIEMAT and the Government of Spain.
Cite this article
Fresno, J., Barba, D., Gonzalez-Escribano, A. et al. HitFlow: A Dataflow Programming Model for Hybrid Distributed- and Shared-Memory Systems. Int J Parallel Prog 47, 3–23 (2019). https://doi.org/10.1007/s10766-018-0561-2