Abstract
Multi-core processors and clusters of multi-core processors are ubiquitous. They provide scalable performance, yet they introduce complex, low-level programming models for shared and distributed memory programming. Fully exploiting the potential of shared and distributed memory parallelization can therefore be a tedious and error-prone task: programmers must take care of low-level threading and communication (e.g., message passing) details. To assist programmers in developing performant and reliable parallel applications, Algorithmic Skeletons have been proposed. They encapsulate well-defined, frequently recurring parallel and distributed programming patterns, thus shielding programmers from the low-level aspects of parallel and distributed programming. In this paper we take on the design and implementation of the well-known Farm skeleton. To address the hybrid architecture of multi-core clusters, we present a two-tier implementation built on top of MPI and OpenMP. On the basis of three benchmark applications (a simple ray tracer, an interacting particles system, and an application for calculating the Mandelbrot set), we illustrate the advantages of both skeletal programming in general and this two-tier approach in particular.
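To make the farm idea concrete, the following is a minimal sketch of the shared-memory tier only, not the implementation presented in the paper. It assumes a farm that applies a user-supplied functor to each task independently and distributes the tasks over threads with an OpenMP loop. The names `farm`, `Point`, and `MandelbrotWorker` are hypothetical and chosen for illustration; the MPI tier that would distribute tasks across cluster nodes is omitted.

```cpp
#include <vector>

// Hypothetical farm sketch: applies `worker` to every task independently.
// The OpenMP pragma spreads the loop iterations over the available threads;
// when compiled without OpenMP support the pragma is ignored and the loop
// runs sequentially with the same result.
template <typename In, typename Worker>
auto farm(const std::vector<In>& tasks, Worker worker)
    -> std::vector<decltype(worker(tasks[0]))> {
    std::vector<decltype(worker(tasks[0]))> results(tasks.size());
    const long n = static_cast<long>(tasks.size());
    // Dynamic scheduling is an assumption: it helps when tasks differ in cost.
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i) {
        results[i] = worker(tasks[i]);
    }
    return results;
}

// Hypothetical task type: a point c in the complex plane.
struct Point {
    double re;
    double im;
};

// Hypothetical worker functor: Mandelbrot escape-time iteration count.
struct MandelbrotWorker {
    int max_iter;

    int operator()(const Point& c) const {
        double zr = 0.0, zi = 0.0;
        int it = 0;
        // Iterate z = z^2 + c until |z|^2 exceeds 4 or the limit is reached.
        while (it < max_iter && zr * zr + zi * zi <= 4.0) {
            const double t = zr * zr - zi * zi + c.re;
            zi = 2.0 * zr * zi + c.im;
            zr = t;
            ++it;
        }
        return it;
    }
};
```

A caller would build a `std::vector<Point>` of tasks and invoke `farm(tasks, MandelbrotWorker{100})`. The user code contains no threading or message-passing details, which is the kind of shielding the abstract describes.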
Notes
Due to memory restrictions, GPU-enabled skeletons must be provided with functors as arguments.
Cite this article
Ernsting, S., Kuchen, H. A Scalable Farm Skeleton for Hybrid Parallel and Distributed Programming. Int J Parallel Prog 42, 968–987 (2014). https://doi.org/10.1007/s10766-013-0269-2
DOI: https://doi.org/10.1007/s10766-013-0269-2