
Dynamic and Speculative Polyhedral Parallelization Using Compiler-Generated Skeletons

International Journal of Parallel Programming

Abstract

We propose a framework for the speculative parallelization of scientific nested-loop kernels, based on an original generation and use of algorithmic skeletons, which applies polyhedral transformations to the target code at run time in order to expose parallelism and data locality. Parallel code generation incurs almost no cost: binary algorithmic skeletons generated at compile time embed the original code together with the operations needed to instantiate a polyhedral parallelizing transformation and to verify the speculated dependences. The skeletons are patched at run time to produce the executable code. The run-time system selects a transformation using online profiling phases over short samples, executed with an instrumented version of the code. During these phases, the accessed memory addresses are used to compute dependence distance vectors on the fly, and are also interpolated to build a predictor of the forthcoming accesses. The interpolating functions and distance vectors then feed a dependence analysis that selects a parallelizing transformation which, if the prediction is correct, induces no rollback during execution. To keep the rollback overhead low, the code is executed in successive slices of the outermost loop of the original nest. Each slice runs either a parallel version that instantiates a skeleton, the original sequential version, or an instrumented version. Moreover, this slicing of the execution makes it possible to transform the code differently across observed execution phases, by patching one of the pre-built skeletons in a different way. The framework has been implemented as extensions of the LLVM compiler together with an x86-64 runtime system. Significant speed-ups are shown on a set of benchmarks that could not have been handled efficiently by a static compiler.
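To make the execution model concrete, the following C sketch illustrates the slicing scheme described in the abstract, assuming a chunk-level runtime driver: each slice of the outermost loop runs either instrumented (to profile and predict accesses), in parallel through a patched skeleton, or as the original sequential code, and a mispredicted parallel slice is rolled back and replayed sequentially. All identifiers below are hypothetical stand-ins for the real runtime machinery, not the framework's actual interface.

/* Minimal, hypothetical sketch of the sliced speculative execution described
   above. The runtime entry points declared below are assumed and stand in
   for the actual LLVM/runtime components of the framework. */
#include <stdbool.h>

enum version { INSTRUMENTED, PARALLEL_SKELETON, ORIGINAL_SEQUENTIAL };

/* Assumed runtime services (declarations only, provided elsewhere). */
void run_instrumented_slice(long lo, long hi);   /* profile accesses, build interpolating predictors */
bool select_transformation(void);                /* dependence test on distance vectors + predictors */
void patch_skeleton(void);                       /* instantiate the chosen polyhedral transformation */
bool run_parallel_slice(long lo, long hi);       /* returns false if the speculation is invalidated  */
void run_original_slice(long lo, long hi);       /* original sequential code                         */
void rollback_slice(long lo, long hi);           /* restore memory touched by the failed slice       */

void speculative_nest(long n_iters, long slice_size)
{
    enum version v = INSTRUMENTED;
    for (long lo = 0; lo < n_iters; lo += slice_size) {
        long hi = (lo + slice_size < n_iters) ? lo + slice_size : n_iters;
        switch (v) {
        case INSTRUMENTED:
            run_instrumented_slice(lo, hi);
            if (select_transformation()) {       /* prediction supports a rollback-free schedule */
                patch_skeleton();
                v = PARALLEL_SKELETON;
            } else {
                v = ORIGINAL_SEQUENTIAL;
            }
            break;
        case PARALLEL_SKELETON:
            if (!run_parallel_slice(lo, hi)) {   /* misprediction: undo, replay sequentially, re-profile */
                rollback_slice(lo, hi);
                run_original_slice(lo, hi);
                v = INSTRUMENTED;
            }
            break;
        case ORIGINAL_SEQUENTIAL:
            run_original_slice(lo, hi);
            v = INSTRUMENTED;                    /* re-sample to catch a new execution phase */
            break;
        }
    }
}

The state-machine policy shown here (re-profiling after every sequential slice) is only one plausible choice; the paper's framework decides per slice which version to run and which skeleton to patch, based on the observed execution phases.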



Author information


Corresponding author

Correspondence to Philippe Clauss.


About this article

Cite this article

Jimborean, A., Clauss, P., Dollinger, JF. et al. Dynamic and Speculative Polyhedral Parallelization Using Compiler-Generated Skeletons. Int J Parallel Prog 42, 529–545 (2014). https://doi.org/10.1007/s10766-013-0259-4

