Ariadne — Directive-based parallelism extraction from recursive functions
Introduction
The rapid evolution of technology has brought multi-core CPUs even into our home computers. To date, however, this hardware has not been fully exploited, owing to the lack of parallel software. Parallel program development remains expensive and complicated, yet there are countless sequential applications that would ideally be parallelized automatically. Parallelizing existing code is also a hard task and requires compilers developed especially for this purpose. For many years the research community has focused on automatic parallelization technology, but the results still fall short of expectations.
Semi-automatic parallelization can offer a solution to this problem. It makes parallelization easier without placing a heavy burden on the programmer. Its philosophy is based on directives inserted manually into the sequential code to express information about the potential for parallelism, for example, that a loop is fully parallelizable or that certain data dependencies exist. Dependency detection is usually an easy task for the programmer, but very hard for a compiler to perform reliably in the general case. Using semi-automatic parallelization, the programmer can still write sequential code, annotate it with directives, and let the compiler produce parallel code that efficiently utilizes a multi-core system.
This paper presents Ariadne, a compiler that extracts parallelism from recursive functions written in C. Ariadne requires the programmer to insert one simple, short directive for each recursive function. It is the only compiler that categorizes recursive functions according to the potential form of parallelism and introduces a simple directive for each category. Ariadne extracts three forms of parallelism, providing one transformation for each: elimination, parallel-reduction and thread-safe. The elimination transformation converts a recursive function into an iterative one. The parallel-reduction transformation eliminates the recursion and distributes the workload into a number of independent tasks. The thread-safe transformation parallelizes recursive functions that contain independent recursive calls. The programmer selects the programming model onto which the code is mapped from four supported options: the POSIX standard, the OpenMP model, the Cilk programming language and the SVP model.
The directives are independent of the target model. Thus, the programmer need not be aware of the underlying architecture or the target programming model, since Ariadne abstracts away all the details. It is also important to note that Ariadne handles parallel reduction with full support for subtraction and division; other parallelizing tools handle only their basic forms, or cannot handle them at all.
The experimental results demonstrate significant speedups in all benchmarks when comparing the original recursive code with the semi-automatically produced parallel code. They also show that no single target model outperforms all the others across all benchmarks. Consequently, the fact that recursive functions can be mapped onto a number of models is an interesting feature of Ariadne.
Section snippets
Motivation
A lot of research has been done on automatic parallelization during the last decades. It has mainly focused on parallelism extraction from loop structures. However, parallelism can also be extracted from other parts of the code; recursive functions are one such example. The parallelization of recursive functions is interesting for two main reasons: (a) the use of recursion leads to performance degradation and (b) inherently independent computations can be executed in parallel.
In the literature, we
The SVP model
SVP [2], [6], [12] is a multi-core architecture and programming model that supports the parallel execution of threads organized in families. A family consists of a number of ordered, identical threads. Each thread knows its position in the family through a unique number called its index. Based on this index, each thread can be distinguished from the other threads and execute a different part of the code. Fig. 1 depicts the architecture of SVP.
The communication among the threads of a family is achieved
Parallelization of recursive functions with directives
There are three ways to reduce the execution time of a recursive computation: (a) eliminating the recursion by mapping it onto an iterative control structure, (b) extracting parallelism from independent tasks, and (c) combining these two approaches.
Ariadne supports transformations that speed up the execution of recursive functions based on all three options. More specifically, Ariadne supports: (a) a transformation that eliminates the recursion, which is performed using the elimination
Evaluation
In this section, we present experimental results showing the speedups achieved by Ariadne in a number of benchmarks for the three transformations: elimination, parallel-reduction and thread-safe, when the code is mapped onto iterative C code, the POSIX standard, the OpenMP model, the Cilk programming language and the SVP model.
Related work
A lot of research work has been done in the area of parallelizing compilers during the last decades. Interesting parallelizing compilers include Cetus [10], PLUTO [7] and Oscar [17]. All of them perform a number of analysis techniques aimed at code parallelization. Unfortunately, none of them provides parallelization techniques specialized for recursive functions; they mainly focus on the parallelization of loops and non-recursive function calls.
There are also various studies for the
Conclusions and future work
This paper presented Ariadne, a parallelizing compiler that maps recursive functions onto popular parallel programming models, aiming to extract their inherent parallelism. It requires the programmer to insert a directive for each recursive function that is to be parallelized, indicating an appropriate parallelizing transformation. The recursive functions are classified into three main categories, and one special-purpose directive is provided for each category. The directives are
Acknowledgments
We would like to thank Dimitris Saougkos and Zoltán Majó for their invaluable help in this work. We would also like to thank the anonymous reviewers for their helpful comments.
References (23)
- et al., Implementation and evaluation of a microthread architecture, J. Syst. Archit. Embedded Syst. Des. (2009)
- et al., An analytical method for parallelization of recursive functions, Parallel Process. Lett. (2000)
- T. Bernard, K. Bousias, L. Guang, C.R. Jesshope, M. Lankamp, M.W. van Tol, L. Zhang, A general model of concurrency and...
- Notes on recursion elimination, Commun. ACM (1977)
- Tabulation techniques for recursive programs, ACM Comput. Surv. (1980)
- R.D. Blumofe, C.F. Joerg, B.C. Kuszmaul, C.E. Leiserson, K.H. Randall, Y. Zhou, Cilk: An efficient multithreaded...
- U. Bondhugula, A. Hartono, J. Ramanujam, P. Sadayappan, A practical automatic polyhedral program optimization system,...
- N.H. Cohen, Characterization and elimination of redundancy in recursive programs, in: Proceedings of the Symposium on...
- R.L. Collins, B. Vellore, L.P. Carloni, Recursion-driven parallel code generation for multi-core platforms, in:...
- et al., Cetus: A source-to-source compiler infrastructure for multicores, Computer (2009)
Aristeidis Mastoras is a Research Assistant at the Department of Computer Science, ETH Zurich, Switzerland, since 2013. He received his M.Sc. degree in Computer Science, with specialization in Software, from the Department of Computer Science and Engineering, University of Ioannina, Greece. He also received his first degree in Computer Science from the same Department. His research interests focus on the fields of Parallelizing Compilers, Parallel Processing, Compiler Design and Software Engineering.
George Manis is an Assistant Professor in the Department of Computer Science and Engineering at the University of Ioannina, Greece. He has been a member of the academic staff since 2002. His first degree (a 5-year Diploma) is from the Department of Electrical and Computer Engineering, National Technical University of Athens, Greece; his M.Sc. in Parallel and Distributed Systems is from Queen Mary College, London; and his Ph.D. is again from the same department of the National Technical University of Athens. His research interests include Compilers and Parallel and Distributed Systems.