Abstract
Directive-driven programming models, such as OpenMP, are one solution for exploiting the potential parallelism of multicore architectures. Although these approaches significantly help developers, code parallelization is still a non-trivial and time-consuming process that requires parallel programming skills. Thus, many efforts have been made toward automatic parallelization of existing sequential code. This article presents AutoPar-Clava, an OpenMP-based automatic parallelization compiler that: (1) statically detects parallelizable loops in C applications; (2) classifies the variables used inside each target loop based on their access pattern; (3) supports reduction clauses on scalar and array variables whenever applicable; and (4) generates C OpenMP parallel code from the input sequential version. The effectiveness of AutoPar-Clava is evaluated using the NAS and Polyhedral Benchmark suites, targeting an x86-based computing platform. The achieved results are very promising and compare favorably with closely related auto-parallelization compilers, such as the Intel C/C++ Compiler (icc), ROSE, TRACO and CETUS.
References
Acharya A, Bondhugula U, Cohen A (2018) Polyhedral auto-transformation with no integer linear programming. In: Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, pp 529–542
Amini M, Creusillet B, Even S, Keryell R, Goubier O, Guelton S, McMahon J, Pasquier F, Péan G, Villalon P (2012) Par4all: from convex array regions to heterogeneous computing. In: IMPACT 2012: International Workshop on Polyhedral Compilation Techniques
Arabnejad H, Bispo J, Barbosa JG, Cardoso JMP (2018) An OpenMP-based parallelization compiler for C applications. In: 2018 IEEE International Conference on Parallel & Distributed Processing with Applications (ISPA), pp 915–923
Arabnejad H, Bispo J, Barbosa JG, Cardoso JMP (2018) AutoPar-Clava: an automatic parallelization source-to-source tool for C code applications. In: Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms. ACM, pp 13–19
Bae H, Mustafa D, Lee J-W, Lin H, Dave C, Eigenmann R, Midkiff SP (2013) The Cetus source-to-source compiler infrastructure: overview and evaluation. Int J Parallel Program, pp 1–15
Bagnères L, Zinenko O, Huot S, Bastoul C (2016) Opening polyhedral compiler’s black box. In: Proceedings of the 2016 International Symposium on Code Generation and Optimization. ACM, pp 128–138
Banerjee U (2007) Loop transformations for restructuring compilers: the foundations. Springer, Berlin
Bastoul C, Cohen A, Girbal S, Sharma S, Temam O (2003) Putting polyhedral loop transformations to work. In: International Workshop on Languages and Compilers for Parallel Computing. Springer, pp 209–225
Bastoul C (2003) Efficient code generation for automatic parallelization and optimization. In: Proceedings of the Second International Conference on Parallel and Distributed Computing. IEEE Computer Society, pp 23–30
Blume W, Doallo R, Eigenmann R, Grout J, Hoeflinger J, Lawrence T (1996) Parallel programming with Polaris. Computer 29(12):78–82
Blume W, Eigenmann R (1994) The range test: a dependence test for symbolic, non-linear expressions. In: Proceedings of Supercomputing '94. IEEE, pp 528–537
Bondhugula U, Bandishti V, Pananilath I (2017) Diamond tiling: tiling techniques to maximize parallelism for stencil computations. IEEE Trans Parallel Distrib Syst 28(5):1285–1298
Bondhugula U, Hartono A, Ramanujam J, Sadayappan P (2008) A practical automatic polyhedral parallelizer and locality optimizer. In: Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM, pp 101–113
Cardoso JMP, Coutinho JGF, Carvalho T, Diniz PC, Petrov Z, Luk W, Gonçalves F (2016) Performance-driven instrumentation and mapping strategies using the lara aspect-oriented programming approach. Softw Pract Exp 46(2):251–287
Chandra R (2001) Parallel programming in OpenMP. Morgan Kaufmann, Burlington
Clang: a C language family frontend for LLVM. http://clang.llvm.org/
Dave C, Bae H, Min S, Lee S, Eigenmann R, Midkiff S (2009) Cetus: a source-to-source compiler infrastructure for multicores. Computer 42(12):36–42
Intel. Intel C++ Compiler. https://software.intel.com/en-us/c-compilers/
Kelly W, Maslov V, Pugh W, Rosser E, Shpeisman T, Wonnacott D (1996) New user interface for petit and other extensions. User Guide. University of Maryland, pp 1–20
Kelly W, Pugh W, Rosser E, Shpeisman T (1996) Transitive closure of infinite graphs and its applications. Int J Parallel Program 24(6):579–598
Keryell R, Ancourt C, Coelho F, Creusillet B, Irigoin F, Jouvelot P (1996) PIPS: a workbench for building interprocedural parallelizers, compilers and optimizers. Technical report, École Nationale Supérieure des Mines de Paris, France
Kremer U, Bast H-J, Gerndt M, Zima HP (1988) Advanced tools and techniques for automatic parallelization. Parallel Comput 7(3):387–393
Larsen P, Ladelsky R, Lidman J, McKee SA, Karlsson S, Zaks A (2012) Parallelizing more loops with compiler guided refactoring. In: 41st International Conference on Parallel Processing (ICPP). IEEE, pp 410–419
Lee S-I, Johnson T, Eigenmann R (2003) Cetus—an extensible compiler infrastructure for source-to-source transformation. In: International Workshop on Languages and Compilers for Parallel Computing. Springer, pp 539–553
Liao C, Quinlan DJ, Willcock JJ, Panas T (2008) Automatic parallelization using OpenMP based on STL semantics. Technical report, Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Maydan DE, Hennessy JL, Lam MS (1991) Efficient and exact data dependence analysis. In: ACM SIGPLAN ’91 Conference on Programming Language Design and Implementation. ACM, pp 1–14
Memeti S, Pllana S (2018) Papa: a parallel programming assistant powered by IBM Watson cognitive computing technology. J Comput Sci 26:275–284
Muchnick SS (1997) Advanced Compiler Design and Implementation. Morgan Kaufmann Publishers Inc., San Francisco
OpenCL. https://opencl.org
OpenMP. http://www.openmp.org
OpenMP Application Programming Interface, version 4.5. https://www.openmp.org/wp-content/uploads/openmp-4.5.pdf
Palkowski M, Bielecki W (2015) TRACO parallelizing compiler. In: Soft Computing in Computer and Information Science. Springer, pp 409–421
Palkowski M, Bielecki W (2017) TRACO: source-to-source parallelizing compiler. Comput Inform 35(6):1277–1306
Pinto P, Carvalho T, Bispo J, Ramalho MA, Cardoso JMP (2018) Aspect composition for multiple target languages using LARA. Comput Lang Syst Struct 53:1–26
The Portland Group. PGI Fortran & C. http://www.pgroup.com
Pouchet L-N Polybench: the polyhedral benchmark suite. http://www.cs.ucla.edu/pouchet/software/polybench. Accessed June 2019
Pugh W (1991) The omega test: a fast and practical integer programming algorithm for dependence analysis. In: Proceedings of the 1991 ACM/IEEE Conference on Supercomputing. ACM, pp 4–13
Quinlan D (2000) ROSE: compiler support for object-oriented frameworks. Parallel Process Lett 10(02n03):215–226
Quinlan D, Liao C, Too J, Matzke RP, Schordan M. ROSE compiler infrastructure. http://rosecompiler.org/
Soundrarajan P, Nasre R, Jehadeesan R, Panigrahi BK. A study on popular auto-parallelization frameworks. Concurrency Comput Pract Exp, pp 1–28 (in press)
Stolte C, Tang D, Hanrahan P (2002) Polaris: a system for query, analysis, and visualization of multidimensional relational databases. IEEE Trans Vis Comput Graph 8(1):52–65
Top500. TOP500 supercomputer sites. http://www.top500.org
Tu P, Padua D (2001) Automatic array privatization. In: Compiler Optimizations for Scalable Parallel Systems, pp 247–281. Springer
Wilson RP, French RS, Wilson CS, Amarasinghe SP, Anderson JM, Tjiang SWK, Liao S-W, Tseng C-W, Hall MW, Lam MS, Hennessy JL (1994) SUIF: an infrastructure for research on parallelizing and optimizing compilers. ACM Sigplan Notices 29(12):31–37
Wolfe M (1989) Optimizing supercompilers for supercomputers. The MIT Press, Cambridge
Wolfe MJ (1995) High performance compilers for parallel computing. Addison-Wesley Longman Publishing Co., Inc., Boston
Acknowledgements
This work was partially funded by the ANTAREX project through the EU H2020 FET-HPC program under Grant No. 671623. João Bispo acknowledges the support provided by Fundação para a Ciência e a Tecnologia, Portugal, under Post-Doctoral Grant SFRH/BPD/118211/2016.
Cite this article
Arabnejad, H., Bispo, J., Cardoso, J.M.P. et al. Source-to-source compilation targeting OpenMP-based automatic parallelization of C applications. J Supercomput 76, 6753–6785 (2020). https://doi.org/10.1007/s11227-019-03109-9