Abstract
While polyhedral optimization has appeared in mainstream compilers during the past decade, its profitability in scenarios outside its classic domain of linear-algebra programs has remained in question. Recent implementations, such as the LLVM plugin Polly, produce promising speedups, but the restriction to affine loop programs with control flow known at compile time continues to be a limiting factor. PolyJIT combines polyhedral optimization with multi-versioning at run time, when knowledge that enables polyhedral optimization, but is not available at compile time, becomes accessible. By means of a fully-fledged implementation of a light-weight just-in-time compiler and a series of experiments on a selection of real-world and benchmark programs, we demonstrate that taking run-time knowledge into account helps in tackling compile-time violations of affinity and, consequently, offers new opportunities for optimization at run time.
Notes
Geometrically, these objects are (\(\mathbb{Z}\)-)polyhedra.
Acknowledgements
All four authors received financial support from the Deutsche Forschungsgemeinschaft (DFG). The respective projects are PolyJIT (LE 912/14), SafeSPL (AP 206/4) and SafeSPL++ (AP 206/6).
About this article
Cite this article
Simbürger, A., Apel, S., Größlinger, A. et al. PolyJIT: Polyhedral Optimization Just in Time. Int J Parallel Prog 47, 874–906 (2019). https://doi.org/10.1007/s10766-018-0597-3
DOI: https://doi.org/10.1007/s10766-018-0597-3