DOI: 10.1145/2737924.2737954

Improving compiler scalability: optimizing large programs at small price

Published: 03 June 2015

ABSTRACT

Compiler scalability is a well-known problem: reasoning about the application of useful optimizations over large program scopes consumes too much time and memory during compilation. This problem is exacerbated in polyhedral compilers, which use powerful yet costly integer programming algorithms to compose loop optimizations. As a result, the benefits that a polyhedral compiler has to offer to programs such as real scientific applications that contain sequences of loop nests remain impractical for common users. In this work, we address this scalability problem in polyhedral compilers. We identify three causes of unscalability, each of which stems from the large number of statements and dependences in the program scope. We propose a one-shot solution to the problem by reducing the effective number of statements and dependences as seen by the compiler. We achieve this by representing a sequence of statements in a program by a single super-statement. The resulting set of super-statements exposes the minimum sufficient constraints to the Integer Linear Programming (ILP) solver for finding correct optimizations. We implement our approach in the PLuTo polyhedral compiler and find that it condenses the program statements and program dependences by factors of 4.7x and 6.4x, respectively, averaged over 9 hot regions (ranging from 48 to 121 statements) in 5 real applications. As a result, compilation time and memory requirements improve by 268x and 20x, respectively, over the latest version of the PLuTo compiler. The final compile times are comparable to those of the Intel compiler, while the performance is 1.92x better on average due to the latter’s conservative approach to loop optimization.
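The condensation idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or PLuTo's implementation: the grouping of statements into super-statements (`group_of`) is assumed to be given, the names `condense`, `group_of`, and the statement labels are hypothetical, and dependences are modeled as plain (source, target) pairs rather than the dependence polyhedra a real polyhedral compiler would have to combine.

```python
def condense(statements, dependences, group_of):
    """Collapse statements into super-statements and merge parallel edges.

    statements  : list of statement names
    dependences : list of (source, target) statement pairs
    group_of    : dict mapping each statement to a super-statement label
                  (assumed given; how groups are chosen is not modeled here)
    """
    super_statements = sorted({group_of[s] for s in statements})
    # Several statement-level edges between the same two groups become one
    # edge; edges inside a group are kept as a single self-dependence.
    super_deps = sorted({(group_of[src], group_of[dst])
                         for src, dst in dependences})
    return super_statements, super_deps


# Hypothetical input: six statements, seven dependences, two groups.
statements = ["S1", "S2", "S3", "S4", "S5", "S6"]
dependences = [("S1", "S2"), ("S1", "S3"), ("S2", "S4"), ("S3", "S4"),
               ("S4", "S5"), ("S4", "S6"), ("S5", "S6")]
group_of = {"S1": "A", "S2": "A", "S3": "A",
            "S4": "B", "S5": "B", "S6": "B"}

super_statements, super_deps = condense(statements, dependences, group_of)
stmt_factor = len(statements) / len(super_statements)
dep_factor = len(dependences) / len(super_deps)
print(super_statements, super_deps, stmt_factor, dep_factor)
```

In this toy input the solver-facing problem shrinks from six statements and seven dependences to two super-statements and three dependences, which is the kind of reduction the reported condensation factors measure.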

Published in

PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation
June 2015, 630 pages
ISBN: 9781450334686
DOI: 10.1145/2737924

ACM SIGPLAN Notices, Volume 50, Issue 6 (PLDI '15)
June 2015, 630 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2813885
Editor: Andy Gill

Copyright © 2015 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers: research-article

Overall Acceptance Rate: 406 of 2,067 submissions, 20%
