ABSTRACT
High-level compiler transformations, especially loop transformations, are widely recognized as critical optimizations that restructure programs to improve data locality and expose parallelism. Guaranteeing the correctness of program transformations is essential, and to date three main approaches have been developed: proving the equivalence of affine programs, matching the execution traces of programs, and checking bit-by-bit equivalence of program outputs. Each technique is limited in the kinds of transformations it supports, in its space complexity, or in its sensitivity to the testing dataset. In this paper, we take a novel approach that addresses all three limitations to provide an automatic bug checker that verifies any iteration reordering transformation on affine programs, including non-affine transformations, with space consumption proportional to the original program data, and that is robust to arbitrary datasets of a given size. We achieve this by exploiting the structure of affine program control flow and data flow to generate, at compile time, lightweight checker code that is executed within the transformed program. Experimental results assess the correctness and effectiveness of our method and its increased coverage over previous approaches.
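To make the idea of checker code running inside a transformed program concrete, here is a minimal sketch for a toy affine loop nest, `for i in 1..n-1, for j in 0..m-1: A[i][j] = A[i-1][j] + 1`. It is not the code PolyCheck generates. The names `expected_producer`, `check_schedule`, `original`, `interchanged`, and `reversed_outer` are hypothetical, and the producer function is derived by hand here, whereas the abstract describes deriving such information at compile time from the affine control and data flow. The sketch keeps one shadow entry per array cell, so its extra space is proportional to the program data rather than to an execution trace.

```python
INIT = None  # shadow value for a cell that still holds its input data


def expected_producer(i, j):
    """Instance that must have written A[i-1][j] before instance (i, j) reads it.

    Hand-derived for the toy statement A[i][j] = A[i-1][j] + 1 (assumption):
    row 0 is input data, every other row is produced by the instance above it.
    """
    return (i - 1, j) if i - 1 >= 1 else INIT


def check_schedule(schedule, n, m):
    """Replay statement instances in `schedule` order against a shadow array.

    Returns True only if each read sees the producer implied by the original
    program, no instance is repeated or out of domain, and none is missing.
    """
    shadow = [[INIT] * m for _ in range(n)]  # last writer of each cell
    for i, j in schedule:
        if not (1 <= i < n and 0 <= j < m):
            return False  # instance outside the original iteration domain
        if shadow[i][j] is not INIT:
            return False  # instance executed twice
        if shadow[i - 1][j] != expected_producer(i, j):
            return False  # operand came from the wrong producer
        shadow[i][j] = (i, j)
    # Every instance must have run: each computed cell ends with its own writer.
    return all(shadow[i][j] == (i, j) for i in range(1, n) for j in range(m))


def original(n, m):
    """Iteration order of the untransformed loop nest."""
    return [(i, j) for i in range(1, n) for j in range(m)]


def interchanged(n, m):
    """Loop interchange: legal here, since the dependence is carried by i only."""
    return [(i, j) for j in range(m) for i in range(1, n)]


def reversed_outer(n, m):
    """Reversal of the i loop: illegal, because it runs consumers first."""
    return [(i, j) for i in range(n - 1, 0, -1) for j in range(m)]
```

With `n = 4` and `m = 3`, `check_schedule` accepts `original(4, 3)` and `interchanged(4, 3)` and rejects `reversed_outer(4, 3)`. The check never inspects array values, only which instance last wrote each cell, which is one way a verdict can be made independent of the contents of the input dataset.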
PolyCheck: dynamic verification of iteration space transformations on affine programs. POPL '16.