Abstract
This paper applies unimodular transformations and tiling to improve the data locality of a loop nest. Because of data dependences and the reuse available, not every loop can, or should, be tiled. The approach proposed in this paper therefore attempts to capture as much data reuse in the cache as possible while tiling as few loops as possible. Data dependences are represented by cones and reuse information by vector spaces, and on this basis a reuse-driven approach to improving the data locality of the program is presented. In the special case of a single fully permutable loop nest, the data locality problem is formulated as an optimisation problem and solved optimally. In the general case, an algorithm is presented that attempts to construct the tiled loop nest so that as much reuse as possible is carried in the innermost tiled loops.
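The following is a minimal sketch of the two standard ingredients the abstract names, a unimodular transformation and rectangular tiling. It is not the algorithm of the paper. The dependence distance vectors, the skewing matrix, the iteration-space bounds and all function names are hypothetical and chosen only for illustration. The sketch relies on two well-known facts: a transformation T is legal if T·d stays lexicographically positive for every dependence distance vector d, and a nest whose distance vectors have no negative components is fully permutable and can be tiled with rectangular tiles.

```python
# Illustrative sketch only; names and data are assumptions, not from the paper.

def lex_positive(v):
    """True if the first non-zero component of v is positive."""
    for x in v:
        if x != 0:
            return x > 0
    return False  # the zero vector is not lexicographically positive

def apply(T, d):
    """Multiply the integer matrix T (a list of rows) by the vector d."""
    return tuple(sum(t * x for t, x in zip(row, d)) for row in T)

def det2(T):
    """Determinant of a 2x2 matrix; it is +1 or -1 for a unimodular T."""
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

def is_legal(T, deps):
    """T is legal if every transformed distance vector stays lex-positive."""
    return all(lex_positive(apply(T, d)) for d in deps)

def fully_permutable(deps):
    """Sufficient test for distance vectors: no negative component."""
    return all(all(x >= 0 for x in d) for d in deps)

def tiled_order(n, m, b):
    """Visit an n x m iteration space in b x b rectangular tiles."""
    order = []
    for ii in range(0, n, b):              # loops over tiles
        for jj in range(0, m, b):
            for i in range(ii, min(ii + b, n)):      # loops within a tile
                for j in range(jj, min(jj + b, m)):
                    order.append((i, j))
    return order

def respects(order, deps):
    """True if each dependence source is visited before its sink."""
    pos = {p: k for k, p in enumerate(order)}
    for (i, j), k in pos.items():
        for d in deps:
            sink = (i + d[0], j + d[1])
            if sink in pos and pos[sink] < k:
                return False
    return True

# Hypothetical distance vectors of a 2-deep loop nest.
deps = [(1, -1), (1, 0), (1, 1)]
interchange = [[0, 1], [1, 0]]   # swaps the two loops
skew = [[1, 0], [1, 1]]          # skews the inner loop: j' = i + j
skewed_deps = [apply(skew, d) for d in deps]
```

Under these assumptions the interchange is illegal, because it maps (1, -1) to (-1, 1). The skew is unimodular and legal, and it turns every distance vector non-negative, so the skewed nest is fully permutable. For simplicity `tiled_order` walks a rectangular space, whereas a skewed iteration space is in general a parallelogram; the check with `respects` is therefore only a toy demonstration that rectangular tiles preserve non-negative dependences and can violate a dependence with a negative component.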
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xue, J., Huang, CH. (1998). Reuse-driven tiling for data locality. In: Li, Z., Yew, PC., Chatterjee, S., Huang, CH., Sadayappan, P., Sehr, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1997. Lecture Notes in Computer Science, vol 1366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032681
DOI: https://doi.org/10.1007/BFb0032681
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64472-9
Online ISBN: 978-3-540-69788-6
eBook Packages: Springer Book Archive