Quantifying the multi-level nature of tiling interactions

Mitchell, Nicholas; Carter, Larry; Ferrante, Jeanne; Högstedt, Karin

doi:10.1007/BFb0032680

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1366)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

Optimizations, including tiling, often target a single level of memory or parallelism, such as cache. These optimizations usually operate on a level-by-level basis, guided by a cost function parameterized by features of that single level. The benefit of optimizations guided by these one-level cost functions decreases as architectures tend towards a hierarchy of memory and of parallelism. We have identified three common architectural scenarios where a single tiling choice could be improved by using information from multiple levels in concert. For the first two scenarios, we derive multi-level cost functions which guide the optimal choice of tile size and shape, and quantify the improvement gained. We give both analysis and simulation results to support our points. For the third scenario, we summarize our findings.

This work was supported in part by NSF CCR-9504150 and a UC MICRO grant in association with the Intel Corporation.

Author information

Authors and Affiliations

Computer Science and Engineering Department, UCSD, La Jolla, CA 92093-0114
Nicholas Mitchell, Larry Carter, Jeanne Ferrante & Karin Högstedt

Editor information

Zhiyuan Li, Pen-Chung Yew, Siddharta Chatterjee, Chua-Huang Huang, P. Sadayappan, David Sehr

Cite this paper

Mitchell, N., Carter, L., Ferrante, J., Högstedt, K. (1998). Quantifying the multi-level nature of tiling interactions. In: Li, Z., Yew, PC., Chatterjee, S., Huang, CH., Sadayappan, P., Sehr, D. (eds) Languages and Compilers for Parallel Computing. LCPC 1997. Lecture Notes in Computer Science, vol 1366. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032680

DOI: https://doi.org/10.1007/BFb0032680
Published: 09 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64472-9
Online ISBN: 978-3-540-69788-6
eBook Packages: Springer Book Archive
