Combined ILP and Register Tiling: Analytical Model and Optimization Framework

Renganarayana, Lakshminarayanan; Ramakrishna, U.; Rajopadhye, Sanjay

doi:10.1007/978-3-540-69330-7_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4339)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

Efficient use of multiple pipelined functional units and registers is very important for achieving high performance on modern processors. Instruction Level Parallelism (ILP) and register reuse (through register tiling) are two mechanisms for this, respectively. Program transformations that expose and exploit ILP and register reuse interact with each other in subtle ways. We study the combined problem of optimal ILP and register reuse. We consider the class of uniform dependence, fully permutable, rectangular loop nests. We develop an analytical model of the combined problem and formulate a mathematical optimization problem that chooses the parameters of the ILP-exposing transformation and register tiling so as to minimize the total execution time. We distinguish two cases: when loop permutation can and cannot expose a parallel loop. We show that the combined problem can be reduced to a single integer convex optimization problem for the former case, and to a set of integer convex optimization problems for the latter case, both of which can be solved to global optimality.
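To make the interaction between register reuse and ILP concrete, here is a minimal sketch of register tiling on matrix multiplication. Matrix multiplication is an assumed example of a rectangular, fully permutable loop nest and is not taken from the paper. The 2 x 2 tile size and the function names are illustrative choices only. Python has no registers, so the scalar variables merely stand in for them to show the loop structure that a compiler would produce.

```python
def matmul_naive(A, B, n):
    """Reference triple loop: C[i][j] += A[i][k] * B[k][j]."""
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C


def matmul_register_tiled(A, B, n):
    """Same computation with a 2 x 2 register tile over (i, j).

    Assumes n is even, so no cleanup loops are needed (hypothetical
    simplification for this sketch).
    """
    assert n % 2 == 0
    C = [[0.0] * n for _ in range(n)]
    for i in range(0, n, 2):
        for j in range(0, n, 2):
            # Four scalars stand in for registers holding the C tile.
            c00 = c01 = c10 = c11 = 0.0
            for k in range(n):
                # Each loaded value is used twice below (register reuse).
                a0, a1 = A[i][k], A[i + 1][k]
                b0, b1 = B[k][j], B[k][j + 1]
                # Four mutually independent multiply-add chains (ILP).
                c00 += a0 * b0
                c01 += a0 * b1
                c10 += a1 * b0
                c11 += a1 * b1
            C[i][j], C[i][j + 1] = c00, c01
            C[i + 1][j], C[i + 1][j + 1] = c10, c11
    return C
```

In this sketch, a larger tile would increase both the reuse of loaded values and the number of independent chains, but it would also need more scalars than a real register file may hold. Tile sizes therefore affect ILP and register pressure at the same time, which is the kind of trade-off the abstract describes optimizing.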

Author information

Authors and Affiliations

Computer Science Department, Colorado State University
Lakshminarayanan Renganarayana, U. Ramakrishna & Sanjay Rajopadhye

Editor information

Editors and Affiliations

BSC-UPC,
Eduard Ayguadé
Department of Computer Science, Louisiana State University, 70803, Baton Rouge, LA, USA
Gerald Baumgartner
Dept. of Electrical and Computer Engg., Louisiana State University, Baton Rouge, LA, USA
J. Ramanujam
Department of Computer Science and Engineering, The Ohio State University, 2015 Neil Avenue, 43210, Columbus, OH, USA
P. Sadayappan

About this paper

Cite this paper

Renganarayana, L., Ramakrishna, U., Rajopadhye, S. (2006). Combined ILP and Register Tiling: Analytical Model and Optimization Framework. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2005. Lecture Notes in Computer Science, vol 4339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69330-7_17

DOI: https://doi.org/10.1007/978-3-540-69330-7_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69329-1
Online ISBN: 978-3-540-69330-7
eBook Packages: Computer Science, Computer Science (R0)
