Multi-dimensional Kernel Generation for Loop Nest Software Pipelining

Douillet, Alban; Rong, Hongbo; Gao, Guang R.

doi:10.1007/11823285_32


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4128)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

Single-dimension Software Pipelining (SSP) has been proposed as an effective software pipelining technique for multi-dimensional loops [16]. This paper introduces for the first time the scheduling methods that actually produce the kernel code. Because of the multi-dimensional nature of the problem, the scheduling problem is more complex and challenging than with traditional modulo scheduling. The scheduler must handle multiple subkernels and initiation rates under specific scheduling constraints, while producing a solution that minimizes the execution time of the final schedule.

In this paper, three approaches are proposed: the level-by-level method, which schedules operations in loop-level order, starting from the innermost level, and does not let other operations interfere with the already scheduled levels; the flat method, which schedules operations from different loop levels with the same priority; and the hybrid method, which uses the level-by-level mechanism for the innermost level and the flat solution for the other levels. The methods subsume Huff’s modulo scheduling [8] for single loops as a special case. We also break a scheduling constraint introduced in earlier publications, which allows for a more compact kernel. The proposed approaches were implemented in the Open64/ORC compiler and evaluated on loop nests from the Livermore, SPEC2000 and NAS benchmarks.
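As a rough illustration of how the three approaches differ in the order in which they consider operations, the following Python sketch sorts a small set of operations under each policy. It is not the authors' algorithm: the `Op` record, the numeric `priority` field (lower means earlier), and the convention that a larger `level` means a deeper loop are all assumptions made for this example. The sketch covers only the visiting order; it does not model slot assignment, resources, or the rule that later operations must not interfere with already scheduled levels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    level: int     # loop level; larger means deeper (assumption)
    priority: int  # hypothetical scheduling priority; lower goes first


def flat_order(ops):
    # Flat: loop level is ignored, only the priority decides.
    return sorted(ops, key=lambda op: op.priority)


def level_by_level_order(ops):
    # Level-by-level: innermost level first, then outward;
    # priority only orders operations within one level.
    return sorted(ops, key=lambda op: (-op.level, op.priority))


def hybrid_order(ops):
    # Hybrid: innermost level handled on its own first,
    # all remaining levels handled together as in the flat policy.
    innermost = max(op.level for op in ops)
    inner = [op for op in ops if op.level == innermost]
    outer = [op for op in ops if op.level != innermost]
    return flat_order(inner) + flat_order(outer)


def names(ops):
    return [op.name for op in ops]


# A made-up three-level loop nest with two operations per level.
OPS = [
    Op("a", 1, 2), Op("b", 2, 5), Op("c", 3, 4),
    Op("d", 3, 1), Op("e", 2, 0), Op("f", 1, 3),
]

if __name__ == "__main__":
    print("level-by-level:", names(level_by_level_order(OPS)))
    print("flat:          ", names(flat_order(OPS)))
    print("hybrid:        ", names(hybrid_order(OPS)))
```

On this made-up input the hybrid order agrees with the level-by-level order for the innermost operations and with the flat order for everything else, which mirrors the description of the hybrid method above.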

This work was supported in part by the DOD, by DARPA contract No.NBCH30904, by NSF grants No.0103723 and No.0429781, and by DOE grant No.DE-FC02-OIER25503.

References

1. Allan, V.H., Jones, R.B., Lee, R.M., Allan, S.J.: Software pipelining. ACM Comput. Surv. 27(3), 367–432 (1995)
2. Carr, S., Ding, C., Sweany, P.: Improving software pipelining with unroll-and-jam. In: Proc. of HICSS 1996, pp. 183–192. IEEE Computer Society, Los Alamitos (1996)
3. Darte, A., Schreiber, R., Rau, B.R., Vivien, F.: Constructing and exploiting linear schedules with prescribed parallelism. ACM Trans. Des. Autom. Electron. Syst. 7(1), 159–172 (2002)
4. Douillet, A.: A Compiler Framework for Loop Nest Software-Pipelining. PhD thesis, University of Delaware, Newark, Delaware, USA (2006)
5. Douillet, A., Gao, G.R.: Register pressure in software-pipelined loop nests: Fast computation and impact on architecture design. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds.) LCPC 2005. LNCS, vol. 4339, Springer, Heidelberg (2006)
6. Gao, G.R., Ning, Q., Dongen, V.: Extending software pipelining techniques for scheduling nested loops. In: Pingali, K.K., Gelernter, D., Padua, D.A., Banerjee, U., Nicolau, A. (eds.) LCPC 1994. LNCS, vol. 892, pp. 340–357. Springer, Heidelberg (1995)
7. Govindarajan, R., Altman, E.R., Gao, G.R.: A framework for resource-constrained rate-optimal software pipelining. IEEE Trans. Parallel Distrib. Syst. 7(11), 1133–1149 (1996)
8. Huff, R.A.: Lifetime-sensitive modulo scheduling. In: Proc. of PLDI 1993, pp. 258–267. ACM Press, New York (1993)
9. Lam, M.: Software pipelining: an effective scheduling technique for VLIW machines. In: Proc. of PLDI 1988, pp. 318–328. ACM Press, New York (1988)
10. Llosa, J.: Swing modulo scheduling: A lifetime-sensitive approach. In: Proc. of PACT 1996, p. 80. IEEE Computer Society, Los Alamitos (1996)
11. Muthukumar, K., Doshi, G.: Software pipelining of nested loops. In: Wilhelm, R. (ed.) CC 2001 and ETAPS 2001. LNCS, vol. 2027, pp. 165–181. Springer, Heidelberg (2001)
12. Petkov, D., Harr, R., Amarasinghe, S.: Efficient pipelining of nested loops: unroll-and-squash. In: Proc. of IPDPS 2002, IEEE, Los Alamitos (2002)
13. Rau, B.R.: Iterative modulo scheduling: an algorithm for software pipelining loops. In: Proc. of MICRO 27, pp. 63–74. ACM Press, New York (1994)
14. Rong, H., Douillet, A., Gao, G.R.: Register allocation for software pipelined multi-dimensional loops. In: Proc. of PLDI 2005, pp. 154–167 (2005)
15. Rong, H., Douillet, A., Govindarajan, R., Gao, G.R.: Code generation for single-dimension software pipelining of multi-dimensional loops. In: Proc. of CGO 2004, pp. 175–186 (2004)
16. Rong, H., Tang, Z., Govindarajan, R., Douillet, A., Gao, G.R.: Single-dimension software pipelining for multi-dimensional loops. In: Proc. of CGO 2004, pp. 163–174 (2004)
17. Wang, J., Gao, G.R.: Pipelining-dovetailing: A transformation to enhance software pipelining for nested loops. In: Gyimóthy, T. (ed.) CC 1996. LNCS, vol. 1060, pp. 1–17. Springer, Heidelberg (1996)
18. Wolf, M.E., Maydan, D.E., Chen, D.K.: Combining loop transformations considering caches and scheduling. Int. J. Parallel Program. 26(4), 479–503 (1998)
19. Wood, G.: Global optimization of microprograms through modular control constructs. In: Proc. of MICRO 12, pp. 1–6. IEEE, Los Alamitos (1979)

Author information

Authors and Affiliations

University of Delaware, Newark, DE, 19716, USA
Alban Douillet & Guang R. Gao
Microsoft Corporation, Redmond, WA, 98052, USA
Hongbo Rong

Editor information

Editors and Affiliations

ZIH, TU Dresden, Germany
Wolfgang E. Nagel
Fakultät Mathematik, Institut für wissenschaftliches Rechnen, TU Dresden, 01062, Dresden, Germany
Wolfgang V. Walter
Database Technology Group, Technische Universität Dresden, Germany
Wolfgang Lehner

About this paper

Cite this paper

Douillet, A., Rong, H., Gao, G.R. (2006). Multi-dimensional Kernel Generation for Loop Nest Software Pipelining. In: Nagel, W.E., Walter, W.V., Lehner, W. (eds) Euro-Par 2006 Parallel Processing. Euro-Par 2006. Lecture Notes in Computer Science, vol 4128. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11823285_32

DOI: https://doi.org/10.1007/11823285_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37783-2
Online ISBN: 978-3-540-37784-9
eBook Packages: Computer Science, Computer Science (R0)
