Abstract
Software pipelining is a widely used compiler optimization technique for achieving high performance on machines that exploit instruction-level parallelism. Surprisingly, however, there have been few theoretical or empirical results on time optimal software pipelining of loops with control flows. In this paper, we present three new theoretical and practical contributions to this underinvestigated problem. First, we propose a necessary and sufficient condition for a loop with control flows to have an optimally software-pipelined program, and we present a decision procedure for checking the condition. As part of this formal treatment, we propose a new formalization of software pipelining. Second, we present two software pipelining algorithms. The first computes an optimal solution for every loop satisfying the condition, but may run in exponential time. The second computes optimal solutions efficiently for most (but not all) loops satisfying the condition. The former proves the sufficiency of the condition, and the latter suggests a practical optimal software pipelining algorithm. Third, we present experimental results which strongly indicate that achieving time optimality in software-pipelined programs is a viable goal in practice with reasonable hardware support.
Cite this article
Yun, HS., Kim, J. & Moon, SM. Time Optimal Software Pipelining of Loops with Control Flows. International Journal of Parallel Programming 31, 339–391 (2003). https://doi.org/10.1023/A:1027387028481