Abstract
Generating schedules for expression DAGs that use a minimal number of registers is a classical NP-complete optimization problem. Up to now an exact solution could only be computed for small DAGs (with up to 20 nodes), using a trivial O(n!) enumeration algorithm. We present a new algorithm with worst-case complexity $O(n 2^{2n})$ and very good average behaviour. Applying a dynamic programming scheme and reordering techniques, it is able to defer the combinatorial explosion and to generate an optimal schedule not only for small DAGs but also for medium-sized ones with up to 50 nodes, a class that contains nearly all DAGs encountered in typical application programs. Experiments with randomly generated DAGs and large DAGs from real application programs confirm that the new algorithm generates optimal schedules quite fast. We extend our algorithm to cope with delay slots and multiple functional units, two common features of modern superscalar processors.
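To make the idea of a dynamic programming scheme over schedules concrete, here is a minimal sketch. It is not the paper's algorithm: it is a plain dynamic program over sets of already scheduled nodes, without the pruning and reordering that the abstract credits for the good average behaviour. The function name `min_register_need`, the `operands` input format, and the register model are assumptions made for illustration: each value occupies one register from its computation until its last user has been scheduled, values without users are results and stay live, and a node may reuse the register of an operand that dies at that node.

```python
def min_register_need(operands):
    """Return (register_need, schedule) for an expression DAG.

    operands[v] lists the nodes whose values node v reads; nodes are 0..n-1.
    Assumes the graph is acyclic (hypothetical input format).
    """
    n = len(operands)
    pred_mask = [0] * n   # bit u set: u is an operand of v
    users_mask = [0] * n  # bit v set: v reads the value of u
    for v, ops in enumerate(operands):
        for u in ops:
            pred_mask[v] |= 1 << u
            users_mask[u] |= 1 << v

    def live_count(done):
        # Values that must be held in registers once the set `done` is scheduled.
        count = 0
        for v in range(n):
            if (done >> v) & 1:
                is_result = users_mask[v] == 0
                has_pending_user = (users_mask[v] & ~done) != 0
                if is_result or has_pending_user:
                    count += 1
        return count

    # level maps a set of scheduled nodes (bitmask) to the best
    # (register need so far, schedule) that reaches exactly that set.
    level = {0: (0, [])}
    for _ in range(n):
        next_level = {}
        for done, (need, order) in level.items():
            for v in range(n):
                already_done = (done >> v) & 1
                operands_missing = (pred_mask[v] & ~done) != 0
                if already_done or operands_missing:
                    continue
                new_done = done | (1 << v)
                new_need = max(need, live_count(new_done))
                if new_done not in next_level or new_need < next_level[new_done][0]:
                    next_level[new_done] = (new_need, order + [v])
        level = next_level
    return level[(1 << n) - 1]


# Expression (a + b) * (c + d): leaves 0..3, sums 4 and 5, product 6.
need, schedule = min_register_need([[], [], [], [], [0, 1], [2, 3], [4, 5]])
print(need, schedule)
```

Because two different orders that schedule the same set of nodes leave the same values live, only the best order per set has to be kept. That bounds the number of states by 2^n instead of the n! orders of plain enumeration, which is the kind of saving a dynamic programming scheme is meant to deliver.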
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keßler, C.W. (1996). Scheduling expression DAGs for minimal register need. In: Kuchen, H., Doaitse Swierstra, S. (eds) Programming Languages: Implementations, Logics, and Programs. PLILP 1996. Lecture Notes in Computer Science, vol 1140. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-61756-6_88
DOI: https://doi.org/10.1007/3-540-61756-6_88
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61756-3
Online ISBN: 978-3-540-70654-0
eBook Packages: Springer Book Archive