
Techniques for critical path reduction of scalar programs

International Journal of Parallel Programming

Abstract

Scalar performance on processors with instruction-level parallelism (ILP) is often limited by control and data dependences. This paper describes a family of compiler techniques, called Critical Path Reduction (CPR) techniques, which reduce the length of critical paths through control and data dependences. Control CPR reduces the number of branches on the critical path and improves the performance of branch-intensive code on processors with inadequate branch throughput or excessive branch latency. Data CPR reduces the number of arithmetic operations on the critical path. Optimization and scheduling are adapted to support CPR.
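As an illustration of the idea behind data CPR, the sketch below compares the dependence height of a serial sum with that of a pairwise (tree-shaped) sum of the same operands. It is not taken from the paper: the function names `chain_sum` and `tree_sum` are hypothetical, and the example only counts the additions on the longest dependence path rather than modelling any particular processor or the authors' transformations.

```python
def chain_sum(xs):
    """Left-to-right sum. Returns (value, height), where height is the
    number of additions on the longest dependence path."""
    total, height = xs[0], 0
    for x in xs[1:]:
        total += x      # each add depends on the previous one
        height += 1
    return total, height


def tree_sum(xs):
    """Pairwise sum. The additions within one round are mutually
    independent, so each round adds only one level of height."""
    level, height = list(xs), 0
    while len(level) > 1:
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:          # odd element carries over unchanged
            nxt.append(level[-1])
        level = nxt
        height += 1
    return level[0], height


values = list(range(1, 9))          # eight integer operands
print(chain_sum(values))            # (36, 7)
print(tree_sum(values))             # (36, 3)
```

Both versions perform seven additions on eight operands, but the tree form places only three of them on the critical path, so a machine with enough parallel adders could finish sooner. Integers are used here because reassociating floating-point additions can change the rounded result.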




Cite this article

Schlansker, M., Kathail, V. Techniques for critical path reduction of scalar programs. Int J Parallel Prog 25, 147–181 (1997). https://doi.org/10.1007/BF02700034


  • DOI: https://doi.org/10.1007/BF02700034
