Path Analysis and Renaming for Predicated Instruction Scheduling

International Journal of Parallel Programming

Abstract

Increases in instruction-level parallelism are needed to exploit the potential parallelism available in future wide-issue architectures. Predicated execution is an architectural mechanism that increases instruction-level parallelism by removing branches and allowing simultaneous execution of multiple paths of control, committing only the instructions from the correct path. For the compiler to expose and use such parallelism, traditional compiler data-flow and path analysis needs to be extended to predicated code. In this paper, we motivate the need for renaming and for predicates that reflect path information. We present Predicated Static Single Assignment (PSSA), which uses renaming and introduces Full-Path Predicates to remove false dependences and enable aggressive predicated optimization and instruction scheduling. We demonstrate the usefulness of PSSA for Predicated Speculation and Control Height Reduction. These two predicated code optimizations, used during instruction scheduling, reduce the dependence length of the critical paths through a predicated region. Our results show that using PSSA to enable speculation and control height reduction reduces execution time by 12% to 68%.
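The general ideas named in the abstract, predicated execution and renaming, can be pictured with a toy interpreter. The sketch below is only an illustration under stated assumptions, not the paper's PSSA construction: the `Instr` record, the `run` function, and the register names `p`, `q`, `x1`, `x2` are all hypothetical. It computes `|a - b|` first as if-converted code, where both guarded subtractions write the same register `x`, and then in a renamed form, where each subtraction writes its own name and can therefore run unguarded (speculatively), leaving only the final copies guarded.

```python
import operator
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """One predicated instruction (hypothetical toy format)."""
    guard: Optional[str]          # predicate register name; None means always execute
    dest: str                     # destination register
    op: Callable[..., object]     # operation applied to the source values
    srcs: Tuple[str, ...]         # source register names


def run(program: List[Instr], regs: Dict[str, object]) -> Dict[str, object]:
    """Execute instructions in order; a result is committed only if its guard is true."""
    regs = dict(regs)
    for ins in program:
        if ins.guard is None or regs[ins.guard]:
            regs[ins.dest] = ins.op(*(regs[s] for s in ins.srcs))
    return regs


def copy(value):
    return value


# If-converted form of: if a > b: x = a - b  else: x = b - a
# Both guarded subtractions write x, so they share an output dependence.
predicated = [
    Instr(None, "p", operator.gt, ("a", "b")),
    Instr(None, "q", operator.le, ("a", "b")),
    Instr("p", "x", operator.sub, ("a", "b")),
    Instr("q", "x", operator.sub, ("b", "a")),
]

# Renamed form: each subtraction gets its own destination, so the guards can be
# dropped (speculation) and the two operations no longer conflict on x.
renamed = [
    Instr(None, "p", operator.gt, ("a", "b")),
    Instr(None, "q", operator.le, ("a", "b")),
    Instr(None, "x1", operator.sub, ("a", "b")),
    Instr(None, "x2", operator.sub, ("b", "a")),
    Instr("p", "x", copy, ("x1",)),
    Instr("q", "x", copy, ("x2",)),
]


def abs_diff(program: List[Instr], a: int, b: int) -> int:
    return run(program, {"a": a, "b": b})["x"]
```

Both programs produce the same value of `x` for any inputs; what changes is that the subtractions in the renamed version are independent of the predicate computation and of each other, which is the kind of freedom a scheduler can exploit to shorten the critical path.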




Cite this article

Carter, L., Simon, B., Calder, B. et al. Path Analysis and Renaming for Predicated Instruction Scheduling. International Journal of Parallel Programming 28, 563–588 (2000). https://doi.org/10.1023/A:1007512717742
