On the Boosting of Instruction Scheduling by Renaming

The Journal of Supercomputing

Abstract

Speculative execution is the execution of instructions before it is known whether they should be executed. In speculative execution for instruction-level parallelism (ILP) processors, shadow registers provide a hardware mechanism that protects program semantics from pollution by boosted instructions whose branches were incorrectly predicted. In a recent study, Chang and Lai proposed a special register file based on shadow registers, named the conjugate register file (CRF), to support multilevel boosting in speculative execution. They also proposed a scheduling heuristic, named frequency-driven scheduling, to work with the CRF. However, the ability to boost is still constrained, since the register-pair concept forces speculatively produced results to be stored in dedicated locations. Moreover, when hardware advances raise the parallelism potential into the tens, the heavy demand on registers and the complexity of the register file may well become a serious bottleneck for the exploitation of ILP.

In this paper, the frequency-driven scheduling algorithm is modified by replacing the function of the hardware CRF with variable renaming at compile time. The new scheduling technique, named LESS, can exploit parallelism efficiently with a limited number of registers. Moreover, since the technique benefits ILP without any special hardware support, it can be incorporated into any other ILP architecture without changing its instruction set architecture (ISA).
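The general idea of using compile-time renaming in place of a shadow register can be illustrated with a toy example. The sketch below is not the LESS algorithm, which the abstract does not specify. It assumes a hypothetical three-address IR (`Instr`), a hypothetical interpreter (`run`), and a hypothetical helper (`boost_with_renaming`) that hoists one side-effect-free instruction above a branch. The hoisted copy writes a fresh register, and a `mov` on the taken path commits the value, so the not-taken path still sees the old contents of the original destination.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-address instruction: dest = op(srcs)."""
    dest: str
    op: str  # "add" sums the sources, "mov" copies the first source
    srcs: Tuple[str, ...]


def run(instrs: List[Instr], regs: Dict[str, int]) -> Dict[str, int]:
    """Execute a straight-line sequence on a copy of the register state."""
    regs = dict(regs)
    for ins in instrs:
        vals = [regs[s] for s in ins.srcs]
        regs[ins.dest] = sum(vals) if ins.op == "add" else vals[0]
    return regs


def boost_with_renaming(pre: List[Instr], taken: List[Instr], fresh: str):
    """Hoist the first instruction of `taken` to the end of `pre`.

    Assumes `fresh` is unused elsewhere and the hoisted instruction cannot
    fault. The hoisted copy writes `fresh`; a mov on the taken path commits
    the value to the original destination.
    """
    first, rest = taken[0], taken[1:]
    hoisted = Instr(fresh, first.op, first.srcs)
    commit = Instr(first.dest, "mov", (fresh,))
    return pre + [hoisted], [commit] + rest


FRESH = "t0"  # hypothetical compiler temporary


def arch(regs: Dict[str, int]) -> Dict[str, int]:
    """Architectural state: everything except the compiler temporary."""
    return {k: v for k, v in regs.items() if k != FRESH}


init = {"r0": 1, "r1": 2, "r2": 0, "r3": 0}
pre = [Instr("r2", "add", ("r0", "r1"))]            # code before the branch
taken = [Instr("r1", "add", ("r2", "r2")),          # candidate for boosting
         Instr("r3", "add", ("r1", "r0"))]
not_taken = [Instr("r3", "add", ("r1", "r1"))]      # reads the OLD r1

# Naive boosting: move the instruction up without renaming.
naive_pre = pre + [taken[0]]
naive_not_taken = run(naive_pre + not_taken, init)  # r1 is clobbered here

# Boosting with renaming: the speculative result lives in FRESH.
new_pre, new_taken = boost_with_renaming(pre, taken, FRESH)
safe_taken = run(new_pre + new_taken, init)
safe_not_taken = run(new_pre + not_taken, init)

print("original not-taken:", arch(run(pre + not_taken, init)))
print("naive not-taken:   ", arch(naive_not_taken))
print("renamed not-taken: ", arch(safe_not_taken))
```

In this sketch the naive version changes the result on the not-taken path, while the renamed version matches the original program on both paths. The cost is one extra register and one `mov`; a real scheduler would more likely rename later uses than insert a copy.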

Simulation results show that the performance achievable by LESS is better than that of other existing methods. For example, under an ILP model with an issue rate of 8, speculative execution achieves a 34% increase in parallelism, compared to 18% with the CRF scheme.

References

  1. M. C. Chang and F. P. Lai. Efficient exploitation of instruction-level parallelism for superscalar processors by the conjugate register file scheme. IEEE Trans. Comput., 45(3):278-293, 1996.

  2. B. R. Rau and J. A. Fisher. Instruction-level parallel processing: history, overview, and perspective. J. Supercomput., 7(1/2):9-50, 1993.

  3. M. S. Schlansker and B. R. Rau. EPIC: Explicitly parallel instruction computing. IEEE Comput., 37-45, 2000.

  4. D. I. August, D. A. Connors, S. A. Mahlke, J. W. Sias, K. M. Crozier, B. C. Cheng, P. R. Eaton, Q. B. Olaniran, and W. W. Hwu. Integrated predicated and speculative execution in the IMPACT EPIC architecture. In Proceedings of the 25th Annual International Symposium on Computer Architecture, pp. 138-149, 1998.

  5. M. S. Lam and R. P. Wilson. Limits of control flow on parallelism. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pp. 46-57, 1992.

  6. S. A. Mahlke et al. Sentinel scheduling for VLIW and superscalar processors. In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 238-247, 1992.

  7. H. Ando et al. Unconstrained speculative execution with predicated state buffering. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 138-149, 1995.

  8. M. D. Smith, M. S. Lam, and M. Horowitz. Boosting beyond static scheduling in a superscalar processor. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pp. 344-354, 1990.

  9. S. A. Mahlke, D. C. Lin, W. Y. Chen et al. Effective compiler support for predicated execution using the hyperblock. In Proceedings of the 25th Annual International Symposium on Microarchitecture, pp. 45-54, 1992.

  10. S. A. Mahlke, R. E. Hank, J. E. McCormick, D. I. August, and W. W. Hwu. A comparison of full and partial predicated execution support for ILP processors. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 138-149, 1995.

  11. J. A. Fisher. Trace scheduling: a technique for global microcode compaction. IEEE Trans. Comput., C-30:478-490, 1981.

  12. W. W. Hwu et al. The superblock: an effective technique for VLIW and superscalar compilation. J. Supercomput., 7(1/2):229-248, 1993.

  13. V. E. Kotov. Automatic Construction of Parallel Programs, Algorithms, Software and Hardware for Parallel Computers. Springer, Berlin, 1984.

  14. R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and F. Zadeck. Efficiently computing static single assignment and the control dependence graph. ACM Trans. Programm. Languages Syst., 13(4):451-490, 1991.

  15. P. P. Pineo. An efficient algorithm for the creation of single assignment forms. In Proceedings of the 29th Annual Hawaii International Conference on System Sciences, pp. 213-222, 1996.

  16. V. H. Allan et al. Software pipelining. ACM Comput. Surveys, 27(3):367-432, 1995.

  17. D. A. Patterson and J. L. Hennessy. Computer Architecture: A Quantitative Approach, 2nd ed. Morgan Kaufmann, San Mateo, CA, 1996.

  18. L. Wang and T. C. Yang. Compiler/hardware co-design for instruction boosting in ILP processors. IEE Proc. Comput. Digit. Technol., 146(6):269-274, 1999.

  19. C. M. Chen, C. T. King, and Y. Y. Chen. Branch merging for scheduling concurrent executions of branch operations. IEE Proc. Comput. Digit. Technol., 143(6):278-293, 1996.

  20. M. Srinivas and A. Nicolau. Analyzing the individual/combined effects of speculative and guarded execution on a superscalar architecture. In Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing, pp. 199-208, 1998.

  21. J. Huck, D. Morris, J. Ross, A. Knies, H. Mulder, and R. Zahir. Introducing the IA-64 architecture. IEEE Micro, 20(5):12-23, 2000.

Cite this article

Wang, L., Yang, T.C. On the Boosting of Instruction Scheduling by Renaming. The Journal of Supercomputing 19, 173–197 (2001). https://doi.org/10.1023/A:1011141304485