On the Boosting of Instruction Scheduling by Renaming

Wang, L.; Yang, Ted C.

doi:10.1023/A:1011141304485

Published: June 2001

Volume 19, pages 173–197 (2001)

The Journal of Supercomputing

Abstract

Speculative execution is the execution of instructions before it is known whether they should be executed. In speculative execution for instruction-level parallelism (ILP) processors, shadow registers provide a hardware mechanism that protects program semantics from pollution by boosted instructions whose branches were incorrectly predicted. In a recent study, Chang and Lai proposed a special register file based on shadow registers, named the conjugate register file (CRF), to support multilevel boosting in speculative execution. They also proposed a scheduling heuristic, named frequency-driven scheduling, to work together with the CRF. However, the ability to boost is still constrained, because the register-pair concept forces speculatively produced results to be stored in dedicated locations. Moreover, as advances in hardware techniques push the potential parallelism into the tens, the heavy demand on registers and the complexity of the register file may well become a serious bottleneck for the exploitation of ILP.

In this paper, the frequency-driven scheduling algorithm is modified by replacing the function of the hardware CRF with variable renaming performed during compilation. The new scheduling technique, named LESS, can exploit parallelism efficiently with a limited number of registers. Moreover, since the technique benefits ILP without any special hardware support, it can be incorporated into any other ILP architecture without changing its instruction set architecture (ISA).
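
The idea of replacing shadow registers with compile-time renaming can be illustrated with a small sketch. This is not the LESS algorithm from the paper, whose details are not given in the abstract; it only shows, under simplifying assumptions, how an instruction can be boosted above a branch by writing its result to a free register and committing it with a copy on the path where it was needed. The `Instr` type, the function `boost_with_renaming`, and the register names are all hypothetical, and the sketch ignores memory operations and exceptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    dest: str               # register written by the instruction
    op: str                 # opcode mnemonic (illustrative only)
    srcs: Tuple[str, ...]   # registers read by the instruction


def boost_with_renaming(
    block: List[Instr],
    live_on_other_path: Set[str],
    free_regs: Iterable[str],
) -> Tuple[List[Instr], List[Instr]]:
    """Hoist a prefix of `block` above the branch that guards it.

    `live_on_other_path` must hold every register that may not be
    clobbered speculatively (values live on the other path, plus any
    register the branch itself reads).

    Returns (hoisted, taken_path): `hoisted` runs before the branch,
    `taken_path` runs only when control actually reaches `block`.
    """
    free = iter(free_regs)
    rename: Dict[str, str] = {}   # original register -> speculative register
    hoisted: List[Instr] = []
    remaining: List[Instr] = []

    for i, ins in enumerate(block):
        # Read operands through the current renaming before updating it.
        srcs = tuple(rename.get(s, s) for s in ins.srcs)
        dest = ins.dest
        if dest in live_on_other_path:
            fresh = next(free, None)
            if fresh is None:
                # Out of registers: the rest stays below the branch.
                remaining = block[i:]
                break
            rename[ins.dest] = fresh
            dest = fresh
        hoisted.append(Instr(dest, ins.op, srcs))

    # On the taken path, commit speculative results to their real names.
    copies = [Instr(old, "mov", (new,)) for old, new in rename.items()]
    return hoisted, copies + remaining


# r1 is live on the other path, so its speculative value goes to t0.
block = [
    Instr("r1", "add", ("r2", "r3")),
    Instr("r4", "mul", ("r1", "r5")),
]
hoisted, taken_path = boost_with_renaming(block, {"r1"}, ["t0"])
for ins in hoisted + taken_path:
    print(ins)
```

With one free register, both instructions move above the branch and a single copy remains on the taken path. With no free registers, nothing is hoisted, which reflects the trade-off between boosting and register pressure that the abstract describes.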

Simulation results show that LESS achieves better performance than other existing methods. For example, under an ILP model with an issue rate of 8, speculative execution with LESS achieves a 34% increase in parallelism, compared with 18% for the CRF scheme.

References

M. C. Chang and F. P. Lai. Efficient exploitation of instruction-level parallelism for superscalar processors by the conjugate register file scheme. IEEE Trans. Comput., 45(3):278-293, 1996.
B. R. Rau and J. A. Fisher. Instruction-level parallel processing: history, overview, and perspective. J. Supercomput., 7(1/2):9-50, 1993.
M. S. Schlansker and B. R. Rau. EPIC: Explicitly parallel instruction computing. IEEE Comput., 37-45, 2000.
D. I. August, D. A. Connors, S. A. Mahlke, J. W. Sias, K. M. Crozier, B. C. Cheng, P. R. Eaton, Q. B. Olaniran, and W. W. Hwu. Integrated predicated and speculative execution in the IMPACT EPIC architecture. In Proceedings of the 25th Annual International Symposium on Computer Architecture, pp. 138-149, 1998.
M. S. Lam and R. P. Wilson. Limits of control flow on parallelism. In Proceedings of the 19th Annual International Symposium on Computer Architecture, pp. 46-57, 1992.
S. A. Mahlke et al. Sentinel scheduling for VLIW and superscalar processors. In Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 238-247, 1992.
H. Ando et al. Unconstrained speculative execution with predicated state buffering. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 138-149, 1995.
M. D. Smith, M. S. Lam, and M. Horowitz. Boosting beyond static scheduling in a superscalar processor. In Proceedings of the 17th Annual International Symposium on Computer Architecture, pp. 344-354, 1990.
S. A. Mahlke, D. C. Lin, W. Y. Chen, et al. Effective compiler support for predicated execution using the hyperblock. In Proceedings of the 25th Annual International Symposium on Microarchitecture, pp. 45-54, 1992.
S. A. Mahlke, R. E. Hank, J. E. McCormick, D. I. August, and W. W. Hwu. A comparison of full and partial predicated execution support for ILP processors. In Proceedings of the 22nd Annual International Symposium on Computer Architecture, pp. 138-149, 1995.
J. A. Fisher. Trace scheduling: a technique for global microcode compaction. IEEE Trans. Comput., C-30:478-490, 1981.
W. W. Hwu et al. The superblock: an effective technique for VLIW and superscalar compilation. J. Supercomput., 7(1/2):229-248, 1993.
V. E. Kotov. Automatic Construction of Parallel Programs, Algorithms, Software and Hardware for Parallel Computers. Springer, Berlin, 1984.
R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and F. Zadeck. Efficiently computing static single assignment and the control dependence graph. ACM Trans. Programm. Languages Syst., 13(4):451-490, 1991.
P. P. Pineo. An efficient algorithm for the creation of single assignment forms. In Proceedings of the 29th Annual Hawaii International Conference on System Sciences, pp. 213-222, 1996.
V. H. Allan et al. Software pipelining. ACM Comput. Surveys, 27(3):367-432, 1995.
D. A. Patterson and J. L. Hennessy. Computer Architecture: A Quantitative Approach, 2nd ed. Morgan Kaufmann, San Mateo, CA, 1996.
L. Wang and T. C. Yang. Compiler/hardware co-design for instruction boosting in ILP processors. IEE Proc. Comput. Digit. Technol., 146(6):269-274, 1999.
C. M. Chen, C. T. King, and Y. Y. Chen. Branch merging for scheduling concurrent executions of branch operations. IEE Proc. Comput. Digit. Technol., 143(6):278-293, 1996.
M. Srinivas and A. Nicolau. Analyzing the individual/combined effects of speculative and guarded execution on a superscalar architecture. In Proceedings of the First Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing, pp. 199-208, 1998.
J. Huck, D. Morris, J. Ross, A. Knies, H. Mulder, and R. Zahir. Introducing the IA-64 architecture. IEEE Micro, 20(5):12-23, 2000.

Author information

Authors and Affiliations

Department of Electrical Engineering, Feng Chia University, Taichung 407, Taiwan, ROC
L. Wang
Department of Information Engineering, Feng Chia University, Taichung 407, Taiwan, ROC
Ted C. Yang

About this article

Cite this article

Wang, L., Yang, T.C. On the Boosting of Instruction Scheduling by Renaming. The Journal of Supercomputing 19, 173–197 (2001). https://doi.org/10.1023/A:1011141304485

Issue Date: June 2001
DOI: https://doi.org/10.1023/A:1011141304485
