Virtual Register Renaming

Sharafeddine, Mageda; Akkary, Haitham; Carmean, Doug

doi:10.1007/978-3-642-36424-2_8

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7767)

Included in the following conference series:

International Conference on Architecture of Computing Systems

Abstract

This paper presents a novel high-performance substrate for building energy-efficient out-of-order superscalar cores. The architecture does not require a reorder buffer or physical registers for register renaming and instruction retirement. Instead, it uses a large number of virtual register IDs for register renaming, a physical register file of the same size as the logical register file, and checkpoints to bulk-retire instructions and to recover from exceptions and branch mispredictions. By eliminating physical register renaming and the reorder buffer, the architecture not only removes complex, power-hungry hardware structures but also reduces reorder buffer capacity stalls when execution encounters long delays from data cache misses, thus improving performance. The paper presents a performance and power evaluation of this new architecture using the SPEC 2006 benchmarks. The performance data were collected using an x86 ASIM-based performance simulator from Intel Labs. The data show that the new architecture improves the performance of a 2-wide out-of-order x86 processor core by an average of 4.2%, while saving 43% of the energy consumption of the reorder buffer and retirement register file functional block.
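
The abstract names three ingredients: virtual register IDs used only as rename tags, a register file no larger than the logical one, and checkpoints for recovery. The toy model below is a minimal sketch of how those pieces could fit together in software. It is not the hardware design evaluated in the paper. The class `ToyVirtualRenamer`, its method names, and the choice to snapshot the whole rename map at each checkpoint are all assumptions made for illustration, and bulk retirement is left out.

```python
from itertools import count


class ToyVirtualRenamer:
    """Illustrative rename stage (hypothetical, not the paper's design).

    Each logical register maps to the virtual ID of its latest in-flight
    writer, or to None when the committed register-file value is current.
    Virtual IDs are plain tags; no storage is allocated per ID.
    """

    def __init__(self, num_logical=4):
        self.rename_map = [None] * num_logical
        self.checkpoints = []      # saved copies of rename_map
        self._next_vid = count()   # unbounded here; real hardware would wrap

    def rename(self, dest, srcs):
        """Rename one instruction; return (dest_vid, src_tags)."""
        # Sources read the current mapping before the destination is updated.
        src_tags = [self.rename_map[s] for s in srcs]
        vid = next(self._next_vid)
        self.rename_map[dest] = vid
        return vid, src_tags

    def take_checkpoint(self):
        """Snapshot the rename map, e.g. at a branch; return its index."""
        self.checkpoints.append(list(self.rename_map))
        return len(self.checkpoints) - 1

    def recover(self, checkpoint_id):
        """Restore the map saved at checkpoint_id and drop younger ones."""
        self.rename_map = list(self.checkpoints[checkpoint_id])
        del self.checkpoints[checkpoint_id + 1:]


if __name__ == "__main__":
    r = ToyVirtualRenamer()
    print(r.rename(dest=1, srcs=[0]))   # first writer of logical register 1
    cp = r.take_checkpoint()            # e.g. before a predicted branch
    print(r.rename(dest=2, srcs=[1]))   # speculative instruction
    r.recover(cp)                       # misprediction: roll the map back
    print(r.rename_map)
```

In this sketch, recovery costs one map copy regardless of how many instructions were renamed after the checkpoint, which is the general appeal of checkpoint-based recovery over walking a reorder buffer entry by entry.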

Author information

Authors and Affiliations

Electrical and Computer Engineering Department, American University of Beirut, Lebanon
Mageda Sharafeddine & Haitham Akkary
Intel Corporation, Hillsboro, Oregon, USA
Doug Carmean

Editor information

Editors and Affiliations

FIT, Czech Technical University, Thákurova 9, 160 00, Prague 6, Czech Republic
Hana Kubátová
Elektrotechnik und Informationstechnik, TU Darmstadt, Merckstraße 25, 64283, Darmstadt, Germany
Christian Hochberger
Department of Signal Processing, Institute of Information Theory and Automation, Pod Vodárenskou věží 4, 18208, Prague 8, Czech Republic
Martin Daněk
Intelligent Embedded Systems, University of Kassel, Wilhelmshöher Allee 73, 34121, Kassel, Germany
Bernhard Sick

About this paper

Cite this paper

Sharafeddine, M., Akkary, H., Carmean, D. (2013). Virtual Register Renaming. In: Kubátová, H., Hochberger, C., Daněk, M., Sick, B. (eds) Architecture of Computing Systems – ARCS 2013. ARCS 2013. Lecture Notes in Computer Science, vol 7767. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36424-2_8

DOI: https://doi.org/10.1007/978-3-642-36424-2_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36423-5
Online ISBN: 978-3-642-36424-2
eBook Packages: Computer Science, Computer Science (R0)