ABSTRACT
A large logical register file is important for enabling effective compiler transformations and for providing a windowed register space that supports fast function calls. Unfortunately, a large logical register file can be slow, particularly in a wide-issue processor, which requires an even larger physical register file and many read and write ports. Previous work has suggested that a register cache can address this problem. This paper proposes a new register caching mechanism that combines a number of good features from previous approaches with existing out-of-order processor hardware to implement a register cache for a large logical register file. It does so by separating the logical register file from the physical register file and using a modified form of register renaming to make the cache easy to implement. In this configuration the physical register file contains fewer entries than the logical register file and acts as a cache for it, with the logical register file serving as the backing store. The tag information for the cache is kept in the register alias table and the physical register file. The caching mechanism is found to improve IPC by up to 20% over an uncached large logical register file, and its performance is close to that of a logical register file that is both large and fast.
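The organization described in the abstract can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the paper's hardware design: the class name `RegisterCache`, the LRU replacement policy, the unconditional write-back on eviction, and the sequential (non-speculative, in-order) access pattern are all hypothetical choices made for illustration. It shows only the central idea: a small physical register file caches a larger logical register file, and the tag information lives in the alias table and alongside the physical entries.

```python
class RegisterCache:
    """Toy model: a small physical register file caching a larger logical one.

    Assumptions (not from the paper): LRU replacement, write-back on every
    eviction, and purely sequential accesses with no speculation.
    """

    def __init__(self, num_logical, num_physical):
        self.backing = [0] * num_logical      # logical register file (backing store)
        self.phys = [0] * num_physical        # physical register file (the cache)
        self.owner = [None] * num_physical    # tag stored with each physical entry
        self.rat = {}                         # alias table: logical index -> physical index
        self.free = list(range(num_physical)) # unallocated physical entries
        self.lru = []                         # allocated entries, least recently used first
        self.hits = 0
        self.misses = 0

    def _touch(self, p):
        # Mark physical entry p as most recently used.
        if p in self.lru:
            self.lru.remove(p)
        self.lru.append(p)

    def _allocate(self, logical):
        # Take a free physical entry, or evict the least recently used one.
        if self.free:
            p = self.free.pop()
        else:
            p = self.lru.pop(0)
            victim = self.owner[p]
            self.backing[victim] = self.phys[p]  # write the victim back
            del self.rat[victim]
        self.owner[p] = logical
        self.rat[logical] = p
        return p

    def read(self, logical):
        p = self.rat.get(logical)
        if p is None:
            # Miss: fill the entry from the backing store.
            self.misses += 1
            p = self._allocate(logical)
            self.phys[p] = self.backing[logical]
        else:
            self.hits += 1
        self._touch(p)
        return self.phys[p]

    def write(self, logical, value):
        p = self.rat.get(logical)
        if p is None:
            # A write overwrites the whole register, so no fill is needed.
            p = self._allocate(logical)
        self.phys[p] = value
        self._touch(p)
```

With eight logical and two physical registers, for example, writing logical registers 0, 1, and 2 in turn evicts register 0 to the backing store, and a later read of register 0 misses and refills it. A real out-of-order design would also have to handle in-flight instructions and speculation, which this sketch deliberately ignores.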
Index Terms
- Integrating superscalar processor components to implement register caching