CROB: Implementing a Large Instruction Window through Compression

Latorre, Fernando; Magklis, Grigorios; González, Jose; Chaparro, Pedro; González, Antonio

doi:10.1007/978-3-642-19448-1_7

Part of the book series: Lecture Notes in Computer Science (THIPEAC, volume 6590)

Abstract

Current processors require a large number of in-flight instructions to expose more parallelism and to hide the widening gap between memory latency and processor cycle time. These in-flight instructions are typically stored in a centralized structure called the reorder buffer (ROB), which is central to handling precise exceptions and recovering a safe state after a branch misprediction. However, this structure is becoming so large that it is difficult to fit within the power budget of future processor designs. In this paper we propose a novel ROB microarchitecture named CROB (Compressed ROB) that compresses ROB entries and therefore gives the illusion of a virtual ROB larger than the number of physical ROB entries. The performance study of CROB shows a substantial benefit, with an average speedup of 20% and 12% for a 128-entry and a 256-entry ROB, respectively. For some benchmark categories, such as SpecFP2000, the speedup rises to 30%.
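To make the role of the ROB concrete, the following is a minimal sketch of a conventional reorder buffer, not of the CROB design, whose compression scheme the abstract does not describe. It assumes a simple queue model in which instructions allocate entries in program order, may complete out of order, retire only from the head, and are squashed from the tail after a misprediction. All class and method names (`ReorderBuffer`, `allocate`, `complete`, `retire`, `squash_after`) are hypothetical and chosen only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Entry:
    seq: int            # program-order sequence number (illustrative)
    done: bool = False  # set when execution finishes, possibly out of order


class ReorderBuffer:
    """Toy ROB: in-order allocate, out-of-order complete, in-order retire."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = deque()
        self.next_seq = 0

    def allocate(self):
        """Reserve an entry at the tail; return None when the ROB is full."""
        if len(self.entries) == self.capacity:
            return None  # the front end would stall here
        entry = Entry(self.next_seq)
        self.next_seq += 1
        self.entries.append(entry)
        return entry.seq

    def complete(self, seq):
        """Mark an in-flight instruction as executed."""
        for entry in self.entries:
            if entry.seq == seq:
                entry.done = True
                return
        raise KeyError(seq)

    def retire(self):
        """Commit finished instructions from the head only (program order)."""
        retired = []
        while self.entries and self.entries[0].done:
            retired.append(self.entries.popleft().seq)
        return retired

    def squash_after(self, seq):
        """Discard every entry younger than seq, e.g. after a mispredicted branch."""
        while self.entries and self.entries[-1].seq > seq:
            self.entries.pop()
        self.next_seq = seq + 1


rob = ReorderBuffer(capacity=4)
a, b, c = rob.allocate(), rob.allocate(), rob.allocate()
rob.complete(c)      # the youngest instruction finishes first
print(rob.retire())  # [] because the head (a) has not completed
rob.complete(a)
print(rob.retire())  # [0]; b still blocks c from retiring
```

In this model, `allocate` returning `None` corresponds to a full instruction window. That stall is the pressure that motivates a larger ROB or, as the abstract proposes, one that appears larger than its physical entry count.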

Author information

Authors and Affiliations

Intel Barcelona Research Center, Intel Labs, UPC, Spain
Fernando Latorre, Grigorios Magklis, Jose González, Pedro Chaparro & Antonio González

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Chalmers University of Technology, 412 96, Gothenburg, Sweden
Per Stenström

About this chapter

Cite this chapter

Latorre, F., Magklis, G., González, J., Chaparro, P., González, A. (2011). CROB: Implementing a Large Instruction Window through Compression. In: Stenström, P. (eds) Transactions on High-Performance Embedded Architectures and Compilers III. Lecture Notes in Computer Science, vol 6590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19448-1_7

DOI: https://doi.org/10.1007/978-3-642-19448-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19447-4
Online ISBN: 978-3-642-19448-1
eBook Packages: Computer Science, Computer Science (R0)
