Abstract
Continuing advances in semiconductor technology and demand for higher performance will lead to more powerful, superpipelined, and wider-issue processors. Because such processors fetch very wide groups of instructions on each cycle, their instruction caches will consume a significant fraction of the on-chip energy. This paper proposes a new energy-effective design of the fetch unit that exploits the fact that, because of taken branches, not all instructions in a given I-cache fetch line are used. A Fetch Mask Determination unit is proposed to detect which instructions in an I-cache access will actually be used, so that none of the other instructions are fetched. The solution is evaluated for 4-, 8-, and 16-wide issue processors in 100 nm technology. Results show an average improvement in the I-cache Energy-Delay product of 20% for the 8-wide issue processor and 33% for the 16-wide issue processor on the SPEC2000 benchmarks, with no negative impact on performance.
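The idea of a fetch mask can be illustrated with a small sketch. The abstract does not describe how the Fetch Mask Determination unit computes its result, so the following is only an illustrative model under stated assumptions: the function name `fetch_mask` and its parameters are hypothetical, the fetch is assumed to start at a known slot within the line, and the position of the first predicted-taken branch (if any) is assumed to be known before the cache access.

```python
from typing import List, Optional


def fetch_mask(line_width: int,
               start_offset: int,
               taken_branch_offset: Optional[int] = None) -> List[bool]:
    """Return one flag per instruction slot in a fetch line.

    Hypothetical model, not the paper's hardware design:
    - slots before start_offset are skipped (the fetch address points
      into the middle of the line, e.g. after a taken branch);
    - slots after the first predicted-taken branch are skipped;
    - if no taken branch is predicted, the fetch runs to the end of the line.
    """
    if not 0 <= start_offset < line_width:
        raise ValueError("start_offset must lie inside the fetch line")
    last = line_width - 1 if taken_branch_offset is None else taken_branch_offset
    if not start_offset <= last < line_width:
        raise ValueError("taken_branch_offset must lie at or after start_offset")
    return [start_offset <= slot <= last for slot in range(line_width)]


# An 8-wide line entered at slot 2 with a predicted-taken branch in slot 5:
mask = fetch_mask(8, 2, 5)
used = sum(mask)  # number of slots that would actually be read
print(mask, used)
```

In this model only four of the eight slots would be read, which is the kind of saving the abstract attributes to masking unused instructions. The actual energy benefit depends on the cache organization and is not captured by this sketch.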
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aragón, J.L., Veidenbaum, A.V. (2005). Energy-Effective Instruction Fetch Unit for Wide Issue Processors. In: Srikanthan, T., Xue, J., Chang, CH. (eds) Advances in Computer Systems Architecture. ACSAC 2005. Lecture Notes in Computer Science, vol 3740. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11572961_3
DOI: https://doi.org/10.1007/11572961_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29643-0
Online ISBN: 978-3-540-32108-8
eBook Packages: Computer Science, Computer Science (R0)