Abstract
Continuing advances in semiconductor technology and demand for higher performance will lead to more powerful, superpipelined and wider issue processors. Instruction caches in such processors will consume a significant fraction of the on-chip energy because of the very wide fetch performed on each cycle. This paper proposes a new energy-effective design of the fetch unit that exploits the fact that, because of taken branches, not all instructions in a given I-cache fetch line are used. A Fetch Mask Determination unit is proposed to detect which instructions in an I-cache access will actually be used, so that the remaining instructions are not fetched. The solution is evaluated for 4-, 8- and 16-wide issue processors in 100nm technology. Results show an average improvement in the I-cache Energy-Delay product of 20% for the 8-wide issue processor and 33% for the 16-wide issue processor on the SPEC2000 benchmarks, with no negative impact on performance.
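The idea of masking unused slots in a fetch line can be illustrated with a small software sketch. The abstract does not describe how the Fetch Mask Determination unit works internally, so the code below is only an assumption-laden illustration: it presumes that the entry slot of the fetch line and the slot of the first predicted-taken branch (if any) are already known, and the names `fetch_mask` and `slots_saved` are hypothetical, not taken from the paper.

```python
from typing import List, Optional


def fetch_mask(line_width: int, start_slot: int,
               taken_branch_slot: Optional[int] = None) -> List[bool]:
    """Return one boolean per instruction slot in a fetch line.

    True marks a slot that must be read; False marks a slot that can be
    skipped. Hypothetical model: slots before the entry point and slots
    after the first predicted-taken branch are assumed to be unused.
    """
    if not 0 <= start_slot < line_width:
        raise ValueError("start_slot must lie inside the fetch line")
    # With no taken branch in the line, fetch runs to the end of the line.
    last_slot = line_width - 1 if taken_branch_slot is None else taken_branch_slot
    if not start_slot <= last_slot < line_width:
        raise ValueError("taken_branch_slot must lie at or after start_slot")
    return [start_slot <= slot <= last_slot for slot in range(line_width)]


def slots_saved(mask: List[bool]) -> int:
    """Count the slots that would not need to be read under this mask."""
    return mask.count(False)


# 8-wide line entered at slot 2, with a predicted-taken branch at slot 5.
mask = fetch_mask(8, start_slot=2, taken_branch_slot=5)
print(mask)
print(slots_saved(mask))
```

In this example only slots 2 through 5 are marked for reading, so four of the eight slots are skipped. How much energy such a mask saves in a real I-cache depends on the array organization, which this sketch does not model.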
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Aragón, J.L., Veidenbaum, A.V. (2005). Energy-Effective Instruction Fetch Unit for Wide Issue Processors. In: Srikanthan, T., Xue, J., Chang, CH. (eds) Advances in Computer Systems Architecture. ACSAC 2005. Lecture Notes in Computer Science, vol 3740. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11572961_3
DOI: https://doi.org/10.1007/11572961_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29643-0
Online ISBN: 978-3-540-32108-8
eBook Packages: Computer Science, Computer Science (R0)