Abstract
Modern high-performance out-of-order processors use L1 caches with an increasing degree of associativity to improve performance. Higher associativity is not always feasible, however, for two reasons: it increases both cache hit latency and energy consumption. One of the main causes of the increased latency is the delay of the multiplexor that selects one of the lines in a set. The multiplexor is controlled by a hit signal, so tag comparison must complete before the multiplexor can be enabled. This paper proposes a new mechanism, called the Way Cache, for setting the multiplexor ahead of time in order to reduce hit latency. The same mechanism allows each cache access to read only one tag store and the one corresponding data store, which reduces energy consumption. Unlike way prediction, the Way Cache always contains correct way information, but it can miss. The performance of the Way Cache is evaluated and compared with way prediction for data and instruction caches. It is also evaluated in the presence of a Cached Load/Store Queue, an integrated L0 cache and Load/Store Queue that significantly reduces the number of accesses to the L1 cache.
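The mechanism summarized in the abstract can be illustrated with a small behavioral sketch. This is not the authors' design: the abstract does not describe how the Way Cache is organized or kept correct, so the sketch assumes a small LRU table indexed by line address whose entries are invalidated whenever the L1 evicts the corresponding line. All class names, function names, and sizes here (`SetAssocCache`, `WayCache`, `access`, the entry counts) are hypothetical, and the number of ways probed is used only as a rough proxy for energy.

```python
from collections import OrderedDict


class SetAssocCache:
    """Toy set-associative L1 with LRU replacement (tags only, no data)."""

    def __init__(self, num_sets=4, num_ways=4, line_bytes=32):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.line_bytes = line_bytes
        # tags[s][w] holds the line address stored in way w of set s.
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        # lru[s] lists the ways of set s, least recently used first.
        self.lru = [list(range(num_ways)) for _ in range(num_sets)]

    def line_addr(self, addr):
        return addr // self.line_bytes

    def set_index(self, addr):
        return self.line_addr(addr) % self.num_sets

    def touch(self, s, way):
        self.lru[s].remove(way)
        self.lru[s].append(way)

    def lookup_all_ways(self, addr):
        """Conventional access: compare the tag in every way of the set."""
        s, line = self.set_index(addr), self.line_addr(addr)
        for way in range(self.num_ways):
            if self.tags[s][way] == line:
                self.touch(s, way)
                return way
        return None

    def fill(self, addr):
        """Install a line in the LRU way; return (way, evicted_line)."""
        s, line = self.set_index(addr), self.line_addr(addr)
        way = self.lru[s][0]
        evicted = self.tags[s][way]
        self.tags[s][way] = line
        self.touch(s, way)
        return way, evicted


class WayCache:
    """Assumed organization: small LRU table, line address -> L1 way."""

    def __init__(self, entries=8):
        self.entries = entries
        self.table = OrderedDict()

    def lookup(self, line):
        if line in self.table:
            self.table.move_to_end(line)
            return self.table[line]
        return None

    def insert(self, line, way):
        self.table[line] = way
        self.table.move_to_end(line)
        if len(self.table) > self.entries:
            self.table.popitem(last=False)  # drop the oldest entry

    def invalidate(self, line):
        self.table.pop(line, None)


def access(l1, wc, addr, stats):
    """Simulate one L1 access; stats counts ways probed and Way Cache hits."""
    line, s = l1.line_addr(addr), l1.set_index(addr)
    way = wc.lookup(line)
    if way is not None:
        # Way Cache hit: the stored way is correct by construction, so only
        # one tag store and one data store would be read.
        assert l1.tags[s][way] == line
        l1.touch(s, way)
        stats["ways_probed"] += 1
        stats["wc_hits"] += 1
        return "l1_hit_single_way"
    # Way Cache miss: fall back to a conventional access of all ways.
    stats["ways_probed"] += l1.num_ways
    way = l1.lookup_all_ways(addr)
    result = "l1_hit_all_ways"
    if way is None:
        way, evicted = l1.fill(addr)
        if evicted is not None:
            wc.invalidate(evicted)  # keeps Way Cache entries always correct
        result = "l1_miss"
    wc.insert(line, way)
    return result
```

In this sketch a Way Cache hit never returns a wrong way, which is the property that separates it from way prediction: a miss costs a conventional all-ways access, but there is no misprediction to recover from. Calling `access` repeatedly on the same address with `stats = {"ways_probed": 0, "wc_hits": 0}` shows the first access probing every way and later ones probing a single way.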
This work was supported in part by the National Science Foundation under grant NSF CCR-0311738.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nicolaescu, D., Veidenbaum, A., Nicolau, A. (2008). Using a Way Cache to Improve Performance of Set-Associative Caches. In: Labarta, J., Joe, K., Sato, T. (eds) High-Performance Computing. ISHPC ALPS 2005 2006. Lecture Notes in Computer Science, vol 4759. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77704-5_8
DOI: https://doi.org/10.1007/978-3-540-77704-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77703-8
Online ISBN: 978-3-540-77704-5
eBook Packages: Computer Science, Computer Science (R0)