Abstract
Instruction reuse and memoization exploit the fact that during a program run there are operations that execute more than once with the same operand values. By saving previous occurrences of instructions (operands and result) in dedicated, on-chip lookup tables, it is possible to avoid re-execution of these instructions. This has been shown to be efficient in a naive model that assumes single-cycle table lookup. We now extend the analysis to consider the energy, area, and timing overheads of maintaining such tables.
We show that reuse opportunities abound in the SPEC CPU2000 benchmark suite, and that by judiciously selecting table configurations it is possible to exploit these opportunities with a minimal penalty. Energy consumption can be further reduced by employing confidence counters, which enable instructions that have a history of failed memoizations to be filtered out. We conclude by identifying those instructions that profit most from memoization, and the conditions under which it is beneficial.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
O’Connell, F., White, S.: POWER3: the next generation of PowerPC processors. IBM Journal of Research and Development 44, 873–884 (2000)
Vetter, S., et al.: The POWER4 Processor Introduction and Tuning Guide. IBM (2001)
Intel Corporation: Differences in Optimizing for the Pentium 4 Processor vs. the Pentium III Processor
Intel Corporation: IA-32 Intel® Architecture Optimization Reference Manual (2003)
Sun Microsystems: UltraSPARC III User Manual 2.2 edn. (2003)
Sodani, A., Sohi, G.: Dynamic Instruction Reuse. In: Proceedings of the 24th International Symposium on Computer Architecture (1997)
Citron, D., Feitelson, D., Rudolph, L.: Accelerating Multi-Media Processing by Implementing Memoing in Multiplication and Division Units. In: Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operationg Systems, pp. 252–261 (1998)
Richardson, S.: Exploiting Trivial and Redundant Computation. In: Proceedings of the 11th Symposium on Computer Arithmetic, pp. 220–227 (1993)
Molina, C., González, A., Tubella, J.: Dynamic Removal of Redundant Computations. In: Proceedings of the 1999 International Conference on Supercomputing, pp. 474–481 (1999)
Azam, M., Franzon, P., Liu, W.: Low Power Data Processing by Elimination of Redundant Computations. In: Proceedings of the 7th International Symposium on Low Power Electronics and Design, pp. 259–264 (1997)
Citron, D., Feitelson, D.: Revisiting Instruction Level Reuse. In: Proceedings of the 1st Workshop on Duplicating, Deconstructing, and Debunking, pp. 62–70 (2002)
Tendler, J.M., Dodson, J.S., Fields, J., Le, H., Sinharoy, B.: POWER4 system microarchitecture. IBM Journal of Research and Development 46, 5–26 (2002)
Moudgill, M., Wellman, J., Moreno, J.: Environment for PowerPC Microarchitecture Exploration. IEEE Micro 19, 15–25 (1999)
KleinOsowski, A., Lilja, D.J.: MinneSPEC: A New SPEC Benchmark Workload for Simulation-Based Computer Architecture Research. Computer Architecture Letters 1 (2002)
Yi, J., Lilja, D.: An Analysis of the Amount of Global Level Redundant Computation in the SPEC 95 and SPEC 2000 Benchmarks. In: Proceedings of the 4th Annual Workshop on Workload Characterization (2001)
Jain, R.: The Art of Computer Systems Performance Analysis. Wiley Professional Computing (1992)
Shivakumar, P., Jouppi, N.: CACTI 3.0: An Integrated Cache Timing, Power, and Area Model. Technical report, Compaq: Western Research Laboratory (2001)
Yi, J., Lilja, D.: Improving Processor Performance by Simplifying and Bypassing Trivial Computations. In: Proceedings of the 20th International Conference on Computer Design (2002)
Jacobsen, E., Rotenberg, E., Smith, J.: Assigning Confidence to Conditional Branch Predictions. In: Proceedings of the 29th International Symposium on Microarchitecture, pp. 142–152 (1996)
Burtscher, M., Zorn, B.G.: Prediction Outcome History-based Confidence Estimation for Load Value Prediction. Journal of Instruction-Level Parallelism 1 (1999)
Brooks, D., Bose, P., Srinivasan, V., Gschwind, M.K., Emma, P.G., Rosenfield, M.G.: New methodology for early-stage microarchitecture-level power-performance analysis of microprocessors. IBM Journal of Research and Development 47, 653–670 (2003)
Connors, D., mei Hwu, W.: Compiler-Directed Dynamic Computation Reuse: Rationale and Initial Results. In: Proceedings of the 32nd International Symposium on Microarchitecture, pp. 158–169 (1999)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Citron, D., Feitelson, D.G. (2005). “Look It Up” or “Do the Math”: An Energy, Area, and Timing Analysis of Instruction Reuse and Memoization. In: Falsafi, B., VijayKumar, T.N. (eds) Power-Aware Computer Systems. PACS 2003. Lecture Notes in Computer Science, vol 3164. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-28641-7_8
Download citation
DOI: https://doi.org/10.1007/978-3-540-28641-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24031-0
Online ISBN: 978-3-540-28641-7
eBook Packages: Computer ScienceComputer Science (R0)