Abstract
Merging processors and memory into a single chip has the well-known benefits of allowing high-bandwidth and low-latency communication between processor and memory, and reducing energy consumption. As a result, many different systems based on what has been called Processor In Memory (PIM) architectures have been proposed [1, 3, 7, 8, 10, 12-16, 18].
Recent advances in technology [4, 5] appear to make it possible to integrate logic that cycles nearly as fast as in a logic-only chip. As a result, processors are likely to put considerable pressure on the relatively slow on-chip DRAM. To handle the speed mismatch between processors and DRAM, these chips are likely to include non-trivial memory hierarchies in each DRAM bank.
With many high-frequency processors on chip, all of them potentially accessing the memory system concurrently, these chips will consume considerable energy. In addition, these chips are likely to be used in non-traditional places like the memory of a server [3, 7,
In this abstract, we examine, from a performance and energy-efficiency point of view, the design of the memory hierarchy in a multi-banked PIM chip with many simple, fast processors. Our results suggest the use of per-processor memory hierarchies that include modest-sized caches, simple DRAM bank organizations that support segmentation, and no prefetching.
References
[1] A. Brown et al. ISTORE: Introspective Storage for Data-Intensive Network Services. Workshop on Hot Topics in Operating Systems, March 1999.
[2] R. Gonzalez and M. Horowitz. Energy Dissipation In General Purpose Microprocessors. IEEE Journal of Solid-State Circuits, 31(4):1277–1284, September 1996.
[3] M. Hall et al. Mapping Irregular Applications to DIVA, a PIM-Based Data-Intensive Architecture. In Supercomputing, November 1999.
[4] IBM Microelectronics. Blue Logic SA-27E ASIC. http://www.chips.ibm.com/news/1999/sa27e, February 1999.
[5] S. Iyer and H. Kalter. Embedded DRAM Technology: Opportunities and Challenges. IEEE Spectrum, April 1999.
[6] M. Kamble and K. Ghose. Analytical Energy Dissipation Models for Low Power Caches. In International Symposium on Low Power Electronics and Design, pages 143–148, 1997.
[7] Y. Kang, W. Huang, S. Yoo, D. Keen, Z. Ge, V. Lam, P. Pattnaik, and J. Torrellas. FlexRAM: Toward an Advanced Intelligent Memory System. In International Conference on Computer Design, pages 192–201, October 1999.
[8] P. Kogge, S. Bass, J. Brockman, D. Chen, and E. Sha. Pursuing a Petaflop: Point Designs for 100 TF Computers Using PIM Technologies. In Frontiers of Massively Parallel Computation Symposium, 1996.
[9] V. Krishnan and J. Torrellas. An Execution-Driven Framework for Fast and Accurate Simulation of Superscalar Processors. In International Conference on Parallel Architectures and Compilation Techniques, pages 286–293, October 1998.
[10] K. Mai et al. Smart Memories: A Modular Reconfigurable Architecture. In International Symposium on Computer Architecture, June 2000.
[11] J. Montanaro et al. A 160-MHz, 32-b, 0.5-W CMOS RISC Microprocessor. IEEE Journal of Solid-State Circuits, 31(11):1703–1714, November 1996.
[12] M. Oskin, F. Chong, and T. Sherwood. Active Pages: A Computation Model for Intelligent Memory. In International Symposium on Computer Architecture, pages 192–203, June 1998.
[13] M. Oskin et al. Exploiting ILP in Page-Based Intelligent Memory. In International Symposium on Microarchitecture, 1999.
[14] D. Patterson et al. A Case for Intelligent DRAM. IEEE Micro, pages 33–44, 1997.
[15] D. Patterson and M. Smith. Workshop on Mixing Logic and DRAM: Chips that Compute and Remember. 1997.
[16] S. Rixner et al. A Bandwidth-Efficient Architecture for Media Processing. In International Symposium on Microarchitecture, November 1998.
[17] C-L. Su and A. Despain. Cache Design Trade-offs for Power and Performance Optimization: A Case Study. In International Symposium on Low Power Electronics and Design, pages 63–68, April 1995.
[18] E. Waingold et al. Baring It All to Software: Raw Machines. IEEE Computer, pages 86–93, September 1997.
[19] S. Wilton and N. Jouppi. CACTI: An Enhanced Cache Access and Cycle Time Model. IEEE Journal of Solid-State Circuits, 31(5):677–688, May 1996.
[20] N. Yeung et al. The Design of a 55SPECint92 RISC Processor under 2W. ISSCC Digest of Technical Papers, pages 206–207, February 1994.
[21] S-M. Yoo, J. Renau, M. Huang, and J. Torrellas. FlexRAM Architecture Design Parameters. Technical Report CSRD-1584, Department of Computer Science, University of Illinois at Urbana-Champaign, October 2000. http://iacoma.cs.uiuc.edu/flexram/publications.html.
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huang, M., Renau, J., Yoo, SM., Torrellas, J. (2001). Energy/Performance Design of Memory Hierarchies for Processor-in-Memory Chips⋆. In: Chong, F.T., Kozyrakis, C., Oskin, M. (eds) Intelligent Memory Systems. IMS 2000. Lecture Notes in Computer Science, vol 2107. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44570-6_11
DOI: https://doi.org/10.1007/3-540-44570-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42328-7
Online ISBN: 978-3-540-44570-8
eBook Packages: Springer Book Archive