DOI: 10.1145/2429384.2429446
research-article

CACTI-IO: CACTI with off-chip power-area-timing models

Published: 05 November 2012

Abstract

We describe CACTI-IO, an extension to CACTI [4] that includes power, area and timing models for the IO and PHY of the off-chip memory interface for various server and mobile configurations. CACTI-IO enables design space exploration of the off-chip IO along with the DRAM and cache parameters. We describe the models added and three case studies that use CACTI-IO to study the tradeoffs between memory capacity, bandwidth and power.
The case studies show that CACTI-IO helps (i) provide IO power numbers that can be fed into a system simulator for accurate power calculations, (ii) optimize off-chip configurations including the bus width, number of ranks, memory data width and off-chip bus frequency, especially for novel buffer-based topologies, and (iii) enable architects to quickly explore new interconnect technologies, including 3-D interconnect. We find that buffers on board and 3-D technologies offer an attractive design space involving power, bandwidth and capacity when appropriate interconnect parameters are deployed.

References

[1]
W. Dally and J. Poulton, Digital Systems Engineering, Cambridge University Press, 1998.
[2]
H. Bakoglu, Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley, 1990.
[3]
D. Oh and C. Yuan, High-Speed Signaling: Jitter Modeling, Analysis, and Budgeting, Prentice Hall, 2011.
[4]
CACTI. http://www.hpl.hp.com/research/cacti/
[5]
N. P. Jouppi, A. B. Kahng, N. Muralimanohar and V. Srinivas, "CACTI-IO Technical Report," technical report CS2012-0986, UC San Diego CSE Department, August 2012.
[6]
N. Chang, K. Kim and J. Cho, "Bus Encoding for Low-Power High-Performance Memory Systems," Proc. IEEE DAC, 2000, pp. 800--805.
[7]
A. B. Kahng and V. Srinivas, "Mobile System Considerations for SDRAM Interface Trends," Proc. ACM/IEEE SLIP Workshop, 2011, pp. 1--8.
[8]
J. Baloria, "Micron Reinvents DRAM Memory: Hybrid Memory Cube," Proc. IDF Workshop, Sept. 2011.
[9]
Intel's Scalable Memory Buffer. http://tinyurl.com/7xbt27o
[10]
D. H. Yoon, J. Chang, N. Muralimanohar and P. Ranganathan, "BOOM: Enabling Mobile Memory Based Low-Power Server DIMMs," Proc. IEEE ISCA, 2012, pp. 25--36.
[11]
McSim. http://cal.snu.ac.kr/mediawiki/index.php/McSim
[12]
C.-K. Luk et al., "PIN: Building Customized Program Analysis Tools with Dynamic Instrumentation," Proc. ACM PLDI, 2005, pp. 190--200.
[13]
S. C. Woo, M. Ohara, E. Torrie, J. P. Singh and A. Gupta, "The SPLASH-2 Programs: Characterization and Methodological Considerations," Proc. IEEE ISCA, 1995, pp. 24--36.
[14]
H. Zheng and Z. Zhu, "Power and Performance Trade-Offs in Contemporary DRAM System Designs for Multicore Processors," IEEE Trans. on Computers 59(8) (2010), pp. 1033--1046.
[15]
H. Lee et al., "A 16 Gb/s/Link, 64 GB/s Bidirectional Asymmetric Memory Interface," IEEE JSSC 44(4) (2009), pp. 1235--1247.
[16]
J. Poulton, R. Palmer, A. M. Fuller, T. Greer, J. Eyles, W. J. Dally and M. Horowitz, "A 14-mW 6.25-Gb/s Transceiver in 90-nm CMOS," IEEE JSSC 42(12) (2007), pp. 2745--2757.
[17]
F. O'Mahony et al., "A 47x10Gb/s 1.4mW/(Gb/s) Parallel Interface in 45nm CMOS," Proc. IEEE ISSCC, 2010, pp. 156--158.
[18]
S. Li, J. H. Ahn, R. D. Strong, J. B. Brockman, D. M. Tullsen and N. P. Jouppi, "McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures," Proc. IEEE/ACM MICRO, 2009, pp. 469--480.
[19]
S. Thoziyoor, J. Ahn, M. Monchiero, J. B. Brockman and N. P. Jouppi, "A Comprehensive Memory Modeling Tool and its Application to the Design and Analysis of Future Memory Hierarchies," Proc. IEEE ISCA, 2008, pp. 51--62.
[20]
Micron DRAM System Power Calculators. http://www.micron.com/support/dram/power_calc.html
[21]
JEDEC DDR3 Specification JESD79-3B.
[22]
JEDEC LPDDR2 Specification JESD209-2C.
[23]
JEDEC. http://www.jedec.org
[24]
G. Taguchi, Introduction to Quality Engineering, 2nd ed., McGraw-Hill, 1996.
[25]
R. Palmer, J. Poulton, A. Fuller, J. Chen and J. Zerbe, "Design Considerations for Low-Power High-Performance Mobile Logic and Memory Interfaces," Proc. IEEE ASSCC, 2008, pp. 205--208.
[26]
J. Ellis, "Overcoming Obstacles for Closing Timing for DDR3-1600 and Beyond," Denali MemCon, 2010.
[27]
A. Vaidyanath, "Challenges and Solutions for GHz DDR3 Memory Interface Design," Denali MemCon, 2010.
[28]
D. Wang, B. Ganesh, N. Tuaycharoen, K. Baynes, A. Jaleel and B. Jacob, "DRAMsim: A Memory System Simulator," ACM SIGARCH Computer Architecture News - special issue 33(4) (2005), pp. 100--107.
[29]
HP Memory Technology Evolution: An Overview of System Memory Technologies. http://tinyurl.com/7mvktcn
[30]
http://www.micron.com/products/dram_modules/lrdimm.html
[31]
"Challenges and Solutions for Future Main Memory," Rambus White Paper, May 2009. http://tinyurl.com/cetetsz
[32]
B. Schroeder, E. Pinheiro and W. Weber, "DRAM Errors in the Wild: A Large-Scale Field Study," Proc. ACM SIGMETRICS, 2009, pp. 193--204.
[33]
J.-S. Kim et al., "A 1.2V 12.8GB/s 2Gb Mobile Wide-I/O DRAM with 4128 I/Os Using TSV-Based Stacking," Proc. IEEE ISSCC, 2011, pp. 496--498.
[34]
S. Sarkar, A. Brahme and S. Chandar, "Design Margin Methodology for DDR Interface," Proc. IEEE EPEPS, 2007, pp. 167--170.
[35]
S. Chaudhuri, J. McCall and J. Salmon, "Proposal for BER Based Specifications for DDR4," Proc. IEEE EPEPS, 2010, pp. 121--124.
[36]
M. Qureshi, V. Srinivasan and J. Rivers, "Scalable High-Performance Main Memory System Using Phase-Change Memory Technology," Proc. IEEE ISCA, 2009, pp. 24--33.
[37]
HP Power Advisor. http://h18000.www1.hp.com/products/solutions/power/index.html
[38]
International Technology Roadmap for Semiconductors, 2011 edition. http://www.itrs.net/
[39]
B. K. Casper, M. Haycock and R. Mooney, "An Accurate and Efficient Analysis Method for Multi-Gb/s Chip-to-Chip Signaling Schemes," Proc. IEEE VLSIC, 2002, pp. 54--57.
[40]
Future-Mobile JEDEC Draft Wide IO Specification.
[41]
M. A. Horowitz, C.-K. K. Yang and S. Sidiropoulos, "High-Speed Electrical Signaling: Overview and Limitations," IEEE Trans. on Advanced Packaging 31(4) (2008), pp. 722--730.
[42]
D. Oh, F. Lambrecht, J. H. Ren, S. Chang, B. Chia, C. Madden and C. Yuan, "Prediction of System Performance Based on Component Jitter and Noise Budgets," Proc. IEEE EPEPS, 2007, pp. 33--36.

Published In

ICCAD '12: Proceedings of the International Conference on Computer-Aided Design
November 2012
781 pages
ISBN: 9781450315739
DOI: 10.1145/2429384
General Chair: Alan J. Hu
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CACTI
  2. DRAM
  3. IO
  4. memory interface
  5. power and timing models

Conference

ICCAD '12

Acceptance Rates

Overall Acceptance Rate 457 of 1,762 submissions, 26%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 47
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 03 Mar 2025

Cited By

  • (2024) PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs. ACM Transactions on Architecture and Code Optimization 21:4 (1-25). DOI: 10.1145/3689337. Online publication date: 20-Nov-2024
  • (2024) PIFS-Rec: Process-In-Fabric-Switch for Large-Scale Recommendation System Inferences. 2024 57th IEEE/ACM International Symposium on Microarchitecture (MICRO) (612-626). DOI: 10.1109/MICRO61859.2024.00052. Online publication date: 2-Nov-2024
  • (2024) Energy-Efficient and Low-Latency Computation of Transcendental Functions in a Precision-Tunable PIM Architecture. 2024 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (186-191). DOI: 10.1109/ISVLSI61997.2024.00043. Online publication date: 1-Jul-2024
  • (2024) Hardware-Software Co-Design for Path Planning by Drones. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (8141-8146). DOI: 10.1109/IROS58592.2024.10802753. Online publication date: 14-Oct-2024
  • (2024) FlashGNN: An In-SSD Accelerator for GNN Training. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (361-378). DOI: 10.1109/HPCA57654.2024.00035. Online publication date: 2-Mar-2024
  • (2023) MetaNMP: Leveraging Cartesian-Like Product to Accelerate HGNNs with Near-Memory Processing. Proceedings of the 50th Annual International Symposium on Computer Architecture (1-13). DOI: 10.1145/3579371.3589091. Online publication date: 17-Jun-2023
  • (2022) Mach-RT: A Many Chip Architecture for High Performance Ray Tracing. IEEE Transactions on Visualization and Computer Graphics 28:3 (1585-1596). DOI: 10.1109/TVCG.2020.3021048. Online publication date: 1-Mar-2022
  • (2022) Optimizing data placement and size configuration for morphable NVM based SPM in embedded multicore systems. Future Generation Computer Systems 135 (270-282). DOI: 10.1016/j.future.2022.05.005. Online publication date: Oct-2022
  • (2021) TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture (268-281). DOI: 10.1145/3466752.3480080. Online publication date: 18-Oct-2021
  • (2017) Processing-In-Memory Architecture Design for Accelerating Neuro-Inspired Algorithms. Neuro-inspired Computing Using Resistive Synaptic Devices (183-207). DOI: 10.1007/978-3-319-54313-0_10. Online publication date: 22-Apr-2017
