XDRA: exploration and optimization of last-level cache for energy reduction in DDR DRAMs

Research article. DOI: 10.1145/2463209.2488761

Published: 29 May 2013

Abstract

Embedded systems with high energy consumption often exploit DDR-DRAM idleness to save energy by putting the DRAM into its deepest low-power mode (self-refresh power-down mode) during idle periods. DDR-DRAM idle periods depend heavily on the last-level cache. An exhaustive search of cache configurations using processor-memory simulators can take several months. This paper proposes, for the first time, a fast framework called XDRA, which enables the exploration of last-level cache configurations to improve DDR-DRAM energy efficiency.
XDRA combines a processor-memory simulator, a cache simulator, and novel analysis techniques to produce a Kriging-based estimator that predicts the energy savings of different cache configurations for a given main memory size and application. The estimator's errors were less than 4.4% on average across 11 applications from the MediaBench and SPEC2000 suites and two DRAM sizes (Micron DDR3-DRAM, 256MB and 4GB). Cache configurations selected by XDRA were on average 3.6x and 4x more energy efficient (cache and DRAM energy) than a common cache configuration. XDRA selected the optimal cache configuration in 20 of 22 cases; the two suboptimal configurations were at most 3.9% from their optimal counterparts. XDRA took a few days to explore 330 cache configurations, compared with several hundred days of cycle-accurate simulation, saving at least 85% of exploration time.
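
To make the idea of a Kriging-based estimator concrete, the following is a minimal sketch of ordinary Kriging over cache configurations. It is not XDRA's implementation: the abstract does not specify the correlation function or the input features, so the Gaussian correlation, the fixed `theta` values, the feature encoding (log2 of cache size and associativity), and all training numbers are assumptions made purely for illustration. Practical Kriging toolboxes typically fit `theta` by maximum likelihood rather than fixing it.

```python
import math

def gaussian_corr(a, b, theta):
    """Gaussian correlation between two configuration vectors (assumed kernel)."""
    return math.exp(-sum(t * (x - y) ** 2 for t, x, y in zip(theta, a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_kriging(X, y, theta, nugget=1e-10):
    """Fit ordinary Kriging; returns the estimated mean and the weight vector."""
    n = len(X)
    # Correlation matrix of the sampled configurations, with a small nugget
    # on the diagonal for numerical stability.
    R = [[gaussian_corr(X[i], X[j], theta) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rinv_y = solve(R, y)
    rinv_1 = solve(R, [1.0] * n)
    mu = sum(rinv_y) / sum(rinv_1)  # generalized least-squares mean
    weights = solve(R, [yi - mu for yi in y])
    return mu, weights

def predict(x_new, X, theta, mu, weights):
    """Predict the response at an unsampled configuration."""
    r = [gaussian_corr(x_new, xi, theta) for xi in X]
    return mu + sum(ri * wi for ri, wi in zip(r, weights))

# Hypothetical samples: (log2 cache size in KB, log2 associativity).
X_train = [(2, 0), (3, 1), (4, 0), (5, 2), (6, 1), (4, 2)]
# Made-up energy savings (%) standing in for detailed simulation results.
y_train = [12.0, 20.5, 27.0, 31.0, 29.5, 30.0]
theta = (1.0, 1.0)  # assumed, not fitted

mu, weights = fit_kriging(X_train, y_train, theta)

# Estimate savings for configurations that were never simulated in detail.
candidates = [(3, 0), (5, 1), (6, 2)]
predictions = [predict(c, X_train, theta, mu, weights) for c in candidates]
best = max(zip(predictions, candidates))[1]
print(best, predictions)
```

The sketch shows why such an estimator can cut exploration time: only a handful of configurations need expensive simulation, and the remaining ones are ranked from the fitted surface. The model interpolates the sampled points, so its accuracy elsewhere depends on how well the samples cover the design space.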


      Published In

      DAC '13: Proceedings of the 50th Annual Design Automation Conference
      May 2013
      1285 pages
      ISBN:9781450320719
      DOI:10.1145/2463209
      Publisher

      Association for Computing Machinery

      New York, NY, United States