DOI: 10.1145/3240302.3240314
research-article

Tackling memory access latency through DRAM row management

Published: 01 October 2018

Abstract

Memory latency is a critical bottleneck in today's systems. The organization of DRAM main memory requires sensing and reading an entire row (around 4KB) of data in order to access a single cache block. The benefit of this organization is that subsequent accesses to the same row can be served faster (row hits). However, accesses to other rows incur high latency to prepare the DRAM bank for a subsequent access and read the contents of the new row (row conflicts). Therefore, how long a row is held open is a key factor that determines the access latency incurred by requests to memory.
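The hit/conflict trade-off described above can be illustrated with a minimal latency model. This is a sketch under stated assumptions, not the paper's evaluation model: the names `Timing` and `access_latency` are hypothetical, and the cycle counts are placeholder values rather than figures from the paper. It relies only on the standard DRAM command sequence: a row hit needs a column read, an access to a closed bank needs an activate first, and a row conflict additionally needs a precharge before the activate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Timing:
    """Illustrative DRAM timings in cycles (placeholder values, not from the paper)."""
    t_rp: int = 11   # precharge: close the currently open row
    t_rcd: int = 11  # activate: sense the new row into the row buffer
    t_cl: int = 11   # column access: read one cache block from the row buffer


def access_latency(open_row: Optional[int], requested_row: int,
                   t: Timing = Timing()) -> int:
    """Latency of one access, given which row (if any) the bank holds open."""
    if open_row is None:
        # Bank is precharged (row closed): activate, then read.
        return t.t_rcd + t.t_cl
    if open_row == requested_row:
        # Row hit: the data is already in the row buffer.
        return t.t_cl
    # Row conflict: precharge the old row, activate the new one, then read.
    return t.t_rp + t.t_rcd + t.t_cl
```

Under these assumed timings, keeping a row open pays off only if the next access targets the same row; if it targets a different row, having closed the row early would have saved the precharge time.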
While prior work has tackled this problem, existing solutions are either complex or ineffective. Our goal, in this work, is to build a row management scheme that is simple yet effective. To this end, we first build a scoreboard scheme that determines how long to hold a row open, by (i) predicting the number of row hits and row conflicts for different lengths of time that rows are held open and (ii) picking the time that maximizes row hits without increasing row conflicts significantly. We then observe that a small set of rows tends to experience a large number of back-to-back accesses. We build a row exclusion scheme that identifies such rows and prevents them from being closed until the next access to a different row arrives. Our evaluations show that our scoreboard and row exclusion policies together incur less than 0.4% of the additional storage cost of the most effective prior mechanism, while surpassing it in terms of performance.
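The scoreboard and row exclusion ideas can be sketched in software as follows. This is an illustrative reading of the abstract only, not the authors' hardware design: the function names, the single-bank trace format of `(cycle, row)` pairs, the scoring rule `hits - conflict_weight * conflicts`, and the `min_back_to_back` threshold are all assumptions introduced here. The sketch assumes a row is closed once it has been idle for longer than the candidate timeout.

```python
def score_timeouts(trace, timeouts):
    """Count predicted row hits and conflicts for each candidate timeout.

    trace: list of (cycle, row) accesses to one bank, sorted by cycle.
    """
    board = {t: {"hits": 0, "conflicts": 0} for t in timeouts}
    for (prev_cycle, prev_row), (cycle, row) in zip(trace, trace[1:]):
        gap = cycle - prev_cycle
        for t in timeouts:
            if gap > t:
                # The row was already closed by the timeout:
                # neither a hit nor a conflict.
                continue
            if row == prev_row:
                board[t]["hits"] += 1       # row still open, same row
            else:
                board[t]["conflicts"] += 1  # row still open, different row
    return board


def pick_timeout(board, conflict_weight=1.0):
    """Pick the timeout with the best hits-minus-weighted-conflicts score.

    The scoring rule is an assumption; ties go to the shorter timeout.
    """
    return max(
        board,
        key=lambda t: (board[t]["hits"] - conflict_weight * board[t]["conflicts"], -t),
    )


def rows_to_exclude(trace, min_back_to_back=2):
    """Rows with at least min_back_to_back consecutive same-row accesses."""
    counts = {}
    for (_, prev_row), (_, row) in zip(trace, trace[1:]):
        if row == prev_row:
            counts[row] = counts.get(row, 0) + 1
    return {r for r, n in counts.items() if n >= min_back_to_back}


# Small made-up trace: bursts to row 7 and row 3 separated by long idle gaps.
trace = [(0, 7), (10, 7), (20, 7), (200, 3), (210, 3), (400, 7)]
board = score_timeouts(trace, [0, 50, 500])
print(board, pick_timeout(board), rows_to_exclude(trace))
```

In this toy trace, a medium timeout captures the bursts without keeping rows open across the long gaps, which is the balance the scoreboard is meant to find. A row flagged by `rows_to_exclude` would, in the scheme described above, be exempt from the timeout and stay open until an access to a different row arrives.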


Published In

MEMSYS '18: Proceedings of the International Symposium on Memory Systems
October 2018
361 pages
ISBN: 9781450364751
DOI: 10.1145/3240302
Publisher

Association for Computing Machinery
New York, NY, United States

Conference

MEMSYS '18: The International Symposium on Memory Systems
October 1-4, 2018
Alexandria, Virginia, USA
