research-article

Cache/Memory Coordinated Fair Scheduling for Hybrid Memory Systems

Authors:

Xiaofei Liao

HP3C 2020: Proceedings of the 2020 4th International Conference on High Performance Compilation, Computing and Communications

Pages 103 - 111

https://doi.org/10.1145/3407947.3407967

Published: 06 August 2020

Abstract

Hybrid memory systems comprising DRAM and Non-Volatile Memory (NVM) have gained ever-increasing attention for building large-capacity and energy-efficient main memory. Nevertheless, making the best use of them remains challenging because of the asymmetric access latencies of DRAM and NVM. Traditional memory request scheduling schemes usually lead to notable application performance degradation and unfairness. In this paper, we propose a cache/memory coordinated (CMC) memory request scheduling strategy to address inter-process memory interference and unfair scheduling in hybrid memory systems. Taking the asymmetric access latencies into account, CMC uses average access time (AST) to quantify inter-process memory interference in hybrid memory systems. CMC preferentially schedules applications with a small number of memory requests, and schedules memory-intensive applications according to their AST. CMC further filters out some memory blocks with poor locality to improve the efficiency of the last-level cache (LLC). Experimental results show that CMC improves system performance and fairness by up to 16% and 9%, respectively, compared with the traditional first-ready-first-come-first-service (FRFCFS) memory scheduling policy.
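The two-tier ordering described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `AppStats` record, the `rank_applications` function, the `light_threshold` cutoff, and the choice to serve memory-intensive applications in descending order of AST are all assumptions made here for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass
class AppStats:
    """Hypothetical per-application counters for one scheduling interval."""
    name: str
    requests: int             # memory requests issued in the interval
    total_access_cycles: int  # summed DRAM + NVM service latency, in cycles

    @property
    def avg_access_time(self) -> float:
        """Average access time (AST) per request; 0.0 if no requests."""
        if self.requests == 0:
            return 0.0
        return self.total_access_cycles / self.requests


def rank_applications(apps, light_threshold=100):
    """Return application names in scheduling-priority order.

    Assumption: applications below `light_threshold` requests count as
    non-intensive and go first (fewest requests first). The remaining
    memory-intensive applications are ordered by AST, highest first,
    which is one plausible reading of "according to their AST".
    """
    light = [a for a in apps if a.requests < light_threshold]
    heavy = [a for a in apps if a.requests >= light_threshold]
    light.sort(key=lambda a: a.requests)
    heavy.sort(key=lambda a: a.avg_access_time, reverse=True)
    return [a.name for a in light + heavy]


apps = [
    AppStats("A", requests=10, total_access_cycles=500),
    AppStats("B", requests=1000, total_access_cycles=300_000),
    AppStats("C", requests=800, total_access_cycles=80_000),
    AppStats("D", requests=50, total_access_cycles=4_000),
]
order = rank_applications(apps)
print(order)
```

In this sketch the non-intensive applications A and D are ranked ahead of B and C, and B precedes C because its AST (300 cycles) exceeds that of C (100 cycles). A real memory controller would make this decision per request in hardware, and the ranking direction for AST could differ from the assumption made here.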

Index Terms

Cache/Memory Coordinated Fair Scheduling for Hybrid Memory Systems
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Heterogeneous (hybrid) systems
2. Hardware
  1. Emerging technologies
    1. Memory and dense storage

Information

Published In

HP3C 2020: Proceedings of the 2020 4th International Conference on High Performance Compilation, Computing and Communications

June 2020

191 pages

ISBN:9781450376914

DOI:10.1145/3407947

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

Xi'an Jiaotong-Liverpool University
City University of Hong Kong
Guangdong University of Technology

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

HP3C 2020

HP3C 2020: 2020 4th International Conference on High Performance Compilation, Computing and Communications

June 27 - 29, 2020

Guangzhou, China

Bibliometrics

Total Citations: 0
Total Downloads: 133
Downloads (Last 12 months): 3
Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Feb 2025
