DOI:10.1145/3407947.3407967
research-article

Cache/Memory Coordinated Fair Scheduling for Hybrid Memory Systems

Published: 06 August 2020

Abstract

Hybrid memory systems comprising DRAM and Non-Volatile Memory (NVM) have gained increasing attention for building large-capacity, energy-efficient main memory. Nevertheless, making the best use of them remains challenging because DRAM and NVM have asymmetric access latencies. Traditional memory request scheduling schemes usually lead to notable application performance degradation and unfairness. In this paper, we propose a cache/memory coordinated (CMC) memory request scheduling strategy to address inter-process memory interference and unfair scheduling in hybrid memory systems. Taking the asymmetric access latencies into account, CMC uses average access time (AST) to quantify inter-process memory interference. CMC preferentially schedules applications with a small number of memory requests, and schedules memory-intensive applications according to their AST. CMC further filters out some memory blocks with poor locality to improve the efficiency of the last-level cache (LLC). Experimental results show that, compared with the traditional first-ready first-come-first-serve (FRFCFS) memory scheduling policy, CMC improves system performance and fairness by up to 16% and 9%, respectively.
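
The two-level prioritization described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the AST formula, the threshold that separates light from memory-intensive applications, or the direction of the AST ordering. Here AST is assumed to be the plain mean of an application's observed DRAM and NVM access latencies, `light_threshold` is a hypothetical cutoff, and memory-intensive applications with a higher AST are assumed to be served first. The names `AppStats` and `rank_applications` are illustrative only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AppStats:
    """Per-application counters a memory controller might track (hypothetical)."""
    name: str
    pending_requests: int
    # Observed access latencies in ns, mixing fast DRAM and slow NVM accesses.
    latencies_ns: List[float] = field(default_factory=list)

    @property
    def ast(self) -> float:
        # Assumption: AST is the plain mean over all observed accesses.
        if not self.latencies_ns:
            return 0.0
        return sum(self.latencies_ns) / len(self.latencies_ns)


def rank_applications(apps: List[AppStats], light_threshold: int = 4) -> List[AppStats]:
    """Return applications in scheduling order, highest priority first."""
    light = [a for a in apps if a.pending_requests <= light_threshold]
    heavy = [a for a in apps if a.pending_requests > light_threshold]
    # Applications with few memory requests are served first.
    light.sort(key=lambda a: a.pending_requests)
    # Assumption: among memory-intensive applications, a higher AST
    # (more interference suffered) earns a higher priority.
    heavy.sort(key=lambda a: a.ast, reverse=True)
    return light + heavy


apps = [
    AppStats("a", 10, [50.0, 300.0]),
    AppStats("b", 2, [50.0]),
    AppStats("c", 20, [50.0, 50.0]),
    AppStats("d", 1, [300.0]),
]
order = [a.name for a in rank_applications(apps)]
print(order)
```

In this sketch the light applications `d` and `b` come first, followed by the memory-intensive applications ordered by the assumed AST rule.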



Published In

HP3C 2020: Proceedings of the 2020 4th International Conference on High Performance Compilation, Computing and Communications
June 2020
191 pages
ISBN:9781450376914
DOI:10.1145/3407947

In-Cooperation

  • Xi'an Jiaotong-Liverpool University
  • City University of Hong Kong
  • Guangdong University of Technology

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Non-Volatile Memory
  2. cache filter
  3. dead blocks
  4. fairness
  5. hybrid memory systems
  6. memory scheduling

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

HP3C 2020
