DOI: 10.1145/3522784.3522801
research-article

HyCSim: A rapid design space exploration tool for emerging hybrid last-level caches

Published: 23 June 2022

Abstract

Recent years have seen a rising trend in the exploration of nonvolatile memory (NVM) technologies in the memory subsystem. In the cache hierarchy in particular, hybrid last-level cache (LLC) solutions have been proposed to meet the wide-ranging performance and energy requirements of modern applications. These emerging hybrid solutions need simulation and detailed exploration before their capabilities can be fully understood and exploited. Existing simulation tools are either too slow or incapable of prototyping such systems and optimizing for NVM devices. To this end, we propose HyCSim, a trace-driven simulation infrastructure that enables rapid comparison of various hybrid LLC configurations for different optimization objectives. Notably, HyCSim makes it possible to quickly estimate the impact of various hybrid LLC insertion and replacement policies, and of disabling cache regions at byte or cache-frame granularity for different fault maps. In addition, HyCSim allows evaluating the impact of various compression schemes on overall performance (hit and miss rates) and on the number of writes to the LLC. Our evaluation on ten multi-program workloads from the SPEC CPU2006 benchmark suite shows that HyCSim speeds up simulation by 24× compared to the cycle-accurate gem5 simulator, with high fidelity.
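
To make the idea of trace-driven hybrid LLC simulation concrete, the following is a minimal sketch and not HyCSim's actual implementation. It assumes a set-associative cache whose ways are split into an SRAM and an NVM partition, per-partition LRU replacement, and a hypothetical insertion policy (write misses fill SRAM, read misses fill NVM). The names `HybridLLC` and `run_trace` are illustrative, each partition is assumed to have at least one way, and write-backs of dirty lines are ignored.

```python
from collections import OrderedDict


class HybridLLC:
    """Toy set-associative LLC whose ways are split between SRAM and NVM."""

    def __init__(self, num_sets=4, sram_ways=1, nvm_ways=3, block_bytes=64):
        self.num_sets = num_sets
        self.block_bytes = block_bytes
        # Assumption: every partition has at least one way.
        self.capacity = {"sram": sram_ways, "nvm": nvm_ways}
        # One LRU-ordered dict per set and partition (oldest tag first).
        self.sets = [{"sram": OrderedDict(), "nvm": OrderedDict()}
                     for _ in range(num_sets)]
        self.hits = self.misses = self.nvm_writes = 0

    def access(self, addr, is_write):
        block = addr // self.block_bytes
        index, tag = block % self.num_sets, block // self.num_sets
        cache_set = self.sets[index]
        for part in ("sram", "nvm"):
            if tag in cache_set[part]:
                self.hits += 1
                cache_set[part].move_to_end(tag)  # mark as most recently used
                if is_write and part == "nvm":
                    self.nvm_writes += 1
                return True
        self.misses += 1
        # Hypothetical insertion policy: keep write-heavy fills out of NVM.
        part = "sram" if is_write else "nvm"
        if len(cache_set[part]) >= self.capacity[part]:
            cache_set[part].popitem(last=False)  # evict the LRU block
        cache_set[part][tag] = None
        if part == "nvm":
            self.nvm_writes += 1  # a fill also writes the NVM array
        return False


def run_trace(cache, trace):
    """Replay (address, is_write) pairs and report hit rate and NVM writes."""
    for addr, is_write in trace:
        cache.access(addr, is_write)
    total = cache.hits + cache.misses
    return {"hit_rate": cache.hits / total if total else 0.0,
            "nvm_writes": cache.nvm_writes}
```

Swapping the insertion rule or the eviction order in `access` is enough to compare policies on the same trace, which is the kind of comparison the abstract describes. Fault maps and compression would need extra per-frame state that this sketch omits.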

Cited By

  • (2024) Hybrid Cache Design Under Varying Power Supply Stability - A Comparative Study. Proceedings of the International Symposium on Memory Systems, 257-269. DOI: 10.1145/3695794.3695819. Online publication date: 30-Sep-2024
  • (2023) Compression-Aware and Performance-Efficient Insertion Policies for Long-Lasting Hybrid LLCs. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 179-192. DOI: 10.1109/HPCA56546.2023.10070968. Online publication date: Feb-2023

    Published In

    DroneSE and RAPIDO: System Engineering for constrained embedded systems
    January 2022
    58 pages
    ISBN:9781450395663
    DOI:10.1145/3522784
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. NVMs
    2. emerging architecture
    3. hybrid last-level caches (LLCs)
    4. prototyping
    5. rapid simulation

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    DroneSE and RAPIDO '22

    Acceptance Rates

    Overall Acceptance Rate 14 of 28 submissions, 50%
