DOI: 10.1145/3422575.3422800

CLAM: Compiler Lease of Cache Memory

Published: 21 March 2021

Abstract

Traditional caching is transparent to software but cannot use program information directly. With Moore’s Law ending and general-purpose processor speeds plateauing, specialization is of growing importance and interest, including specialization of the interaction between software and the cache.
This paper presents Compiler Lease of cAche Memory (CLAM), which augments the interface between software and hardware and lets a compiler control cache management. The new software control enables optimization beyond what is possible in traditional memory system designs. CLAM has been implemented on a Cyclone V GT FPGA card with a RISC-V processor and the new hardware cache, and the evaluation shows performance improvements over existing techniques in all 7 programs tested from the PolyBench suite.
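
The abstract does not spell out how a lease works, so the following is only an illustrative sketch, not the CLAM design. It assumes that each program reference is assigned a lease, counted in subsequent memory accesses, and that a block stays cached until its most recently assigned lease runs out. The function name `simulate_lease_cache`, the trace format, and the reference ids "A" and "B" are all hypothetical.

```python
def simulate_lease_cache(trace, lease_of):
    """Simulate a lease-managed cache (illustrative assumption, not CLAM itself).

    trace    : list of (reference_id, address) pairs, one per memory access
    lease_of : dict mapping a reference id to its lease, in accesses;
               a lease of 0 means "do not keep this block"
    Returns (hits, peak_occupancy).
    """
    expiry = {}  # address -> last logical time at which the block is still cached
    hits = 0
    peak = 0
    for now, (ref, addr) in enumerate(trace):
        # Evict every block whose lease ended before this access.
        for a in [a for a, t in expiry.items() if t < now]:
            del expiry[a]
        if addr in expiry:
            hits += 1
        # The accessing reference renews (or cancels) the block's lease.
        lease = lease_of.get(ref, 0)
        if lease > 0:
            expiry[addr] = now + lease
        else:
            expiry.pop(addr, None)
        peak = max(peak, len(expiry))
    return hits, peak


# Reference "A" reuses its block every 2 accesses; "B" is given no lease here.
trace = [("A", 0x10), ("B", 0x20), ("A", 0x10), ("B", 0x20), ("A", 0x10)]
print(simulate_lease_cache(trace, {"A": 2, "B": 0}))  # lease covers the reuse
print(simulate_lease_cache(trace, {"A": 1, "B": 0}))  # lease too short
```

In this toy model the lease table is the only policy input, which is one way to picture software rather than hardware deciding what stays in the cache: a lease that spans the reuse interval turns the reuse into a hit, while a shorter lease frees the space earlier at the cost of a miss.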

Cited By

  • (2024) Implementation of a Two-Level Programmable Cache Emulation and Test System. Proceedings of the International Symposium on Memory Systems (140-156). DOI: 10.1145/3695794.3695821. Online publication date: 30-Sep-2024
  • (2023) Cache Programming for Scientific Loops Using Leases. ACM Transactions on Architecture and Code Optimization 20:3 (1-25). DOI: 10.1145/3600090. Online publication date: 19-Jul-2023
  • (2022) CARL: Compiler Assigned Reference Leasing. ACM Transactions on Architecture and Code Optimization 19:1 (1-28). DOI: 10.1145/3498730. Online publication date: 17-Mar-2022

Published In

MEMSYS '20: Proceedings of the International Symposium on Memory Systems
September 2020
362 pages
ISBN:9781450388993
DOI:10.1145/3422575
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MEMSYS 2020: The International Symposium on Memory Systems
September 28 - October 1, 2020
Washington, DC, USA
