DOI: 10.1145/2818950.2818968
research-article

Shared Last-Level Caches and The Case for Longer Timeslices

Published: 05 October 2015

Abstract

Memory performance is important in modern systems. Contention at various levels of the memory hierarchy can cause significant application performance degradation due to interference. Further, modern, large last-level caches (LLCs) have fill times greater than the OS scheduling window. When several threads run concurrently and timeshare the CPU cores, they may never be able to load their working sets into the cache before being rescheduled, and so remain permanently stuck in the "cold-start" regime. We show that increasing the system scheduling timeslice length makes it possible to amortize the cache cold-start penalty caused by multitasking and improve application performance by 10-15%.
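
The amortization argument can be illustrated with a toy calculation. The sketch below is not the paper's methodology. It assumes, hypothetically, that after every reschedule a thread runs at a reduced relative speed (`cold_speed`) while it refills its working set for `fill_time_ms`, and at full speed for the remainder of the timeslice. The function names, parameters, and numbers are all illustrative assumptions.

```python
def cold_fraction(fill_time_ms: float, timeslice_ms: float) -> float:
    """Fraction of one timeslice spent refilling the cache, capped at 1.0."""
    if fill_time_ms < 0 or timeslice_ms <= 0:
        raise ValueError("fill time must be >= 0 and timeslice must be > 0")
    return min(1.0, fill_time_ms / timeslice_ms)


def relative_throughput(fill_time_ms: float, timeslice_ms: float,
                        cold_speed: float = 0.5) -> float:
    """Average speed over one timeslice, relative to warm-cache speed (1.0).

    cold_speed is an assumed relative speed while the cache is still cold.
    """
    if not 0.0 <= cold_speed <= 1.0:
        raise ValueError("cold_speed must be in [0, 1]")
    cold = cold_fraction(fill_time_ms, timeslice_ms)
    return cold * cold_speed + (1.0 - cold)


if __name__ == "__main__":
    # Hypothetical fill time of 20 ms, swept over several timeslice lengths.
    for timeslice in (5, 10, 20, 50, 100, 200):
        speed = relative_throughput(20.0, timeslice)
        print(f"timeslice={timeslice:>4} ms  relative throughput={speed:.3f}")
```

Under this model, a thread whose fill time exceeds the timeslice never leaves the cold regime, and lengthening the timeslice moves the relative throughput toward 1.0. The size of the gain depends entirely on the assumed fill time and cold-cache speed.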


Published In

MEMSYS '15: Proceedings of the 2015 International Symposium on Memory Systems
October 2015
278 pages
ISBN: 9781450336048
DOI: 10.1145/2818950
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Last-level cache
  2. Multiprocessing
  3. Operating systems
  4. Scheduling

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MEMSYS '15: International Symposium on Memory Systems
October 5 - 8, 2015
Washington, DC, USA

Article Metrics

  • Downloads (last 12 months): 7
  • Downloads (last 6 weeks): 1
Reflects downloads up to 20 Jan 2025

Cited By

  • (2022) Effectiveness Evaluation of Replacement Policies for On-Chip Caches in Multiprocessors. International Journal of Embedded and Real-Time Communication Systems 13:1 (1-12). DOI: 10.4018/IJERTCS.289202. Online publication date: 14-Jan-2022
  • (2017) Hardware-Software Co-design to Mitigate DRAM Refresh Overheads. ACM SIGARCH Computer Architecture News 45:1 (723-736). DOI: 10.1145/3093337.3037724. Online publication date: 4-Apr-2017
  • (2017) Hardware-Software Co-design to Mitigate DRAM Refresh Overheads. ACM SIGPLAN Notices 52:4 (723-736). DOI: 10.1145/3093336.3037724. Online publication date: 4-Apr-2017
  • (2017) Hardware-Software Co-design to Mitigate DRAM Refresh Overheads. ACM SIGOPS Operating Systems Review 51:2 (723-736). DOI: 10.1145/3093315.3037724. Online publication date: 4-Apr-2017
  • (2017) Hardware-Software Co-design to Mitigate DRAM Refresh Overheads. Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (723-736). DOI: 10.1145/3037697.3037724. Online publication date: 4-Apr-2017
