ABSTRACT
This paper proposes a cache management scheme for multiprogrammed, multithreaded applications, with the objective of maximizing performance for both individual applications and the workload mix as a whole. In this scheme, each application's performance is improved by increasing the priority of its slowest thread, while overall system performance is preserved by ensuring that one application's benefit does not come at the cost of significant degradation to other applications' threads sharing the same cache. Averaged over six workloads, our shared cache management scheme improves the performance of the combined applications by 18%. These improvements are also distributed fairly across the applications in each mix, as indicated by an average fair speedup improvement of 10% across the threads of each application (averaged over all workloads).
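The core idea in the abstract, boosting an application's slowest thread without taking capacity from co-running applications, can be sketched as a simple way-reallocation rule. The following is a minimal illustrative sketch, not the paper's actual mechanism: the function name, the progress metric, and the one-way-at-a-time transfer rule are all assumptions introduced here. It transfers a cache way from each application's fastest thread to its slowest, so the boost is funded strictly within the same application.

```python
# Hypothetical sketch: within each application, donate one cache way from
# the fastest thread to the slowest one. Because the donor and recipient
# belong to the same application, other applications' allocations are
# untouched, matching the "no degradation to others" constraint.
def rebalance_ways(apps, total_ways, donor_floor=1):
    """apps: {app_id: {thread_id: (progress, ways)}}, where progress is a
    normalized measure of how far a thread has advanced (lower = slower).
    Returns a new {(app_id, thread_id): ways} allocation."""
    alloc = {(a, t): w for a, threads in apps.items()
             for t, (_, w) in threads.items()}
    for a, threads in apps.items():
        slowest = min(threads, key=lambda t: threads[t][0])
        fastest = max(threads, key=lambda t: threads[t][0])
        # Only transfer if there is a genuine laggard and the donor keeps
        # at least donor_floor ways.
        if slowest != fastest and alloc[(a, fastest)] > donor_floor:
            alloc[(a, fastest)] -= 1   # donate within the same app
            alloc[(a, slowest)] += 1   # speed up the lagging thread
    assert sum(alloc.values()) == total_ways  # capacity is conserved
    return alloc
```

For example, with two applications of two threads each, all starting at 4 ways of a 16-way cache, each application's slowest thread ends up with 5 ways and its fastest with 3, while the total stays at 16.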
Index Terms
- Courteous cache sharing: being nice to others in capacity management