DOI: 10.1145/2370816.2370891
poster

PS-Dir: a scalable two-level directory cache

Published: 19 September 2012

Abstract

As the number of cores grows in upcoming and future chip multiprocessors, coherence protocols must adopt novel hardware structures in order to scale in terms of performance, power, and area. It is well known that most blocks accessed by parallel applications are private (i.e., accessed by a single core). These blocks have different directory requirements and behavior than shared blocks. Based on this observation, this paper proposes a two-level directory cache that tracks shared blocks in a small, fast first-level cache (the Shared cache) and private blocks in a larger, slower second-level cache (the Private cache). Speed and area considerations suggest implementing the Private cache in eDRAM technology, which is much denser but slower than SRAM and in turn brings energy savings. Experimental results for a 16-core system show improvements of 11.1% in performance, 25.4% in area, and 20.5% in energy consumption compared to a conventional directory cache.
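The shared/private split described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' design: the abstract does not specify lookup order, replacement policy, or how an entry moves between levels. The model below assumes the Shared cache is probed first, that a block is treated as private on its first access, that it is moved to the Shared cache when a second core touches it, and that both levels use LRU replacement. The names `LRUTable`, `TwoLevelDirectory`, and `access` are hypothetical.

```python
from collections import OrderedDict


class LRUTable:
    """Tiny fully associative table with LRU replacement (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, block):
        # A hit refreshes the entry's recency.
        if block in self.entries:
            self.entries.move_to_end(block)
            return self.entries[block]
        return None

    def put(self, block, value):
        # Insert or update; return the evicted (block, value) pair, if any.
        self.entries[block] = value
        self.entries.move_to_end(block)
        if len(self.entries) > self.capacity:
            return self.entries.popitem(last=False)
        return None

    def remove(self, block):
        return self.entries.pop(block, None)


class TwoLevelDirectory:
    """Assumed model: small Shared level, larger Private level."""

    def __init__(self, shared_entries=4, private_entries=16):
        self.shared = LRUTable(shared_entries)    # tracks a set of sharers
        self.private = LRUTable(private_entries)  # tracks a single owner

    def access(self, block, core):
        """Record that `core` touched `block`; return the level tracking it."""
        sharers = self.shared.get(block)          # assumed: probe Shared first
        if sharers is not None:
            sharers.add(core)
            return "shared"
        owner = self.private.get(block)
        if owner is None:
            # Assumed policy: a block is private on first touch.
            # An evicted entry is ignored here; a real directory would have
            # to invalidate the cached copies it was tracking.
            self.private.put(block, core)
            return "private"
        if owner == core:
            return "private"
        # Assumed policy: a second core reclassifies the block as shared.
        self.private.remove(block)
        self.shared.put(block, {owner, core})
        return "shared"
```

In this model, a block touched only by core 0 stays in the Private level, and the first access by core 1 moves its entry to the Shared level, where later sharers are added to the same set.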

Published In

PACT '12: Proceedings of the 21st international conference on Parallel architectures and compilation techniques
September 2012
512 pages
ISBN: 9781450311823
DOI: 10.1145/2370816

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cache coherence
  2. directory protocol
  3. multicore
  4. private/shared blocks
  5. two-level directory

Qualifiers

  • Poster

Conference

PACT '12
Sponsor:
  • IFIP WG 10.3
  • SIGARCH
  • IEEE CS TCPP
  • IEEE CS TCAA

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Cited By

  • (2024) Scalable short-entry dual-grain coherence directories with flexible region granularity. The Journal of Supercomputing 80:2 (2889-2911). DOI: 10.1007/s11227-023-05559-8. Online publication date: 1-Jan-2024
  • (2021) Efficient classification of private memory blocks. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2021.07.005. Online publication date: Jul-2021
  • (2020) TLB-based Block-Grain Classification of Private Data. 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) (122-130). DOI: 10.1109/PDP50117.2020.00025. Online publication date: Mar-2020
  • (2017) Using Multicore Reuse Distance to Study Coherence Directories. ACM Transactions on Computer Systems 35:2 (1-49). DOI: 10.1145/3092702. Online publication date: 28-Jul-2017
  • (2016) A Hybrid Static-Dynamic Classification for Dual-Consistency Cache Coherence. IEEE Transactions on Parallel and Distributed Systems 27:11 (3101-3115). DOI: 10.1109/TPDS.2016.2528241. Online publication date: 1-Nov-2016
  • (2016) Cache coherence: A walkthrough of mechanisms and challenges. 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) (2251-2256). DOI: 10.1109/ICEEOT.2016.7755093. Online publication date: Mar-2016
  • (2016) A Directory Cache with Dynamic Private-Shared Partitioning. 2016 IEEE 23rd International Conference on High Performance Computing (HiPC) (382-391). DOI: 10.1109/HiPC.2016.051. Online publication date: Dec-2016
  • (2016) A dedicated private‐shared cache design for scalable multiprocessors. Concurrency and Computation: Practice and Experience 29:2. DOI: 10.1002/cpe.3871. Online publication date: 12-May-2016
  • (2015) Studying the impact of multicore processor scaling on directory techniques via reuse distance analysis. 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA) (590-602). DOI: 10.1109/HPCA.2015.7056065. Online publication date: Feb-2015
  • (2015) PS directory. The Journal of Supercomputing 71:8 (2847-2876). DOI: 10.1007/s11227-014-1332-5. Online publication date: 1-Aug-2015
