DOI: 10.1145/2628071.2628081

SpongeDirectory: flexible sparse directories utilizing multi-level memristors

Published: 24 August 2014

Abstract

Cache-coherent shared memory is critical for programmability in many-core systems. Several directory-based schemes have been proposed, but dynamic, non-uniform sharing makes efficient directory storage challenging, and each scheme gives up storage space, performance, or energy.
We introduce SpongeDirectory, a sparse directory structure that exploits multi-level memristor technology. SpongeDirectory expands directory storage in-place when needed by increasing the number of bits stored on a single memristor device, trading latency and energy for storage.
We explore several SpongeDirectory configurations, finding that a provisioning rate of 0.5x with memristors optimized for low energy consumption is the most competitive. This optimal SpongeDirectory configuration has performance comparable to a conventional sparse directory, requires 18× less storage space, and consumes 8× less energy.
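
The in-place expansion described in the abstract can be illustrated with a toy capacity model. This is only a sketch under stated assumptions, not the paper's design: the class `SpongeArray`, its fields, and the numbers in the usage lines are hypothetical, and the latency and energy cost of denser cells is not modeled.

```python
from dataclasses import dataclass


@dataclass
class SpongeArray:
    """Toy model of a directory array built from multi-level cells.

    Every field is a hypothetical parameter; the abstract gives none of them.
    """
    num_cells: int          # physical memristor devices in the array
    entry_bits: int         # assumed bits per directory entry (tag + sharers)
    max_bits_per_cell: int  # assumed highest density a device supports
    bits_per_cell: int = 1  # current density; 1 means single-level operation

    def capacity(self) -> int:
        """Number of directory entries that fit at the current density."""
        return (self.num_cells * self.bits_per_cell) // self.entry_bits

    def expand_for(self, needed_entries: int) -> bool:
        """Raise the density in place until needed_entries fit.

        Returns False if the maximum density is still too small. In that
        case the array is left at maximum density, and a real directory
        would presumably have to evict entries instead.
        """
        while self.capacity() < needed_entries:
            if self.bits_per_cell == self.max_bits_per_cell:
                return False
            self.bits_per_cell += 1
        return True


# Hypothetical sizes chosen only to make the arithmetic easy to follow.
array = SpongeArray(num_cells=1024, entry_bits=32, max_bits_per_cell=3)
print(array.capacity())                       # entries at 1 bit per cell
print(array.expand_for(50))                   # grow until 50 entries fit
print(array.bits_per_cell, array.capacity())  # density and capacity afterwards
```

The point of the sketch is that the same physical cells hold more entries as the bits-per-cell setting rises, so storage grows without adding devices. In the scheme the abstract describes, that growth is paid for with higher access latency and energy.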

      Published In

      PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation
      August 2014
      514 pages
      ISBN:9781450328098
      DOI:10.1145/2628071

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. multi-level memristors
      2. sparse directories

      Qualifiers

      • Research-article

      Conference

      PACT '14
      Sponsor:
      • IFIP WG 10.3
      • SIGARCH
      • IEEE CS TCPP
      • IEEE CS TCAA

      Acceptance Rates

      PACT '14 Paper Acceptance Rate 54 of 144 submissions, 38%;
      Overall Acceptance Rate 121 of 471 submissions, 26%

      Cited By

      • (2023) Tag-Sharer-Fusion Directory: A Scalable Coherence Directory With Flexible Entry Formats. IEEE Transactions on Parallel and Distributed Systems 34:1 (262-274). DOI: 10.1109/TPDS.2022.3217956. Online publication date: 1-Jan-2023
      • (2021) Zero Directory Eviction Victim: Unbounded Coherence Directory and Core Cache Isolation. 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (277-290). DOI: 10.1109/HPCA51647.2021.00032. Online publication date: Feb-2021
      • (2020) Memristor-based in-memory logic and its application in image processing. Memristive Devices for Brain-Inspired Computing (175-194). DOI: 10.1016/B978-0-08-102782-0.00007-1. Online publication date: 2020
      • (2019) Reliability Enhancements in Memristive Neural Network Architectures. IEEE Transactions on Nanotechnology 18 (866-878). DOI: 10.1109/TNANO.2019.2933806. Online publication date: 2019
      • (2019) Quick-and-Dirty: An Architecture for High-Performance Temporary Short Writes in MLC PCM. IEEE Transactions on Computers (1-1). DOI: 10.1109/TC.2019.2900036. Online publication date: 2019
      • (2018) Cooperative NV-NUMA. Proceedings of the International Symposium on Memory Systems (67-78). DOI: 10.1145/3240302.3240308. Online publication date: 1-Oct-2018
      • (2018) An Aging Resilient Neural Network Architecture. Proceedings of the 14th IEEE/ACM International Symposium on Nanoscale Architectures (25-30). DOI: 10.1145/3232195.3232208. Online publication date: 17-Jul-2018
      • (2018) IMAGING-In-Memory AlGorithms for Image processiNG. IEEE Transactions on Circuits and Systems I: Regular Papers (1-14). DOI: 10.1109/TCSI.2018.2846699. Online publication date: 2018
      • (2018) Dynamic Coherent Cluster: A Scalable Sharing Set Management Approach. 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP) (1-8). DOI: 10.1109/ASAP.2018.8445107. Online publication date: Jul-2018
      • (2017) Thermal-aware, heterogeneous materials for improved energy and reliability in 3D PCM architectures. Proceedings of the International Symposium on Memory Systems (223-236). DOI: 10.1145/3132402.3132407. Online publication date: 2-Oct-2017
