ABSTRACT
Traditional processor-centric computing architectures do not scale out well because servers do not share their local main memories. To bypass this architectural limitation, programmers place their shared state on shared storage. But since storage is slow (many hundreds of microseconds), they improve performance by duplicating the shared state to the compute nodes and running complex coherence protocols to try to keep all copies in sync. In recent years, memory-centric architectures have been proposed as an alternative.
In memory-centric architectures, the shared state is placed in a shared memory pool that can be accessed with extremely low latency. A good memory-centric architecture should also be elastic, reliable, load-balanced, cheaper than DRAM, thinly provisioned, and multi-tenant.
We present an industrial shared memory solution with all these features. As shown in Figure 1, it is comprised of: 1) a scale-out persistent memory (PM) pool, 2) random-access client-side libraries, 3) a control plane, and 4) an RDMA fabric. Scale-out application owners use a library called pmAddr to allocate or connect to a shared logical address space. The client library hides most of the complexity. It communicates with the relevant Data Server (DS) using RDMA, and contacts the control plane only when it does not know which DS holds the relevant memory region or when its speculative destination turns out to be incorrect.
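The lookup behavior described above can be sketched as a client-side cache over the control plane's authoritative mapping. This is a minimal illustration, not the actual pmAddr API; the names `ControlPlane`, `pmAddrClient`, `resolve`, and `invalidate` are all assumptions.

```python
# Hypothetical sketch of the client-side region lookup described above.
# All class and method names are illustrative, not the real pmAddr API.

class ControlPlane:
    """Authoritative map from memory-region id to its Data Server."""
    def __init__(self, region_map):
        self._region_map = dict(region_map)

    def lookup(self, region_id):
        return self._region_map[region_id]


class pmAddrClient:
    """Caches region->DS mappings so the common path goes straight to the
    DS over RDMA; the control plane is contacted only on a cache miss or
    when a speculative destination proves incorrect."""
    def __init__(self, control_plane):
        self._cp = control_plane
        self._cache = {}          # region_id -> data server id
        self.cp_lookups = 0       # how often the control plane was asked

    def resolve(self, region_id):
        if region_id not in self._cache:
            self.cp_lookups += 1
            self._cache[region_id] = self._cp.lookup(region_id)
        return self._cache[region_id]

    def invalidate(self, region_id):
        # Called when a request reveals that the cached DS no longer
        # owns the region (the speculative destination was wrong).
        self._cache.pop(region_id, None)
```

In this sketch, repeated accesses to the same region never touch the control plane, matching the abstract's claim that clients normally communicate directly with the relevant DS.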
Shared memory, unlike storage, should primarily be optimized for low read latency. pmAddr achieves this by combining zero copy on the client, direct client-server communication (i.e., typically no redundant hops), and no software on the server's read data path. This extremely read-performant design holds for both the 1- and 3-copy reliability configurations, regardless of the number of clients.
Shared memory writes are optimized for low latency in a similar manner, but server software is involved in 3-copy configurations. The primary DS for a given memory region exposes the newly written data to readers and returns an Ack to the client only after it has replicated the data and validated that it was successfully written to PM on two other DSs.
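The 3-copy write ordering above can be sketched as follows. This is a toy model under assumed names (`Replica`, `PrimaryDS`), not the server implementation: the key point it illustrates is that data becomes reader-visible and the client is acked only after both replica writes are confirmed.

```python
# Illustrative sketch of the 3-copy write path; names are assumptions.

class Replica:
    """A replica DS with simulated persistent memory."""
    def __init__(self):
        self._pm = {}

    def write(self, addr, data):
        self._pm[addr] = data
        return True               # ack: data persisted to this replica's PM


class PrimaryDS:
    """Primary DS: exposes new data to readers and acks the client only
    after both replicas confirm the write reached their PM."""
    def __init__(self, replicas):
        self._pm = {}
        self._visible = {}        # what readers are allowed to see
        self._replicas = replicas

    def write(self, addr, data):
        self._pm[addr] = data
        acks = [r.write(addr, data) for r in self._replicas]
        if all(acks):
            self._visible[addr] = data
            return True           # ack to client
        return False              # replication failed; data stays hidden

    def read(self, addr):
        return self._visible.get(addr)
```

Before the write completes, readers of the same address see nothing new; after the ack, all three copies hold the data.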
The experimental setup uses a synthetic benchmark (FIO) with metadata-like I/O sizes (0.5 KB), Linux (CentOS 8.3), and commodity off-the-shelf hardware. The hardware included 8 single-socket clients, 8 DS servers equipped with eight 128 GB Optane PM 200 modules, and a 200 GbE switch.
Figure 2a shows the latency of synchronous I/O as a function of load (IOPS). Reads were measured to be available to the application within 4 µs. Read latency remains stable as long as the network is not saturated (as also shown in Figure 2b). The performance of single-copy writes is similar to that of reads.
Triple-copy writes, as expected, are slower and are sensitive to the number of writes per second. Writes complete within 10 µs at low to medium loads, but take longer when the DS CPUs are preoccupied and requests are queued. Using CPUs with more cores, even weaker ones (e.g., Arm), may help improve this in the future.
The pmAddr memory-centric results show 2-3 orders of magnitude lower latency than modern storage, proving that PM-centric processing is possible even today, using 2021 off-the-shelf hardware.
Index Terms
- pmAddr: a persistent memory centric computing architecture