pmAddr: a persistent memory centric computing architecture

Extended abstract. Published: 06 June 2022. DOI: 10.1145/3534056.3535000

ABSTRACT

Traditional processor-centric computing architectures do not scale out well because servers do not share their local main memories. To bypass this architectural limitation, programmers place their shared state on shared storage. But since storage is slow (many hundreds of microseconds), they improve performance by duplicating the shared state onto the compute nodes and rely on complex coherence protocols to try to keep all copies in sync. In recent years, memory-centric architectures have been proposed as an alternative.

In memory-centric architectures, the shared state is placed on a shared memory pool that can be accessed with extremely low latency. A good memory-centric architecture should also be elastic, reliable, load-balanced, cheaper than DRAM, thinly provisioned, and support multi-tenancy.

We present an industrial shared-memory solution with all these features. As shown in Figure 1, it comprises: 1) a scale-out persistent memory (PM) pool, 2) random-access client-side libraries, 3) a control plane, and 4) an RDMA fabric. Scale-out application owners use a library called pmAddr to allocate or connect to a shared logical address space. The client library hides most of the complexity: it communicates with the relevant Data Server (DS) using RDMA, and contacts the control plane only when it does not know which DS holds the relevant memory region, or when its speculative destination turns out to be incorrect.
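The extended abstract does not show the pmAddr client API, so the following is a minimal sketch, in C, of what an allocate/connect-then-access flow could look like from the application's point of view. Every identifier here (pmaddr_connect, pmaddr_read, the space name) is a hypothetical stand-in, not the actual library interface; a real client library would issue RDMA operations directly to the owning DS and fall back to the control plane only on a region-lookup miss.

    /* Hypothetical pmAddr client usage sketch -- names are illustrative only. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Stand-in handle for a shared logical address space. A real client      */
    /* library would hold RDMA connection state plus a cache mapping regions  */
    /* of the address space to the Data Servers (DSs) that own them.          */
    typedef struct { char name[64]; } pmaddr_space_t;

    /* Allocate or connect to a named shared address space.                   */
    /* Real flow: query the control plane once for the DS layout, then talk   */
    /* to the DSs directly over RDMA.                                         */
    static int pmaddr_connect(pmaddr_space_t *s, const char *name) {
        snprintf(s->name, sizeof(s->name), "%s", name);
        return 0;
    }

    /* Read `len` bytes at logical offset `off` into `buf`.                   */
    /* Real flow: one-sided RDMA READ from the DS owning the region that      */
    /* contains `off`; no server-side software on this path.                  */
    static int pmaddr_read(pmaddr_space_t *s, uint64_t off, void *buf, size_t len) {
        (void)s; (void)off;
        memset(buf, 0, len);               /* placeholder for the fetched data */
        return 0;
    }

    int main(void) {
        pmaddr_space_t space;
        char buf[512];                     /* 0.5 KB, like the evaluation below */

        if (pmaddr_connect(&space, "shared-state") != 0) return 1;
        if (pmaddr_read(&space, 0, buf, sizeof(buf)) != 0) return 1;
        printf("read %zu bytes from shared space '%s'\n", sizeof(buf), space.name);
        return 0;
    }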

Shared memory, unlike storage, should primarily be optimized for low read latency. pmAddr achieves this by combining zero copy on the client, direct client-server communication (i.e., typically no redundant hops), and no software on the server's read data path. This extremely read-performant design holds for both the 1-copy and 3-copy reliability configurations, and regardless of the number of clients.
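To make the "no software on the server's read data path" point concrete, here is a sketch in C, using libibverbs, of how a client might issue a one-sided RDMA READ against an already connected reliable-connection queue pair. Connection setup, memory registration, and the exchange of the remote address and rkey are assumed to have happened elsewhere; this illustrates the general technique, not pmAddr's actual code.

    /* One-sided RDMA READ sketch (libibverbs). Assumes qp, cq and local_mr were */
    /* created earlier and the remote address/rkey were already exchanged.       */
    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    int rdma_read_remote(struct ibv_qp *qp, struct ibv_cq *cq,
                         struct ibv_mr *local_mr, void *local_buf, uint32_t len,
                         uint64_t remote_addr, uint32_t rkey)
    {
        struct ibv_sge sge = {
            .addr   = (uintptr_t)local_buf,
            .length = len,
            .lkey   = local_mr->lkey,
        };
        struct ibv_send_wr wr, *bad_wr = NULL;
        memset(&wr, 0, sizeof(wr));
        wr.sg_list             = &sge;
        wr.num_sge             = 1;
        wr.opcode              = IBV_WR_RDMA_READ;   /* executed by the remote NIC;    */
        wr.send_flags          = IBV_SEND_SIGNALED;  /* the server CPU is not involved */
        wr.wr.rdma.remote_addr = remote_addr;
        wr.wr.rdma.rkey        = rkey;

        if (ibv_post_send(qp, &wr, &bad_wr))
            return -1;

        /* Busy-poll the completion queue until the READ completes. */
        struct ibv_wc wc;
        int n;
        do {
            n = ibv_poll_cq(cq, 1, &wc);
        } while (n == 0);
        return (n < 0 || wc.status != IBV_WC_SUCCESS) ? -1 : 0;
    }

Because such a READ is served entirely by the server's NIC out of registered memory, adding clients does not add server-side software work on the read path, which is consistent with the stable read latency reported below.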

Shared-memory writes are optimized for low latency in a similar manner, but server software is involved in 3-copy configurations. The primary DS for the given memory region exposes the newly written data to readers, and returns an Ack to the client, only after it has replicated the data and validated that it was successfully written to PM on two other DSs.
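The abstract specifies the ordering of a 3-copy write but not its implementation; the C sketch below captures just that ordering on the primary DS: replicate the new data to two secondary DSs, wait until both confirm it is durable in their PM, and only then make the data visible to readers and ack the client. The helper functions are hypothetical stand-ins for the real RDMA and persistence primitives.

    /* Sketch of a primary DS handling a 3-copy write -- illustrative only. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define NUM_SECONDARIES 2

    /* Hypothetical stand-ins for the real transport/persistence operations. */
    bool replicate_to_secondary(int ds_id, uint64_t off, const void *data, size_t len);
    bool secondary_confirmed_in_pm(int ds_id, uint64_t off);  /* durable in PM? */
    void publish_to_readers(uint64_t off, size_t len);        /* make visible   */
    void ack_client(uint64_t off);

    /* Primary-DS path for a write of `len` bytes at region offset `off`.     */
    /* Readers never see the new data, and the client never receives an Ack,  */
    /* before both secondaries hold the data durably in PM.                   */
    void primary_handle_write(uint64_t off, const void *data, size_t len)
    {
        for (int i = 0; i < NUM_SECONDARIES; i++)
            if (!replicate_to_secondary(i, off, data, len))
                return;                               /* error handling omitted */

        for (int i = 0; i < NUM_SECONDARIES; i++)
            if (!secondary_confirmed_in_pm(i, off))
                return;                               /* error handling omitted */

        publish_to_readers(off, len);
        ack_client(off);
    }

This per-write involvement of the primary's software is also why, as the evaluation shows, triple-copy writes become sensitive to the write rate, while reads and single-copy writes do not.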

The experimental setup uses a synthetic benchmark (FIO) with metadata-like I/O sizes (0.5 KB), Linux (CentOS 8.3), and commodity off-the-shelf hardware. The hardware includes 8 single-socket clients and 8 DS servers equipped with eight 128 GB Optane PM 200 modules, connected by a 200 GbE switch.

Figure 2a shows the latency of synchronous I/O as a function of load (IOPS). Reads were measured to be available to the application within 4 µs. The read latency remains stable as long as the network is not saturated (as also shown in Figure 2b). The performance of single-copy writes is similar to that of reads.

Triple-copy writes are, as expected, slower and sensitive to the number of writes per second. Writes complete within 10 µs at low to medium loads, but take longer when the DS CPUs are preoccupied and requests queue up. Using CPUs with more cores, even if individually weaker ones (e.g., Arm), may help improve this in the future.

The pmAddr memory-centric results show latencies 2-3 orders of magnitude lower than modern storage, and prove that PM-centric processing is possible even today, using 2021 off-the-shelf hardware.

Published in

SYSTOR '22: Proceedings of the 15th ACM International Conference on Systems and Storage
June 2022, 163 pages
ISBN: 9781450393805
DOI: 10.1145/3534056
Publisher: Association for Computing Machinery, New York, NY, United States
Copyright © 2022 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
