μManycore: A Cloud-Native CPU for Tail at Scale

Authors: Jovan Stojkovic, Chunao Liu, Muhammad Shahbaz, Josep Torrellas

ISCA '23: Proceedings of the 50th Annual International Symposium on Computer Architecture

Article No.: 33, Pages 1 - 15

https://doi.org/10.1145/3579371.3589068

Published: 17 June 2023

Abstract

Microservices are emerging as a popular cloud-computing paradigm. Microservice environments execute typically short service requests that interact with one another via remote procedure calls (often across machines), and are subject to stringent tail-latency constraints. In contrast, current processors are designed for traditional monolithic applications. They support global hardware cache coherence, provide large caches, incorporate microarchitecture for long-running, predictable applications (such as advanced prefetching), and are optimized to minimize average latency rather than tail latency.

To address this imbalance, this paper proposes μManycore, an architecture optimized for cloud-native microservice environments. Based on a characterization of microservice applications, μManycore is designed to minimize unnecessary microarchitecture and mitigate overheads to reduce tail latency. Indeed, rather than supporting manycore-wide hardware cache coherence, μManycore has multiple small hardware cache-coherent domains, called villages. Clusters of villages are interconnected with an on-package leaf-spine network, which has many redundant, low-hop-count paths between clusters. To minimize latency overheads, μManycore schedules and queues service requests in hardware, and includes hardware support to save and restore process state when doing a context switch. Our simulation-based results show that μManycore delivers high performance. A cluster of 10 servers with a 1024-core μManycore in each server delivers 3.7× lower average latency, 15.5× higher throughput, and, importantly, 10.4× lower tail latency than a cluster with iso-power conventional server-class multicores. Similarly good results are attained compared to a cluster with power-hungry iso-area conventional server-class multicores.
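
As a rough illustration of why a leaf-spine network offers many redundant, low-hop-count paths, the Python sketch below enumerates the paths between two leaf switches in a generic two-tier leaf-spine topology. It assumes that each cluster attaches to one leaf switch and that every leaf connects to every spine. The function name `leaf_spine_paths` and the switch counts are hypothetical, chosen only for illustration; they are not taken from the paper and say nothing about the actual μManycore network configuration.

```python
def leaf_spine_paths(num_leaves, num_spines, src, dst):
    """Enumerate leaf -> spine -> leaf paths in a generic two-tier leaf-spine network.

    Assumption (not from the paper): every leaf switch links to every spine switch.
    """
    if not (0 <= src < num_leaves and 0 <= dst < num_leaves):
        raise ValueError("src and dst must be valid leaf indices")
    if src == dst:
        # Traffic that stays under one leaf never crosses a spine.
        return [[f"leaf{src}"]]
    # One distinct path per spine switch, each exactly two hops long.
    return [[f"leaf{src}", f"spine{s}", f"leaf{dst}"] for s in range(num_spines)]


# Hypothetical sizes, for illustration only.
NUM_LEAVES = 8
NUM_SPINES = 4

paths = leaf_spine_paths(NUM_LEAVES, NUM_SPINES, src=0, dst=5)
hops = {len(p) - 1 for p in paths}  # number of switch-to-switch hops on each path
print(len(paths), hops)  # prints: 4 {2}
```

Under these assumptions, the number of redundant paths between two leaves equals the number of spines, and every such path has the same two-hop length regardless of which leaves are chosen.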

