DOI: 10.1145/3552326
EuroSys '23: Proceedings of the Eighteenth European Conference on Computer Systems
ACM 2023 Proceeding
Publisher:
  • Association for Computing Machinery
  • New York
  • NY
  • United States
Conference:
EuroSys '23: Eighteenth European Conference on Computer Systems, Rome, Italy, May 8-12, 2023
ISBN:
978-1-4503-9487-1
Published:
08 May 2023

Effective Performance Issue Diagnosis with Value-Assisted Cost Profiling

Diagnosing performance issues is often difficult, especially when they occur only during some program executions. Profilers can help with performance debugging, but are ineffective when the most costly functions are not the root causes of performance ...

research-article
Open Access
Foxhound: Server-Grade Observability for Network-Augmented Applications

There is a growing move to offload functionality, e.g., TCP or key-value stores, into programmable networks - either on SmartNICs or programmable switches. While offloading promises significant performance boosts, these programmable devices often ...

research-article
OFence: Pairing Barriers to Find Concurrency Bugs in the Linux Kernel

Knowing which functions may execute concurrently is key to finding concurrency-related bugs. Existing tools infer the possibility of concurrency using dynamic analysis or by pairing functions that use the same locks. Code that relies on more relaxed ...

Pocket: ML Serving from the Edge

One of the major challenges in serving ML applications is the resource pressure introduced by the underlying ML frameworks. This becomes a bigger problem at resource-constrained, multi-tenant edge server locations, where it is necessary to scale to a ...

Efficient and Safe I/O Operations for Intermittent Systems

Task-based intermittent software systems always re-execute peripheral input/output (I/O) operations upon power failures since tasks have all-or-nothing semantics. Re-executed I/O wastes significant time and energy and risks memory inconsistency. This ...

research-article
ICE: Collaborating Memory and Process Management for User Experience on Resource-limited Mobile Devices

Mobile devices with limited resources are prevalent as they have a relatively low price. Providing a good user experience with limited resources has been a big challenge. This paper found that foreground applications are often unexpectedly interfered ...

research-article
Diagnosing Kernel Concurrency Failures with AITIA

It is notoriously difficult to identify kernel concurrency failures and to diagnose their fundamental reason, the root cause. Kernel concurrency bugs frequently involve challenging patterns such as multi-variable races, data races with asynchronous kernel ...

WAFFLE: Exposing Memory Ordering Bugs Efficiently with Active Delay Injection

Concurrency bugs are difficult to detect, reproduce, and diagnose, as they manifest under rare timing conditions. Recently, active delay injection has proven efficient for exposing one such type of bug, thread-safety violations, with low ...

research-article
Open Access
Model Checking Guided Testing for Distributed Systems

Distributed systems have become the backbone of cloud computing. Incorrect system designs and implementations can greatly impair the reliability of distributed systems. Although a distributed system design modelled in the formal specification can be ...

MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks

We study training of Graph Neural Networks (GNNs) for large-scale graphs. We revisit the premise of using distributed training for billion-scale graphs and show that for graphs that fit in main memory or the SSD of a single machine, out-of-core ...

Accelerating Graph Mining Systems with Subgraph Morphing

Graph mining applications analyze the structural properties of large graphs. These applications are computationally expensive because finding structural patterns requires checking subgraph isomorphism, which is NP-complete. This paper exploits the sub-...

research-article
Open Access
TEA: A General-Purpose Temporal Graph Random Walk Engine

Many real-world graphs are temporal in nature, where the temporal information indicates when a particular edge is changed (e.g., edge insertion and deletion). Performing random walks on such temporal graphs is of paramount value. The state-of-the-art ...

research-article
ALT: Breaking the Wall between Data Layout and Loop Optimizations for Deep Learning Compilation

Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware. Current deep compilers typically predetermine layouts of tensors and then optimize loops of operators. However, such unidirectional and ...

REFL: Resource-Efficient Federated Learning

Federated Learning (FL) enables distributed training by learners using local data, thereby enhancing privacy and reducing communication. However, it presents numerous challenges relating to the heterogeneity of the data distribution, device ...

research-article
Tabi: An Efficient Multi-Level Inference System for Large Language Models

Today's trend of building ever larger language models (LLMs), while pushing the performance of natural language processing, adds significant latency to the inference stage. We observe that due to the diminishing returns of adding parameters to LLMs, a ...

Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access

As deep learning (DL) inference has been widely adopted for building user-facing applications in many domains, it is increasingly important for DL inference servers to achieve high throughput while preserving bounded latency. DL inference requests can ...

DiLOS: Do Not Trade Compatibility for Performance in Memory Disaggregation

Memory disaggregation has replaced the landscape of datacenters by physically separating compute and memory nodes, achieving improved utilization. As early efforts, kernel paging-based approaches offer transparent virtual memory abstraction for ...

research-article
vTMM: Tiered Memory Management for Virtual Machines

The memory demand of virtual machines (VMs) is increasing, while the traditional DRAM-only memory system has limited capacity and high power consumption. The tiered memory system can effectively expand the memory capacity and increase the cost ...

research-article
Making Dynamic Page Coalescing Effective on Virtualized Clouds

Using huge pages has become a mainstream method to reduce address translation overhead for big memory workloads in modern computer systems. To create huge pages, system software usually uses page coalescing methods to dynamically combine contiguous ...

research-article
Open Access
Omni-Paxos: Breaking the Barriers of Partial Connectivity

Omni-Paxos is a system for state machine replication that is completely resilient to partial network partitions, a major source of service disruptions in recent years. Omni-Paxos achieves its resilience through a decoupled design that separates the ...

research-article
Open Access
CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections

There is a fundamental tension between metadata scalability and POSIX semantics within distributed file systems. The bottleneck lies in the coordination, mainly locking, used for ensuring strong metadata consistency, namely, atomicity and isolation. ...

OLPart: Online Learning based Resource Partitioning for Colocating Multiple Latency-Critical Jobs on Commodity Computers

Colocating multiple jobs on the same server has been a commonly used approach for improving resource utilization in cloud environments. However, performance interference due to the contention over shared resources makes resource partitioning an ...

research-article
Palette Load Balancing: Locality Hints for Serverless Functions

Function-as-a-Service (FaaS) serverless computing enables a simple programming model with almost unbounded elasticity. Unfortunately, current FaaS platforms achieve this flexibility at the cost of lower performance for data-intensive applications ...

With Great Freedom Comes Great Opportunity: Rethinking Resource Allocation for Serverless Functions

Current serverless offerings give users limited flexibility for configuring the resources allocated to their function invocations. This simplifies the interface for users to deploy serverless computations but creates deployments that are resource ...

Groundhog: Efficient Request Isolation in FaaS

Security is a core responsibility for Function-as-a-Service (FaaS) providers. The prevailing approach isolates concurrent executions of functions in separate containers. However, successive invocations of the same function commonly reuse the runtime ...

research-article
Understanding and Optimizing Workloads for Unified Resource Management in Large Cloud Platforms

To fully utilize computing resources, cloud providers such as Google and Alibaba choose to co-locate online services with batch processing applications in their data centers. By implementing unified resource management policies, different types of ...

Fail through the Cracks: Cross-System Interaction Failures in Modern Cloud Systems

Modern cloud systems are orchestrations of independent and interacting (sub-)systems, each specializing in important services (e.g., data processing, storage, resource management, etc.). Hence, cloud system reliability is affected not only by the ...

LogGrep: Fast and Cheap Cloud Log Storage by Exploiting both Static and Runtime Patterns

In cloud systems, near-line logs are mainly used for debugging, which means they prefer a low query latency for a better user experience, and like any other logs, they also prefer a low overall cost including storage cost to store compressed logs and ...

research-article
Open Access
Aggregate VM: Why Reduce or Evict VM's Resources When You Can Borrow Them From Other Nodes?

Hardware resource fragmentation is a common issue in data centers. Traditional solutions based on migration or overcommitment are unacceptably slow, and modern commercial or research solutions like Spot VM may reduce or evict VM's resources anytime. ...

R2C: AOCR-Resilient Diversity with Reactive and Reflective Camouflage

Address-oblivious code reuse, AOCR for short, poses a substantial security risk, as it remains unchallenged. If neglected, adversaries have a reliable way to attack systems, offering an operational and profitable strategy. AOCR's authors conclude that ...

Contributors
  • The University of British Columbia
  • Microsoft Research
  • Sapienza University of Rome
  • Sapienza University of Rome

Acceptance Rates

Overall Acceptance Rate: 241 of 1,308 submissions, 18%

Year          Submitted  Accepted  Rate
EuroSys '21         181        38   21%
EuroSys '20         234        43   18%
EuroSys '18         262        43   16%
EuroSys '16         180        38   21%
EuroSys '14         147        27   18%
EuroSys '13         143        28   20%
EuroSys '11         161        24   15%
Overall           1,308       241   18%