DOI: 10.1145/3642970
EuroMLSys '24: Proceedings of the 4th Workshop on Machine Learning and Systems
ACM 2024 Proceeding
Publisher: Association for Computing Machinery, New York, NY, United States
Conference: EuroSys '24: Nineteenth European Conference on Computer Systems, Athens, Greece, 22 April 2024
ISBN: 979-8-4007-0541-0
Published: 22 April 2024
Abstract

No abstract available.

research-article
Open Access
GuaranTEE: Towards Attestable and Private ML with CCA

Machine-learning (ML) models are increasingly being deployed on edge devices to provide a variety of services. However, their deployment is accompanied by challenges in model privacy and auditability. Model providers want to ensure that (i) their ...

research-article
Open Access
IA2: Leveraging Instance-Aware Index Advisor with Reinforcement Learning for Diverse Workloads

This study introduces the Instance-Aware Index Advisor (IA2), a novel deep reinforcement learning (DRL)-based approach for optimizing index selection in databases facing large action spaces of potential candidates. IA2 introduces the Twin Delayed Deep ...

research-article
Free
Temporal Graph Generative Models: An empirical study

Graph Neural Networks (GNNs) have recently emerged as popular methods for learning representations of non-Euclidean data often encountered in diverse areas ranging from chemistry to source code generation. Recently, researchers have focused on learning ...

research-article
Open Access
Deploying Stateful Network Functions Efficiently using Large Language Models

Stateful network functions are increasingly used in data centers. However, their scalability remains a significant challenge since parallelizing packet processing across multiple cores requires careful configuration to avoid compromising the application'...

research-article
Free
The Importance of Workload Choice in Evaluating LLM Inference Systems

The success of Large Language Models (LLMs) across a wide range of applications and use cases has created the need for faster and more scalable systems for LLM inference. These systems speed up LLM inference by optimizing scheduling decisions or ...

research-article
Free
Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs

GPUs are becoming a scarce resource in high demand, as many teams build and train increasingly advanced artificial intelligence workloads. As GPUs become more performant, they consume more energy, with NVIDIA's latest A100 and H100 graphics cards ...

research-article
Free
ALS Algorithm for Robust and Communication-Efficient Federated Learning

Federated learning is a distributed approach to machine learning in which a centralised server coordinates the learning task while training data is distributed among a potentially large set of clients. The focus of this paper is on top-N recommendations ...

research-article
Free
SpeedyLoader: Efficient Pipelining of Data Preprocessing and Machine Learning Training

Data preprocessing, consisting of tasks like sample resizing, cropping, and filtering, is a crucial step in machine learning (ML) workflows. Even though the preprocessing step is largely ignored by work that focuses on optimizing training algorithms, in ...

research-article
Free
Towards Low-Energy Adaptive Personalization for Resource-Constrained Devices

The personalization of machine learning (ML) models to address data drift is a significant challenge in the context of Internet of Things (IoT) applications. Presently, most approaches focus on fine-tuning either the full base model or its last few ...

research-article
Open Access
An Analysis of Collocation on GPUs for Deep Learning Training

Deep learning training is an expensive process that extensively uses GPUs. However, not all model training saturates modern powerful GPUs. To create guidelines for such cases, this paper examines the performance of the different collocation methods ...

research-article
Free
Priority Sampling of Large Language Models for Compilers

Large Language Models show great potential in generating and optimizing code. Widely used sampling methods such as Nucleus Sampling increase the diversity of generation but often produce repeated samples for low temperatures and incoherent samples for ...

research-article
Free
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving

Although prior work on batched inference and parameter-efficient fine-tuning techniques has reduced the resource requirements of large language models (LLMs), challenges remain in resource-constrained environments such as on-premise infrastructures ...

research-article
Free
ML Training with Cloud GPU Shortages: Is Cross-Region the Answer?

The widespread adoption of ML has led to a high demand for GPU hardware and consequently, severe shortages of GPUs in the public cloud. Allocating a sufficient number of GPUs to train or fine-tune today's large ML models in a single cloud region is often ...

research-article
Free
ALTO: An Efficient Network Orchestrator for Compound AI Systems

We present ALTO, a network orchestrator for efficiently serving compound AI systems such as pipelines of language models. ALTO leverages an optimization opportunity specific to generative language models, which is streaming intermediate outputs from the ...

research-article
Free
FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission

Communication overhead is a significant bottleneck in federated learning (FL), which has been exacerbated by the increasing size of AI models. In this paper, we propose FedRDMA, a communication-efficient cross-silo FL system that integrates RDMA into ...

research-article
Open Access
De-DSI: Decentralised Differentiable Search Index

This study introduces De-DSI, a novel framework that fuses large language models (LLMs) with genuine decentralization for information retrieval, particularly employing the differentiable search index (DSI) concept in a decentralized setting. Focused on ...

research-article
Free
Towards Pareto Optimal Throughput in Small Language Model Serving

Large language models (LLMs) have revolutionized the state-of-the-art of many different natural language processing tasks. Although serving LLMs is computationally and memory demanding, the rise of Small Language Models (SLMs) offers new opportunities ...

research-article
Free
Do Predictors for Resource Overcommitment Even Predict?

Resource overcommitment allows datacenters to improve resource efficiency. In this approach, the system allocates to users the amount of resources they are most likely to use, not necessarily the amount requested. To do so, the system monitors resource ...

research-article
Free
A Hybrid Decentralised Learning Topology for Recommendations with Improved Privacy

Many recent studies have investigated the extent to which decentralised topologies for machine learning can preserve privacy, showing that in various scenarios the exchanged model updates can leak user information. In this work, we analyse the privacy ...

research-article
Free
Evaluating Deep Learning Recommendation Model Training Scalability with the Dynamic Opera Network

Deep learning is commonly used to make personalized recommendations to users for a wide variety of activities. However, deep learning recommendation model (DLRM) training is increasingly dominated by all-to-all and many-to-many communication patterns. ...

research-article
Free
Comparative Profiling: Insights Into Latent Diffusion Model Training

Generative AI models are at the forefront of advancing creative and analytical tasks, pushing the boundaries of what machines can generate and comprehend. Among these, latent diffusion models represent significant advancements in generating high-fidelity ...

research-article
Free
Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling

Mobile and IoT applications increasingly adopt deep learning inference to provide intelligence. Inference requests are typically sent to a cloud infrastructure over a wireless network that is highly variable, leading to the challenge of dynamic Service ...

research-article
Free
Navigating Challenges and Technical Debt in Large Language Models Deployment

Large Language Models (LLMs) have become an essential tool in advancing artificial intelligence and machine learning, enabling outstanding capabilities in natural language processing and understanding. However, the efficient deployment of LLMs in ...

research-article
Open Access
The Environmental Cost of Engineering Machine Learning-Enabled Systems: A Mapping Study

The integration of Machine Learning (ML) across public and industrial sectors has become widespread, posing unique challenges in comparison to conventional software development methods throughout the lifecycle of ML-Enabled Systems. Particularly, with ...

research-article
Free
Enhancing Named Entity Recognition for Agricultural Commodity Monitoring with Large Language Models

Agriculture, as one of humanity's most essential industries, faces the challenge of adapting to an increasingly data-driven world. Strategic decisions in this sector hinge on access to precise and actionable data.

Governments, major agriculture companies,...


Acceptance Rates

Overall Acceptance Rate: 18 of 26 submissions, 69%

Year           Submitted  Accepted  Rate
EuroMLSys '21  26         18        69%
Overall        26         18        69%