Proceeding Downloads
GuaranTEE: Towards Attestable and Private ML with CCA
Machine-learning (ML) models are increasingly being deployed on edge devices to provide a variety of services. However, their deployment is accompanied by challenges in model privacy and auditability. Model providers want to ensure that (i) their ...
IA2: Leveraging Instance-Aware Index Advisor with Reinforcement Learning for Diverse Workloads
This study introduces the Instance-Aware Index Advisor (IA2), a novel deep reinforcement learning (DRL)-based approach for optimizing index selection in databases facing large action spaces of potential candidates. IA2 introduces the Twin Delayed Deep ...
Temporal Graph Generative Models: An empirical study
Graph Neural Networks (GNNs) have recently emerged as popular methods for learning representations of non-Euclidean data often encountered in diverse areas ranging from chemistry to source code generation. Recently, researchers have focused on learning ...
Deploying Stateful Network Functions Efficiently using Large Language Models
Stateful network functions are increasingly used in data centers. However, their scalability remains a significant challenge since parallelizing packet processing across multiple cores requires careful configuration to avoid compromising the application'...
The Importance of Workload Choice in Evaluating LLM Inference Systems
The success of Large Language Models (LLMs) across a wide range of applications and use cases has created the need for faster and more scalable systems for LLM inference. These systems speed up LLM inference by optimizing scheduling decisions or ...
Characterizing Training Performance and Energy for Foundation Models and Image Classifiers on Multi-Instance GPUs
- Connor Espenshade,
- Rachel Peng,
- Eumin Hong,
- Max Calman,
- Yue Zhu,
- Pritish Parida,
- Eun Kyung Lee,
- Martha A. Kim
GPUs are becoming a scarce resource in high demand, as many teams build and train increasingly advanced artificial intelligence workloads. As GPUs become more performant, they consume more energy, with NVIDIA's latest A100 and H100 graphics cards ...
ALS Algorithm for Robust and Communication-Efficient Federated Learning
- Neil Hurley,
- Erika Duriakova,
- James Geraci,
- Diarmuid O'Reilly-Morgan,
- Elias Tragos,
- Barry Smyth,
- Aonghus Lawlor
Federated learning is a distributed approach to machine learning in which a centralised server coordinates the learning task while training data is distributed among a potentially large set of clients. The focus of this paper is on top-N recommendations ...
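The alternating least squares (ALS) update underlying this line of work can be sketched in a few lines. The following is a rank-1 toy version on a small dense ratings matrix, not the paper's federated or communication-efficient variant; all names and parameter values are illustrative:

```python
def als_rank1(ratings, steps=20, reg=0.1):
    """Rank-1 alternating least squares on a ratings matrix.

    ratings[u][i] holds the observed rating, or None if missing.
    Each half-step fixes one factor vector and solves the other
    in closed form (trivial at rank 1)."""
    n_users, n_items = len(ratings), len(ratings[0])
    x = [1.0] * n_users  # user factors
    y = [1.0] * n_items  # item factors
    for _ in range(steps):
        # Fix item factors, solve each user factor.
        for u in range(n_users):
            obs = [i for i in range(n_items) if ratings[u][i] is not None]
            num = sum(y[i] * ratings[u][i] for i in obs)
            den = reg + sum(y[i] ** 2 for i in obs)
            x[u] = num / den
        # Fix user factors, solve each item factor.
        for i in range(n_items):
            obs = [u for u in range(n_users) if ratings[u][i] is not None]
            num = sum(x[u] * ratings[u][i] for u in obs)
            den = reg + sum(x[u] ** 2 for u in obs)
            y[i] = num / den
    return x, y

def predict(x, y, u, i):
    """Predicted rating is the product of the two scalar factors."""
    return x[u] * y[i]
```

In the federated setting studied here, the item-factor update would be computed from client contributions rather than centrally; this sketch only shows the alternating structure itself.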
SpeedyLoader: Efficient Pipelining of Data Preprocessing and Machine Learning Training
Data preprocessing, consisting of tasks like sample resizing, cropping, and filtering, is a crucial step in machine learning (ML) workflows. Even though the preprocessing step is largely ignored by work that focuses on optimizing training algorithms, in ...
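The overlap this work targets can be illustrated with a minimal producer-consumer sketch: a preprocessing thread feeds a bounded queue while the training loop consumes from it. The `preprocess` stand-in and all names here are placeholders, not SpeedyLoader's API:

```python
import queue
import threading

def preprocess(sample):
    # Stand-in for real work such as resizing, cropping, or filtering.
    return sample * sample

def run_pipeline(samples, capacity=4):
    """Overlap preprocessing (producer thread) with training (consumer).

    The bounded queue lets preprocessing run ahead of training by at
    most `capacity` samples, so neither side blocks the other for long."""
    q = queue.Queue(maxsize=capacity)
    SENTINEL = object()

    def producer():
        for s in samples:
            q.put(preprocess(s))
        q.put(SENTINEL)  # signal end of the dataset

    threading.Thread(target=producer, daemon=True).start()
    processed = []
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        processed.append(item)  # a real loop would run a training step here
    return processed
```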
Towards Low-Energy Adaptive Personalization for Resource-Constrained Devices
The personalization of machine learning (ML) models to address data drift is a significant challenge in the context of Internet of Things (IoT) applications. Presently, most approaches focus on fine-tuning either the full base model or its last few ...
An Analysis of Collocation on GPUs for Deep Learning Training
Deep learning training is an expensive process that extensively uses GPUs. However, not all model training saturates modern powerful GPUs. To create guidelines for such cases, this paper examines the performance of the different collocation methods ...
Priority Sampling of Large Language Models for Compilers
Large Language Models show great potential in generating and optimizing code. Widely used sampling methods such as Nucleus Sampling increase the diversity of generation but often produce repeated samples for low temperatures and incoherent samples for ...
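For reference, the Nucleus Sampling baseline the abstract contrasts against keeps the smallest set of highest-probability tokens whose cumulative probability reaches a threshold p, then samples from that renormalized set. A minimal sketch (function names are ours, and this is not the paper's Priority Sampling method):

```python
import random

def nucleus_filter(probs, p):
    """Indices of the smallest set of tokens whose cumulative
    probability reaches p (the 'nucleus'), highest-probability first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for i in order:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return nucleus

def nucleus_sample(probs, p, rng=random):
    """Sample a token index from the renormalized nucleus."""
    keep = nucleus_filter(probs, p)
    total = sum(probs[i] for i in keep)
    return rng.choices(keep, weights=[probs[i] / total for i in keep])[0]
```

At low temperature the nucleus collapses to a few tokens, which is where the repeated samples the abstract mentions come from.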
Deferred Continuous Batching in Resource-Efficient Large Language Model Serving
Although prior work on batched inference and parameter-efficient fine-tuning has reduced the resource requirements of large language models (LLMs), challenges remain in resource-constrained environments such as on-premise infrastructures ...
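Continuous batching, which this work builds on, admits waiting requests the moment a slot frees up instead of waiting for the whole batch to drain. A toy scheduler sketch, where token counts stand in for real decoding; nothing here is the paper's deferred variant:

```python
from collections import deque

def continuous_batching(requests, max_batch=2):
    """requests: list of (id, n_tokens). Each step decodes one token for
    every request in the batch; finished requests leave immediately and
    are replaced from the waiting queue (continuous batching)."""
    waiting = deque(requests)
    batch = {}  # request id -> tokens remaining
    steps = 0
    completed = []
    while waiting or batch:
        # Admit new requests as soon as slots free up.
        while waiting and len(batch) < max_batch:
            rid, n = waiting.popleft()
            batch[rid] = n
        steps += 1
        for rid in list(batch):
            batch[rid] -= 1
            if batch[rid] == 0:
                completed.append(rid)
                del batch[rid]
    return steps, completed
```

With requests of 1, 3, and 1 tokens and a batch size of 2, this finishes in 3 steps, whereas static batching (run {a, b} to completion, then {c}) would take 4.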
ML Training with Cloud GPU Shortages: Is Cross-Region the Answer?
The widespread adoption of ML has led to a high demand for GPU hardware and consequently, severe shortages of GPUs in the public cloud. Allocating a sufficient number of GPUs to train or fine-tune today's large ML models in a single cloud region is often ...
ALTO: An Efficient Network Orchestrator for Compound AI Systems
- Keshav Santhanam,
- Deepti Raghavan,
- Muhammad Shahir Rahman,
- Thejas Venkatesh,
- Neha Kunjal,
- Pratiksha Thaker,
- Philip Levis,
- Matei Zaharia
We present ALTO, a network orchestrator for efficiently serving compound AI systems such as pipelines of language models. ALTO leverages an optimization opportunity specific to generative language models, which is streaming intermediate outputs from the ...
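The streaming opportunity can be illustrated with plain generators: a downstream stage consumes tokens as the upstream stage emits them, rather than waiting for the full upstream output. This is only a sketch of the idea, not ALTO's actual interface:

```python
def stage_one(prompt):
    """First 'model' in the pipeline: emits tokens one at a time."""
    for word in prompt.split():
        yield word.upper()

def stage_two(token_stream):
    """Second stage starts processing each token as it arrives,
    instead of blocking until the whole upstream output is ready."""
    for token in token_stream:
        yield f"<{token}>"

def run(prompt):
    # Chaining generators gives per-token streaming between stages.
    return list(stage_two(stage_one(prompt)))
```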
FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission
Communication overhead is a significant bottleneck in federated learning (FL), which has been exacerbated with the increasing size of AI models. In this paper, we propose FedRDMA, a communication-efficient cross-silo FL system that integrates RDMA into ...
De-DSI: Decentralised Differentiable Search Index
This study introduces De-DSI, a novel framework that fuses large language models (LLMs) with genuine decentralization for information retrieval, particularly employing the differentiable search index (DSI) concept in a decentralized setting. Focused on ...
Towards Pareto Optimal Throughput in Small Language Model Serving
- Pol G. Recasens,
- Yue Zhu,
- Chen Wang,
- Eun Kyung Lee,
- Olivier Tardieu,
- Alaa Youssef,
- Jordi Torres,
- Josep Ll. Berral
Large language models (LLMs) have revolutionized the state of the art in many different natural language processing tasks. Although serving LLMs is computationally and memory demanding, the rise of Small Language Models (SLMs) offers new opportunities ...
Do Predictors for Resource Overcommitment Even Predict?
Resource overcommitment allows datacenters to improve resource efficiency. In this approach, the system allocates to users the amount of resources they are most likely to use, not necessarily the amount requested. To do so, the system monitors resource ...
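A common baseline for such predictors is to allocate a high percentile of recent usage plus some headroom, rather than the full request. A minimal sketch, assuming nearest-rank percentiles; the function names and the headroom factor are illustrative, not from the paper:

```python
def percentile(values, q):
    """Nearest-rank q-th percentile (q in 0..100) of a non-empty list.

    Integer ceiling division avoids floating-point rank errors."""
    s = sorted(values)
    rank = -(-q * len(s) // 100)
    return s[max(0, rank - 1)]

def predict_allocation(usage_history, q=95, headroom=1.1):
    """Overcommitment-style allocation: the q-th percentile of recent
    usage, scaled by a safety headroom, instead of the user's request."""
    return percentile(usage_history, q) * headroom
```

The paper's question is precisely whether predictors of this general shape actually track future usage; the sketch just shows the mechanism being evaluated.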
A Hybrid Decentralised Learning Topology for Recommendations with Improved Privacy
- Diarmuid O'Reilly Morgan,
- Elias Tragos,
- James Geraci,
- Qinqin Wang,
- Neil Hurley,
- Barry Smyth,
- Aonghus Lawlor
Many recent studies have investigated the extent to which decentralised topologies for machine learning can preserve privacy, showing that in various scenarios the exchanged model updates can leak user information. In this work, we analyse the privacy ...
Evaluating Deep Learning Recommendation Model Training Scalability with the Dynamic Opera Network
Deep learning is commonly used to make personalized recommendations to users for a wide variety of activities. However, deep learning recommendation model (DLRM) training is increasingly dominated by all-to-all and many-to-many communication patterns. ...
Comparative Profiling: Insights Into Latent Diffusion Model Training
Generative AI models are at the forefront of advancing creative and analytical tasks, pushing the boundaries of what machines can generate and comprehend. Among these, latent diffusion models represent significant advancements in generating high-fidelity ...
Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling
Mobile and IoT applications increasingly adopt deep learning inference to provide intelligence. Inference requests are typically sent to a cloud infrastructure over a wireless network that is highly variable, leading to the challenge of dynamic Service ...
Navigating Challenges and Technical Debt in Large Language Models Deployment
Large Language Models (LLMs) have become an essential tool in advancing artificial intelligence and machine learning, enabling outstanding capabilities in natural language processing and understanding. However, the efficient deployment of LLMs in ...
The Environmental Cost of Engineering Machine Learning-Enabled Systems: A Mapping Study
The integration of Machine Learning (ML) across public and industrial sectors has become widespread, posing unique challenges in comparison to conventional software development methods throughout the lifecycle of ML-Enabled Systems. Particularly, with ...
Enhancing Named Entity Recognition for Agricultural Commodity Monitoring with Large Language Models
Agriculture, as one of humanity's most essential industries, faces the challenge of adapting to an increasingly data-driven world. Strategic decisions in this sector hinge on access to precise and actionable data. Governments, major agriculture companies,...
Acceptance Rates
| Year | Submitted | Accepted | Rate |
|---|---|---|---|
| EuroMLSys '21 | 26 | 18 | 69% |
| Overall | 26 | 18 | 69% |