DOI: 10.1145/3629527.3651404

LLaMPS: Large Language Models Placement System

Published: 07 May 2024

Abstract

The rapid expansion of Large Language Models (LLMs) presents significant challenges for efficient inference deployment, primarily due to their substantial memory and computational requirements. Many enterprises possess a variety of computing resources (servers, VMs, PCs, laptops) that cannot individually host a complete LLM. Collectively, however, these resources may be adequate for even the most demanding LLMs. We introduce LLaMPS, a novel tool designed to optimally distribute blocks [1] of LLMs across the computing resources available within an enterprise. LLaMPS leverages the unused capacity of these machines, enabling decentralized hosting of LLMs. The tool lets users contribute their machine's resources to a shared pool, allowing others on the network to access those resources for inference tasks. At its core, LLaMPS employs a distributed framework to allocate the transformer blocks of an LLM across multiple servers. Where a model is already deployed, users can directly access inference results via a GUI or API. Our tool has undergone extensive testing with several open-source LLMs, including BLOOM-560m, BLOOM-3b, BLOOM-7b1, Falcon-40b, and LLaMA-70b. It is currently deployed in a real-world enterprise network, demonstrating its practical applicability and effectiveness.

Reference

[1] Ravi Kumar Singh, Likhit Bandamudi, Shruti Kunde, Mayank Mishra, and Rekha Singhal. 2024. Leftovers for LlaMA. In International Conference on Performance Engineering (ICPE), accepted.


Published In

ICPE '24 Companion: Companion of the 15th ACM/SPEC International Conference on Performance Engineering
May 2024
305 pages
ISBN:9798400704451
DOI:10.1145/3629527
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. distributed inference
  2. llms
  3. optimal block placement

Qualifiers

  • Abstract

Conference

ICPE '24

Acceptance Rates

Overall Acceptance Rate 252 of 851 submissions, 30%



Article Metrics

  • Total Citations: 0
  • Total Downloads: 65
  • Downloads (last 12 months): 65
  • Downloads (last 6 weeks): 15

Reflects downloads up to 08 Mar 2025
