DOI: 10.1145/3629527.3651436

Performance Optimization in the LLM World 2024

Published: 07 May 2024

Abstract

The popularity and adoption of large language models (LLMs) such as ChatGPT have grown rapidly. LLM pre-training is expensive: ChatGPT is estimated to cost over $700,000 per day to operate, and using GPT-4 to support customer service can cost a small business over $21,000 a month. The high infrastructure and financial costs, coupled with the specialized talent required, make LLM technology inaccessible to most organizations. For instance, the up-front costs include the emissions generated to manufacture the relevant hardware and the cost to run that hardware during the training procedure, both while the machines are operating at full capacity and while they are not. The best estimate of the dynamic computing cost for GPT-3, the model behind the original ChatGPT, is approximately 1,287,000 kWh, or 552 tons of carbon dioxide. The goal of this workshop is to address the urgency of reducing the energy consumption of LLM applications by bringing together researchers from academia and industry to share their experience and insights in performance engineering in the LLM world.
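
To put the abstract's figures in perspective, the short sketch below derives the grid carbon intensity implied by the two GPT-3 numbers quoted above (1,287,000 kWh and 552 tons of CO2) and uses it to scale emissions estimates with training energy. This is a back-of-the-envelope illustration, not a method from the workshop; the intensity factor is an assumption recovered from the abstract's own figures.

```python
# Back-of-the-envelope check of the GPT-3 training figures quoted in the
# abstract. The carbon-intensity factor below is *derived* from those two
# numbers, not taken from any cited methodology.

TRAINING_ENERGY_KWH = 1_287_000   # estimated dynamic compute energy for GPT-3
REPORTED_EMISSIONS_TONS = 552     # reported metric tons of CO2

# Implied grid carbon intensity in kg CO2 per kWh.
implied_intensity = REPORTED_EMISSIONS_TONS * 1000 / TRAINING_ENERGY_KWH
print(f"Implied carbon intensity: {implied_intensity:.3f} kg CO2/kWh")  # ~0.429

def emissions_tons(energy_kwh: float,
                   intensity_kg_per_kwh: float = implied_intensity) -> float:
    """Estimate training emissions (metric tons of CO2) for an energy budget."""
    return energy_kwh * intensity_kg_per_kwh / 1000

# Sanity check: recovers the abstract's 552-ton figure.
print(f"{emissions_tons(TRAINING_ENERGY_KWH):.0f} tons")
```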

Cited By

  • (2024) Parameter-efficient fine-tuning of large language models using semantic knowledge tuning. Scientific Reports 14(1). DOI: 10.1038/s41598-024-75599-4. Online publication date: 28-Dec-2024.

Index Terms

  1. Performance Optimization in the LLM World 2024

    Information

    Published In

    ICPE '24 Companion: Companion of the 15th ACM/SPEC International Conference on Performance Engineering
    May 2024
    305 pages
    ISBN:9798400704451
    DOI:10.1145/3629527
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. hardware optimization
    2. large language model
    3. llm
    4. software optimization
    5. system performance

    Qualifiers

    • Abstract

    Conference

    ICPE '24

    Acceptance Rates

    Overall Acceptance Rate 252 of 851 submissions, 30%



    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 307
    • Downloads (last 6 weeks): 24
    Reflects downloads up to 01 Mar 2025

