DOI: 10.1145/3600061.3603136
Poster

Training ChatGPT-like Models with In-network Computation

Published: 05 September 2023

Abstract

ChatGPT demonstrates the enormous potential of large language models (LLMs). These models easily reach billions of parameters, which puts training beyond the means of most practitioners. We propose a paradigm for training LLMs using distributed in-network computation on routers. Our preliminary results show that our design allows LLMs to be trained at a reasonable training rate without demanding extensive GPU resources.
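
The poster does not include an implementation, but the author tags point to pipeline parallelism as the underlying technique. As a rough, hypothetical sketch (in PyTorch; the two-stage split, dimensions, and names are illustrative assumptions, not the paper's design), a model can be cut into sequential stages and micro-batches streamed through them, with each stage standing in for one in-network device:

    import torch
    import torch.nn as nn

    # Hypothetical sketch of pipeline-parallel training; nothing here is
    # taken from the paper. Each stage stands in for one in-network device.
    DIM, MICRO_BATCHES = 64, 4

    # Split one model into two sequential stages. In an in-network design,
    # each stage would run on a different device along the routing path.
    stage0 = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU())
    stage1 = nn.Sequential(nn.Linear(DIM, DIM), nn.ReLU(), nn.Linear(DIM, 1))

    opt = torch.optim.SGD(
        list(stage0.parameters()) + list(stage1.parameters()), lr=1e-2
    )

    x = torch.randn(32, DIM)  # one global batch of toy data
    y = torch.randn(32, 1)

    opt.zero_grad()
    for xb, yb in zip(x.chunk(MICRO_BATCHES), y.chunk(MICRO_BATCHES)):
        act = stage0(xb)   # stage boundary: activations would cross the network here
        out = stage1(act)
        loss = nn.functional.mse_loss(out, yb)
        loss.backward()    # gradients flow back across the same boundary
    opt.step()             # one parameter update after all micro-batches

Micro-batching is what makes a pipeline worthwhile: while one micro-batch occupies a later stage, the next can occupy an earlier one, so the devices along the path stay busy instead of idling.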



Published In

APNet '23: Proceedings of the 7th Asia-Pacific Workshop on Networking
June 2023
229 pages
ISBN: 9798400707827
DOI: 10.1145/3600061
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. ChatGPT
  2. In-network Computation
  3. Large Language Model
  4. Pipeline Parallelism

Qualifiers

  • Poster
  • Research
  • Refereed limited

Conference

APNet 2023: 7th Asia-Pacific Workshop on Networking
June 29 - 30, 2023
Hong Kong, China

Acceptance Rates

Overall Acceptance Rate 50 of 118 submissions, 42%

