Thorough Characterization and Analysis of Large Transformer Model Training At-Scale
Abstract
Large transformer models have recently achieved great success across various domains. With a growing number of model parameters, training a large transformer model today typically involves model sharding, data parallelism, and model parallelism. Thus, ...
Published in SIGMETRICS/PERFORMANCE '24: Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- Research-article
Article Metrics
- Total Citations: 0
- Total Downloads: 674
- Downloads (Last 12 months): 568
- Downloads (Last 6 weeks): 32