Poster · DOI: 10.1145/3437801.3441624

Dynamic scaling for low-precision learning

Published: 17 February 2021

Abstract

In recent years, distributed deep learning has become popular in both industry and academia. Although researchers want to use distributed systems for training, the communication cost of synchronizing gradients has been reported to be a bottleneck. Communicating gradients at low precision is a promising technique for reducing the bandwidth requirement. In this work, we propose Auto Precision Scaling (APS), an algorithm that improves accuracy when gradients are communicated as low-precision floating-point values. APS improves accuracy at every precision with only a trivial communication cost. Our experimental results show that, for both image classification and segmentation, APS can train state-of-the-art models with 8-bit floating-point gradients at no or only a tiny accuracy loss (<0.05%). Furthermore, we can avoid any accuracy loss by designing a hybrid-precision technique. Finally, we propose a performance model to evaluate the proposed method; our experimental results show that APS achieves a significant speedup over the state-of-the-art method. To make this available to researchers and developers, we design and implement a high-performance system for customized-precision deep learning (CPD), which can simulate the training process using an arbitrary low-precision customized floating-point format. We integrate CPD into PyTorch and release it as open source.
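
To make the two ideas above concrete, the sketch below shows, in PyTorch, (a) how a CPD-style simulator can round a tensor to a customized floating-point format with a chosen exponent and mantissa width, and (b) how an APS-style round trip can rescale a gradient into that format's representable range before communication and undo the scale afterwards. The function names, the default format parameters, and the power-of-two scaling heuristic are illustrative assumptions of ours; the paper's actual algorithm and CPD's implementation may differ in details such as subnormal and special-value handling.

```python
import torch

def quantize_custom_float(x: torch.Tensor, exp_bits: int = 4, man_bits: int = 3) -> torch.Tensor:
    """Round `x` to a simulated floating-point format with the given exponent
    and mantissa widths (sign bit implied). Subnormals and NaN/Inf encodings
    are ignored for brevity -- this is a sketch, not CPD's implementation."""
    bias = 2 ** (exp_bits - 1) - 1
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** bias   # largest representable magnitude
    min_normal = 2.0 ** (1 - bias)                     # smallest normal magnitude
    sign, mag = x.sign(), x.abs()
    exp = torch.floor(torch.log2(mag.clamp(min=min_normal)))
    step = 2.0 ** (exp - man_bits)                     # quantization step within each binade
    q = torch.round(mag / step) * step
    q = torch.where(mag < min_normal / 2, torch.zeros_like(q), q)  # flush underflow to zero
    return sign * q.clamp(max=max_val)

def scaled_lowp_gradient(grad: torch.Tensor, world_size: int = 1,
                         exp_bits: int = 4, man_bits: int = 3) -> torch.Tensor:
    """APS-style round trip: pick a power-of-two scale that moves the largest
    gradient magnitude near the top of the low-precision range (leaving
    headroom for the sum over `world_size` workers), quantize, then undo the
    scale. The scale is one full-precision scalar per tensor, so the extra
    communication cost is trivial."""
    bias = 2 ** (exp_bits - 1) - 1
    max_val = (2.0 - 2.0 ** -man_bits) * 2.0 ** bias
    g_max = grad.abs().max().clamp(min=1e-30)          # avoid log2(0) on all-zero gradients
    scale = 2.0 ** torch.floor(torch.log2(max_val / (g_max * world_size)))
    q = quantize_custom_float(grad * scale, exp_bits, man_bits)
    # In a real distributed run, `q` would be all-reduced across workers here
    # (e.g. with torch.distributed.all_reduce) before dividing the scale back out.
    return q / scale
```

The rescaling step is what does the work: an unscaled gradient whose entries sit below the format's smallest normal magnitude would be flushed to zero during quantization, so shifting the tensor's dynamic range before communication, and sending the single full-precision scale alongside it, is what lets 8-bit floats preserve the gradient signal at trivial extra cost.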


Cited By

  • (2025) Efficient deep neural network training via decreasing precision with layer capacity. Frontiers of Computer Science 19:10. DOI: 10.1007/s11704-024-40669-3. Online publication date: 1-Oct-2025.
  • (2024) Systematic Analysis of Low-Precision Training in Deep Neural Networks: Factors Influencing Matrix Computations. Applied Sciences 14:21 (10025). DOI: 10.3390/app142110025. Online publication date: 2-Nov-2024.
  • (2023) Love of Variety Based Latency Analysis for High Definition Map Updating: Age of Information and Distributional Robust Perspectives. IEEE Transactions on Intelligent Vehicles 8:2 (1751-1764). DOI: 10.1109/TIV.2022.3224655. Online publication date: Feb-2023.



    Published In

    PPoPP '21: Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
    February 2021
    507 pages
    ISBN: 9781450382946
    DOI: 10.1145/3437801
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 February 2021


    Author Tags

    1. distributed training
    2. low precision

    Qualifiers

    • Poster

    Conference

    PPoPP '21

    Acceptance Rates

    PPoPP '21 paper acceptance rate: 31 of 150 submissions (21%)
    Overall acceptance rate: 230 of 1,014 submissions (23%)




