Fine-Grained Exploitation of Mixed Precision for Faster CNN Training

Johnston, Travis; Young, Steven; Schuman, Catherine; Chae, Junghoon; March, Don; Patton, Robert; Potok, Thomas

doi:10.1109/MLHPC49564.2019.00007

Conference · Nov 1, 2019

ORNL

As deep convolutional neural networks (CNNs) have become increasingly popular and successful at an ever-widening number of machine learning tasks, specialized hardware has become increasingly available for training and deploying them. NVIDIA's recent Volta architecture includes tensor cores, which perform a fused operation in reduced and mixed precision (16-bit multiply, 32-bit accumulate). Recent research indicates that, typically, very little is lost (in terms of training accuracy) when half precision is used in place of single precision, and performance gains can be made by doing arithmetic in reduced precision. In this work we demonstrate that making layer-by-layer choices as to the arithmetic/data precision can lead to further performance improvement. In our study of 25,200 CNNs we demonstrate an average speedup (over purely half precision) of 1.27x and speedups as high as 3.64x by appropriately combining single and half precision arithmetic and data types on a layer-by-layer basis.
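The "16-bit multiply, 32-bit accumulate" pattern mentioned in the abstract can be illustrated with a small software emulation. The sketch below is not the authors' code and does not use tensor cores; it only rounds values through IEEE half and single precision with Python's standard `struct` module. The helper names `to_fp16`, `to_fp32`, `dot_mixed`, and `dot_half` are hypothetical and introduced here for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_mixed(a, b):
    """Dot product with half-precision inputs and a single-precision accumulator.

    Emulation only: the product of two half-precision values is kept exact
    (it needs at most 22 significand bits), and the running sum is rounded
    to single precision after every addition.
    """
    total = 0.0
    for x, y in zip(a, b):
        product = to_fp16(x) * to_fp16(y)
        total = to_fp32(total + product)
    return total


def dot_half(a, b):
    """Dot product where products and the accumulator are both half precision."""
    total = 0.0
    for x, y in zip(a, b):
        product = to_fp16(to_fp16(x) * to_fp16(y))
        total = to_fp16(total + product)
    return total


if __name__ == "__main__":
    ones = [1.0] * 4096
    print("mixed (fp16 multiply, fp32 accumulate):", dot_mixed(ones, ones))
    print("pure half precision:                   ", dot_half(ones, ones))
```

With 4096 products of 1.0, the single-precision accumulator reaches 4096.0, while the half-precision accumulator stalls at 2048.0 because half precision cannot represent 2049 and the sum rounds back down. This shows only why the accumulator width matters; it says nothing about the speed trade-offs the paper measures.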

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-00OR22725

OSTI ID: 1608214

Resource Relation: Conference: IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC), Denver, Colorado, United States of America, 11/17/2019 to 11/18/2019

Country of Publication: United States

Language: English

Similar Records

Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems

Journal Article · Nov 25, 2020 · Proceedings of the Royal Society. A. Mathematical, Physical and Engineering Sciences · OSTI ID: 1608214

Haidar, Azzam; Bayraktar, Harun; Tomov, Stanimire; +2 more

Improving scalability of parallel CNN training by adaptively adjusting parameter update frequency

Journal Article · Sep 29, 2021 · Journal of Parallel and Distributed Computing · OSTI ID: 1608214

Lee, Sunwoo; Kang, Qiao; Al-Bahrani, Reda; +3 more
