DOI: 10.1145/3195970.3196012

Compensated-DNN: energy efficient low-precision deep neural networks by compensating quantization errors

Published: 24 June 2018

ABSTRACT

Deep Neural Networks (DNNs) represent the state of the art in many Artificial Intelligence (AI) tasks involving images, videos, text, and natural language. Their ubiquitous adoption is limited by their high computation and storage requirements, especially for energy-constrained inference tasks at the edge on wearable and IoT devices. One promising approach to alleviating these computational challenges is implementing DNNs using a low-precision (<16 bit) fixed-point representation. However, the quantization error inherent in any Fixed Point (FxP) implementation limits how aggressively bit-widths can be reduced while maintaining application-level accuracy. Prior efforts recommend increasing the network size and/or re-training the DNN to minimize the loss due to quantization, albeit with limited success.
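
To make the quantization error concrete, here is a minimal sketch (in Python/NumPy, with bit-widths chosen purely for illustration and not taken from the paper) that rounds real values onto a signed fixed-point grid and reports the residual error that low-precision FxP inference must tolerate:

    import numpy as np

    def quantize_fxp(x, total_bits=8, frac_bits=4):
        # Round onto a signed fixed-point grid with frac_bits fractional bits,
        # saturating at the representable range (illustrative configuration only).
        scale = 2.0 ** frac_bits
        qmin = -(2 ** (total_bits - 1))
        qmax = 2 ** (total_bits - 1) - 1
        q = np.clip(np.round(x * scale), qmin, qmax)
        return q / scale

    x = np.array([0.337, -1.062, 0.015])
    xq = quantize_fxp(x)
    print(xq)        # values snapped to the 1/16 grid
    print(x - xq)    # per-element quantization error

The residual x - xq is the error that grows as bit-widths shrink, and it is exactly this error that Compensated-DNN sets out to offset at run time.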

Complementary to the above approaches, we present Compensated-DNN, wherein we dynamically compensate, during execution, the error introduced by quantization. To this end, we introduce a new fixed-point representation, viz. Fixed Point with Error Compensation (FPEC). The bits in FPEC are split between computation bits and compensation bits. The computation bits use conventional FxP notation to represent the number at low precision. The compensation bits (at most 1 or 2 bits), in contrast, explicitly capture an estimate (direction and magnitude) of the quantization error in the representation. For a given word length, since FPEC uses fewer computation bits than an FxP representation, we achieve a near-quadratic improvement in the energy of multiply-and-accumulate (MAC) operations. The compensation bits are simultaneously used by a low-overhead sparse compensation scheme to estimate the error accrued during MAC operations, which is then added to the MAC output to minimize the impact of quantization. We build Compensated-DNNs for 7 popular image recognition benchmarks with 0.05-20.5 million neurons and 0.01-15.5 billion connections. Based on gate-level analysis at 14nm technology, we achieve 2.65×-4.88× and 1.13×-1.7× improvements in energy over 16-bit and 8-bit FxP implementations respectively, while maintaining <0.5% loss in classification accuracy.
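
To illustrate the mechanism described above, the toy sketch below (a hypothetical Python/NumPy encoding; the paper's actual FPEC format, magnitude estimation, and sparse compensation hardware are not reproduced here) pairs a coarse fixed-point value with a one-bit error direction and a crude magnitude estimate, then adds a first-order correction term to the low-precision MAC result:

    import numpy as np

    def fpec_encode(x, frac_bits=2):
        # Computation bits: conventional coarse fixed-point value.
        scale = 2.0 ** frac_bits
        q = np.round(x * scale) / scale
        # Compensation bit: direction of the quantization error, paired with a
        # fixed, crude estimate of its magnitude (an assumption for this sketch).
        direction = np.sign(x - q)
        est_mag = 0.25 / scale
        return q, direction, est_mag

    def mac_with_compensation(w, a):
        wq, wdir, wmag = fpec_encode(w)
        aq, adir, amag = fpec_encode(a)
        mac = np.dot(wq, aq)  # cheap low-precision dot product
        # First-order error estimate: w.a ~ wq.aq + wq.ea + aq.ew
        correction = np.dot(wq, adir * amag) + np.dot(aq, wdir * wmag)
        return mac + correction

    rng = np.random.default_rng(0)
    w = rng.standard_normal(16) * 0.5
    a = rng.random(16)
    exact = np.dot(w, a)
    uncompensated = np.dot(fpec_encode(w)[0], fpec_encode(a)[0])
    print(exact, uncompensated, mac_with_compensation(w, a))

In the paper the correction is applied sparsely by low-overhead hardware alongside the MAC units; the dense NumPy version here trades that efficiency for readability.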


  • Published in

    DAC '18: Proceedings of the 55th Annual Design Automation Conference
    June 2018, 1089 pages
    ISBN: 9781450357005
    DOI: 10.1145/3195970
    Copyright © 2018 ACM

    Publisher: Association for Computing Machinery, New York, NY, United States
