DOI: 10.1145/2934583.2934625

Speeding up Convolutional Neural Network Training with Dynamic Precision Scaling and Flexible Multiplier-Accumulator

Published: 8 August 2016

ABSTRACT

Training a convolutional neural network is a major bottleneck when developing a new network topology. This paper presents a dynamic precision scaling (DPS) algorithm and a flexible multiplier-accumulator (MAC) to speed up convolutional neural network training. The DPS algorithm uses dynamic fixed-point arithmetic and finds a numerical precision that is good enough for the target network while it trains. The precision information from DPS configures the proposed MAC, which performs fixed-point computation in variable-precision modes; because lower-precision modes take less computation time, training is accelerated whenever reduced precision suffices. Simulation results show that our design achieves a 5.7x speed-up while consuming 31% of the baseline energy for a modified AlexNet on the Flickr image style recognition task.
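To make the mechanism concrete, the following is a minimal Python sketch of dynamic fixed-point quantization with a DPS-style precision update, written against the abstract only. It is not the authors' implementation: the function names, the overflow heuristic, and the overflow_limit parameter are illustrative assumptions.

import numpy as np

def quantize_dfx(x, total_bits, frac_bits):
    """Quantize to dynamic fixed point: signed total_bits-wide integers
    sharing one fractional-bit count frac_bits (illustrative sketch)."""
    scale = 2.0 ** frac_bits
    qmin = -(2 ** (total_bits - 1))
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale  # dequantized values a simulation would compute with

def update_frac_bits(x, total_bits, frac_bits, overflow_limit=0.01):
    """Adjust the shared exponent from observed saturation (assumed
    heuristic): too many overflows -> widen the integer range; large
    unused headroom -> reclaim a bit of resolution."""
    largest = (2 ** (total_bits - 1) - 1) / 2.0 ** frac_bits
    overflow_rate = float(np.mean(np.abs(x) > largest))
    if overflow_rate > overflow_limit:
        return frac_bits - 1   # one more integer bit
    if float(np.max(np.abs(x))) < largest / 2:
        return frac_bits + 1   # one more fractional bit
    return frac_bits

# Usage: track a layer's weights across a few training steps.
rng = np.random.default_rng(0)
frac_bits = 12
for step in range(3):
    w = rng.normal(scale=0.5 * (step + 1), size=1024)  # stand-in weights
    frac_bits = update_frac_bits(w, total_bits=16, frac_bits=frac_bits)
    w_q = quantize_dfx(w, total_bits=16, frac_bits=frac_bits)
    print(step, frac_bits, float(np.max(np.abs(w - w_q))))

In the paper, the precision chosen this way configures the flexible MAC, whose computation time falls with precision; the sketch above only mirrors the software-side bookkeeping that a DPS scheme needs.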

Published in

ISLPED '16: Proceedings of the 2016 International Symposium on Low Power Electronics and Design
August 2016
392 pages
ISBN: 9781450341851
DOI: 10.1145/2934583

          Copyright © 2016 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Qualifiers

          • research-article
          • Research
          • Refereed limited

          Acceptance Rates

ISLPED '16 paper acceptance rate: 60 of 190 submissions (32%). Overall acceptance rate: 398 of 1,159 submissions (34%).
