ABSTRACT
Training a convolutional neural network is a major bottleneck when developing a new network topology. This paper presents a dynamic precision scaling (DPS) algorithm and a flexible multiplier-accumulator (MAC) to speed up convolutional neural network training. The DPS algorithm uses dynamic fixed-point arithmetic to find a good-enough numerical precision for the target network during training. The precision information from DPS is then used to configure the proposed MAC, which performs fixed-point computation in variable-precision modes whose computation time scales with the precision, so lower-precision computation runs faster and training is accelerated. Simulation results show that our approach achieves a 5.7x speed-up while consuming 31% of the baseline energy for a modified AlexNet on the Flickr image style recognition task.
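The abstract does not include source code; purely as a hedged illustration of the dynamic fixed-point mechanism it sketches, the NumPy snippet below quantizes each tensor to word-width integers sharing one scale factor, and periodically re-derives that format from the tensor's observed value range. All names here (`PrecisionScaler`, `choose_frac_bits`, the `interval` parameter) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def choose_frac_bits(x, word_bits=16):
    """Pick the largest fractional bit count such that the integer
    part still covers the tensor's current dynamic range."""
    max_abs = float(np.max(np.abs(x))) + 1e-12
    int_bits = max(0, int(np.ceil(np.log2(max_abs))) + 1)  # +1 bit of sign/overflow headroom
    return word_bits - 1 - int_bits

def to_dynamic_fixed_point(x, frac_bits, word_bits=16):
    """Quantize to dynamic fixed point: word_bits-wide integers that
    share one scale factor 2**-frac_bits across the whole tensor."""
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (word_bits - 1)), 2 ** (word_bits - 1) - 1
    return np.clip(np.round(x * scale), qmin, qmax) / scale

class PrecisionScaler:
    """Hypothetical training-loop hook: every `interval` iterations,
    re-measure each tensor group's range and update its format."""
    def __init__(self, word_bits=16, interval=1000):
        self.word_bits, self.interval = word_bits, interval
        self.step, self.frac_bits = 0, {}

    def tick(self):  # call once per training iteration
        self.step += 1

    def quantize(self, name, tensor):
        if name not in self.frac_bits or self.step % self.interval == 0:
            self.frac_bits[name] = choose_frac_bits(tensor, self.word_bits)
        return to_dynamic_fixed_point(tensor, self.frac_bits[name], self.word_bits)
```

In use, one would quantize weights, activations, and gradients through the scaler (e.g. `w_q = scaler.quantize('conv1.w', w)`) and call `scaler.tick()` each iteration; a hardware MAC configured with the resulting per-group bit widths could then trade precision for speed as the abstract describes.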