DOI: 10.1145/3287624.3287628

Simulate-the-hardware: training accurate binarized neural networks for low-precision neural accelerators

Published: 21 January 2019

Abstract

This work investigates how to effectively train binarized neural networks (BNNs) for specialized low-precision neural accelerators. When BNNs are mapped onto accelerators that adopt fixed-point feature-data representations and binary parameters, the short fixed-point coding causes operation overflow, so inference results computed by deep learning frameworks on CPUs/GPUs become inconsistent with those computed by the accelerators. This creates a large deviation between the training environment and the inference implementation, and causes potential accuracy losses when models are deployed on the accelerators. We therefore present a series of methods that contain the overflow phenomenon and enable typical deep learning frameworks such as TensorFlow to train BNNs that run on the specialized accelerators with high accuracy and fast convergence.

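To make the abstract's idea concrete, below is a minimal, hypothetical TensorFlow sketch (not the authors' code) of simulating the hardware during training: binary weights are applied with a straight-through estimator, and the layer's accumulator is wrapped into a signed fixed-point range in the forward pass, so training sees the same overflow the accelerator would produce. The bit width `n_bits` and the two's-complement wrap-around behavior are assumptions; a saturating accelerator would clip instead.

```python
import tensorflow as tf

def binarize_ste(x):
    # Forward: constrain values to +/-1 (zero mapped to +1, the common BNN convention).
    # Backward: straight-through estimator, i.e. identity gradient.
    b = tf.where(x >= 0, tf.ones_like(x), -tf.ones_like(x))
    return x + tf.stop_gradient(b - x)

def simulate_overflow(acc, n_bits=8):
    # Wrap the accumulator into the signed n-bit range [-2^(n-1), 2^(n-1) - 1],
    # mimicking two's-complement wrap-around (an assumption; the paper's
    # accelerator may handle overflow differently).
    # The straight-through trick lets gradients pass as if no wrap occurred.
    low, span = -(2.0 ** (n_bits - 1)), 2.0 ** n_bits
    wrapped = tf.math.floormod(acc - low, span) + low
    return acc + tf.stop_gradient(wrapped - acc)

def binary_dense(x, w_real, n_bits=8):
    # Dense layer with binary weights whose accumulation overflows during
    # training the way the fixed-point hardware would at inference.
    acc = tf.matmul(x, binarize_ste(w_real))
    return simulate_overflow(acc, n_bits)
```

Swapping tf.clip_by_value in for the floormod line would model a saturating accumulator instead; either way, the point is that the training graph and the accelerator share the same low-precision numerics.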

      Published In

      ASPDAC '19: Proceedings of the 24th Asia and South Pacific Design Automation Conference
      January 2019
      794 pages
ISBN: 9781450360074
DOI: 10.1145/3287624

      In-Cooperation

      • IEICE ESS: Institute of Electronics, Information and Communication Engineers, Engineering Sciences Society
      • IEEE CAS
      • IEEE CEDA
      • IPSJ SIG-SLDM: Information Processing Society of Japan, SIG System LSI Design Methodology

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. binarized neural networks
      2. containing
      3. overflow
      4. simulating

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China
      • Strategic Priority Research Program of the Chinese Academy of Sciences
      • Beijing Municipal Science & Technology Commission

      Conference

      ASPDAC '19

      Acceptance Rates

Overall acceptance rate: 466 of 1,454 submissions (32%)

Cited By

• (2022) Document image analysis and recognition: a survey. Computer Optics 46(4), 567-589. DOI: 10.18287/2412-6179-CO-1020
• (2021) ELC-ECG: Efficient LSTM Cell for ECG Classification based on Quantized Architecture. 2021 IEEE International Symposium on Circuits and Systems (ISCAS), 1-5. DOI: 10.1109/ISCAS51556.2021.9401261
• (2021) ResNet-like Architecture with Low Hardware Requirements. 2020 25th International Conference on Pattern Recognition (ICPR), 6204-6211. DOI: 10.1109/ICPR48806.2021.9413186
• (2021) Bipolar Morphological Neural Networks: Gate-Efficient Architecture for Computer Vision. IEEE Access 9, 97569-97581. DOI: 10.1109/ACCESS.2021.3094484
