A high-throughput scalable BNN accelerator with fully pipelined architecture

  • Regular Paper
  • Published in: CCF Transactions on High Performance Computing

Abstract

By replacing multiplications with XNOR operations, Binarized Neural Networks (BNNs) are hardware-friendly and extremely well suited to FPGA acceleration. Previous research has highlighted BNNs' performance potential, but most existing work targets minimizing chip area, achieving excellent energy and resource efficiency on small FPGAs while leaving results on larger FPGAs unsatisfying. We therefore propose a scalable, fully pipelined BNN architecture that targets maximizing throughput while retaining energy and resource efficiency on large FPGAs. By exploiting multiple levels of parallelism and balancing pipeline stages, it achieves excellent performance. Moreover, we share on-chip memory and balance computation resources to further improve resource utilization, and we propose a methodology that explores the design space for the optimal configuration. The design is evaluated on a Xilinx UltraScale XCKU115. The results show that the proposed architecture achieves a 2.24×–11.24× performance improvement and 2.43×–11.79× better resource efficiency compared with other BNN accelerators.
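
To make the XNOR substitution concrete, below is a minimal C sketch (our illustration, not the paper's implementation) of a binarized dot product. Weights and activations are assumed to be packed one bit per value, with bit 1 encoding +1 and bit 0 encoding −1; the dot product then reduces to an XNOR followed by a population count.

    #include <stdint.h>
    #include <stdio.h>

    /* Binarized dot product of n_bits packed +1/-1 values.
     * Encoding assumption: bit 1 -> +1, bit 0 -> -1.
     * XNOR marks positions where the signs agree; the dot product is
     * (#agreements) - (#disagreements) = 2*popcount - n_bits. */
    static inline int bnn_dot(uint64_t act, uint64_t wgt, int n_bits)
    {
        uint64_t agree = ~(act ^ wgt);             /* XNOR replaces multiply */
        if (n_bits < 64)
            agree &= (1ULL << n_bits) - 1;         /* mask off unused bits */
        return 2 * __builtin_popcountll(agree) - n_bits;  /* GCC/Clang builtin */
    }

    int main(void)
    {
        /* act = (+1,-1,+1,+1,-1,-1,+1,-1), wgt = (+1,+1,+1,-1,-1,+1,+1,-1) */
        printf("%d\n", bnn_dot(0xB2, 0xE6, 8));    /* 5 agree, 3 disagree -> 2 */
        return 0;
    }

On an FPGA this maps to a wide XNOR gate array feeding a popcount adder tree rather than DSP multipliers, which is why BNNs are described as hardware-friendly.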


Funding

This work was supported by the National Science and Technology Major Project (2018ZX01028101).

Author information


Corresponding author

Correspondence to Jingfei Jiang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

This paper is submitted for possible publication in the Special Issue on Reliability and Power Efficiency for HPC.


About this article


Cite this article

Han, Z., Jiang, J., Xu, J. et al. A high-throughput scalable BNN accelerator with fully pipelined architecture. CCF Trans. HPC 3, 17–30 (2021). https://doi.org/10.1007/s42514-020-00059-0

