DOI: 10.1145/3508352.3549418

Squeezing Accumulators in Binary Neural Networks for Extremely Resource-Constrained Applications

Published: 22 December 2022

Abstract

The cost and power consumption of BNN (Binarized Neural Network) hardware are dominated by additions. In particular, accumulators account for a large fraction of the hardware overhead, which can be reduced effectively by using narrower accumulators. However, finding the optimal accumulator width is not straightforward due to the complex interplay between width, scale, and the effect of training. In this paper we present algorithmic and hardware-level methods to find the optimal accumulator size for BNN hardware with minimal impact on the quality of results. First, we present partial sum scaling, a top-down approach that minimizes the BNN accumulator size based on advanced quantization techniques, along with an efficient, zero-overhead hardware design for it. Second, we evaluate a bottom-up approach, namely using a saturating accumulator, which is more robust against overflow. Our experimental results on the CIFAR-10 dataset demonstrate that partial sum scaling, combined with our optimized accumulator architecture, can reduce the area and power consumption of the datapath by 15.50% and 27.03%, respectively, with little impact on inference performance (less than 2%), compared to using a 16-bit accumulator.
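
To make the two approaches in the abstract concrete, the sketch below contrasts a conventional wrapping accumulator with a saturating one, and adds a hypothetical power-of-two partial-sum-scaling variant. It only illustrates the general ideas under assumed parameters (8-bit accumulator, chunk size 64, shift of 3); the paper's actual partial sum scaling is derived from quantization techniques and implemented in hardware, and none of the names or constants below are taken from the paper.

    import numpy as np

    def wrapping_accumulate(products, width):
        # Ordinary modulo-2**width accumulation: an overflow silently
        # wraps around and can flip the sign of the running sum.
        mask = (1 << width) - 1
        acc = 0
        for p in products:
            acc = (acc + int(p)) & mask
        if acc >= 1 << (width - 1):      # reinterpret as two's complement
            acc -= 1 << width
        return acc

    def saturating_accumulate(products, width):
        # The bottom-up idea: clamp at the limits of a width-bit
        # two's-complement register instead of wrapping on overflow.
        lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
        acc = 0
        for p in products:
            acc = max(lo, min(hi, acc + int(p)))
        return acc

    def scaled_accumulate(products, width, chunk=64, shift=3):
        # A hypothetical stand-in for partial sum scaling: sum each small
        # chunk exactly (as an adder tree would), right-shift the partial
        # sum so the running total fits a narrow register, then undo the
        # scale at the end. The fixed shift here is only illustrative.
        lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
        acc = 0
        for i in range(0, len(products), chunk):
            psum = int(products[i:i + chunk].sum())  # adder-tree output
            acc = max(lo, min(hi, acc + (psum >> shift)))
        return acc << shift

    # In a binarized dot product every partial product is +/-1, so the
    # exact sum of n terms lies in [-n, n]. A biased input forces the
    # exact sum well past the 8-bit range [-128, 127].
    rng = np.random.default_rng(0)
    products = rng.choice([-1, 1], size=1024, p=[0.3, 0.7])

    print("exact:           ", int(products.sum()))
    print("8-bit wrapping:  ", wrapping_accumulate(products, 8))
    print("8-bit saturating:", saturating_accumulate(products, 8))
    print("8-bit scaled:    ", scaled_accumulate(products, 8))

On such an input the wrapping accumulator typically returns a sign-corrupted value, the saturating accumulator pins at 127 (wrong in magnitude but with bounded error), and the scaled variant stays close to the exact sum at the cost of the bits discarded by the shift, which is the kind of width/scale/accuracy trade-off the paper optimizes.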


Cited By

  • A2Q+. Proceedings of the 41st International Conference on Machine Learning (2024), 9275-9291. DOI: 10.5555/3692070.3692439. Online publication date: 21 July 2024.
  • Extending Neural Processing Unit and Compiler for Advanced Binarized Neural Networks. Proceedings of the 29th Asia and South Pacific Design Automation Conference (2024), 115-120. DOI: 10.1109/ASP-DAC58780.2024.10473822. Online publication date: 22 January 2024.



          Published In

          ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design
          October 2022
          1467 pages
          ISBN: 978-1-4503-9217-4
          DOI: 10.1145/3508352

          In-Cooperation

          • IEEE-EDS: Electron Devices Society
          • IEEE CAS
          • IEEE CEDA

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 22 December 2022


          Author Tags

          1. accumulator
          2. adder tree
          3. binarized neural network
          4. neural network accelerator
          5. quantization
          6. saturating arithmetic

          Qualifiers

          • Research-article

          Conference

          ICCAD '22: IEEE/ACM International Conference on Computer-Aided Design
          October 30 - November 3, 2022
          San Diego, California

          Acceptance Rates

          Overall acceptance rate: 457 of 1,762 submissions (26%)


          Article Metrics

          • Downloads (last 12 months): 41
          • Downloads (last 6 weeks): 0
          Reflects downloads up to 28 Feb 2025
