research-article

Scalable Time-Domain Compute-in-Memory BNN Engine with 2.06 POPS/W Energy Efficiency for Edge-AI Devices

Authors:
Jie Lou, RWTH Aachen University, Aachen, Germany (ORCID: 0000-0003-0380-8585)
Florian Freye, RWTH Aachen University, Aachen, Germany (ORCID: 0000-0003-3025-8910)
Christian Lanius, RWTH Aachen University, Aachen, Germany (ORCID: 0000-0001-7107-3782)
Tobias Gemmeke, RWTH Aachen University, Aachen, Germany (ORCID: 0000-0003-1583-3411)

GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023, June 2023, Pages 665–670
https://doi.org/10.1145/3583781.3590220

Published: 05 June 2023

ABSTRACT

Time-domain (TD) computing has attracted attention for its high computing efficiency and its suitability for energy-constrained edge devices. In this paper, we present a time-domain compute-in-memory (TDCIM) macro for binary neural networks (BNNs), realized with standard as well as custom delay cells. Multiply-and-accumulate (MAC) operations, batch normalization (BN) and binarization (Bin) are all processed in the time domain, avoiding costly digital-domain post-processing. In addition, the macro supports flexible mapping for different kernel sizes, achieving 100% utilization. Starting from a standard-cell-based implementation, we propose two custom cells that offer trade-offs between energy efficiency, area and accuracy. The two proposed custom designs achieve energy efficiencies of 1.5 and 2.06 POPS/W at 0.5 V and 0.6 V with less cell area while maintaining model test accuracy.
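To make the MAC, BN and Bin chain named in the abstract concrete, the following is a minimal functional sketch in Python. It is not the authors' circuit and does not model delay cells or any time-domain behavior. It only illustrates a well-known property of BNN inference: a ±1 MAC can be computed as an XNOR followed by a popcount, and batch normalization followed by sign binarization collapses into a single threshold comparison on the accumulated value. The function names, the parameter names (`gamma`, `beta`, `mean`, `var`, `eps`) and the 0/1 bit encoding of ±1 values are assumptions chosen for illustration.

```python
import math


def bnn_neuron_reference(x, w, gamma, beta, mean, var, eps=1e-5):
    """Reference path: MAC over {-1,+1} values, then BN, then sign."""
    acc = sum(xi * wi for xi, wi in zip(x, w))  # multiply-and-accumulate
    y = gamma * (acc - mean) / math.sqrt(var + eps) + beta  # batch norm
    return 1 if y >= 0 else -1  # binarization


def bnn_neuron_folded(x_bits, w_bits, gamma, beta, mean, var, eps=1e-5):
    """XNOR-popcount MAC with BN and sign folded into one threshold.

    Bits encode values as 1 -> +1 and 0 -> -1 (illustrative assumption).
    """
    n = len(x_bits)
    # XNOR is 1 where the bits agree; popcount counts those positions.
    popcount = sum(1 for xb, wb in zip(x_bits, w_bits) if xb == wb)
    acc = 2 * popcount - n  # same value as the +/-1 dot product

    if gamma == 0:
        # BN output is the constant beta, so the sign is fixed.
        return 1 if beta >= 0 else -1

    sigma = math.sqrt(var + eps)
    threshold = mean - beta * sigma / gamma
    if gamma > 0:
        return 1 if acc >= threshold else -1
    # A negative gamma flips the direction of the comparison.
    return 1 if acc <= threshold else -1


if __name__ == "__main__":
    x_bits = [1, 0, 1, 1]
    w_bits = [1, 1, 1, 0]
    x = [2 * b - 1 for b in x_bits]
    w = [2 * b - 1 for b in w_bits]
    print(bnn_neuron_reference(x, w, 1.0, 0.5, 0.0, 1.0))
    print(bnn_neuron_folded(x_bits, w_bits, 1.0, 0.5, 0.0, 1.0))
```

In the folded form, only an accumulate and a compare remain per output. That is one way to read the abstract's claim that no separate digital post-processing step is needed, although how the macro realizes the comparison in the time domain is not described in the abstract.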

Index Terms

1. Computing methodologies
  1. Machine learning
2. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Hardware accelerators

Published in
GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023
June 2023
731 pages
ISBN: 9798400701252
DOI: 10.1145/3583781
General Chairs: Himanshu Thapliyal (University of Tennessee, Knoxville, USA), Ronald DeMara (University of Central Florida, USA)
Program Chairs: Inna Partin-Vaisband (University of Illinois Chicago, USA), Srinivas Katkoori (University of South Florida, USA)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
binary neural networks
compute-in-memory
double-edge operation
time-domain computing
Acceptance Rates
Overall Acceptance Rate: 312 of 1,156 submissions, 27%