ABSTRACT
The recently proposed Deformable Convolutional Networks (DCNs) greatly enhance the performance of conventional Convolutional Neural Networks (CNNs) on vision recognition tasks by allowing flexible input sampling at inference time. DCNs introduce an additional convolutional layer that generates adaptive sampling offsets, followed by a bilinear interpolation (BLI) step that samples the input at the resulting non-integer locations; a regular convolution is then performed on the sampled input pixels. Compared with conventional CNNs, DCNs exhibit significantly higher computational complexity and irregular, input-dependent memory access patterns, making it a great challenge to deploy DCNs on edge devices for real-time computer vision tasks. In this work, we propose RECOIN, a processing-in-memory (PIM) architecture that supports DCN inference on resistive memory (ReRAM) crossbars, making it the first DCN inference accelerator. We present a novel BLI processing engine that leverages both row- and column-oriented computation for in-situ BLI calculation. A mapping scheme and an address converter are specifically designed to accommodate the intensive computation and irregular data access. We implement DCN inference in a 4-stage pipeline and evaluate the effectiveness of RECOIN on six DCN models. Experimental results show that RECOIN achieves 225× and 17.4× improvement in energy efficiency over a general-purpose CPU and GPU, respectively. Compared to two state-of-the-art ASIC accelerators, RECOIN achieves 26.8× and 20.4× speedup, respectively.
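The BLI step described above can be illustrated with a minimal sketch. This is not RECOIN's in-crossbar implementation; it is a plain software rendering of the standard bilinear interpolation used in DCNs, where a feature map is sampled at a fractional location produced by the learned offsets (the function name `bilinear_sample` and the zero-padding convention for out-of-bounds samples are assumptions for illustration):

```python
import numpy as np

def bilinear_sample(feature, y, x):
    """Sample a 2D feature map at fractional coordinates (y, x) via BLI.

    The result is a weighted sum of the four integer-grid neighbors,
    each weighted by its proximity to the fractional point. DCNs apply
    this per sampling location after adding the predicted offsets.
    """
    H, W = feature.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1
    wy, wx = y - y0, x - x0  # fractional parts -> interpolation weights

    def pix(r, c):
        # Out-of-bounds samples are treated as zero (a common DCN convention).
        return feature[r, c] if 0 <= r < H and 0 <= c < W else 0.0

    return ((1 - wy) * (1 - wx) * pix(y0, x0)
            + (1 - wy) * wx * pix(y0, x1)
            + wy * (1 - wx) * pix(y1, x0)
            + wy * wx * pix(y1, x1))

# Sampling at the center of a 2x2 patch averages its four values.
patch = np.array([[0.0, 1.0], [2.0, 3.0]])
center = bilinear_sample(patch, 0.5, 0.5)  # -> 1.5
```

Because the four weights are products of the two fractional parts, the computation decomposes into a row interpolation followed by a column interpolation, which is the separability that a row- and column-oriented in-situ engine can exploit.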
RECOIN: A Low-Power Processing-in-ReRAM Architecture for Deformable Convolution