DOI: 10.1145/3453688.3461480
research-article

RECOIN: A Low-Power Processing-in-ReRAM Architecture for Deformable Convolution

Published: 22 June 2021

ABSTRACT

The recently proposed Deformable Convolutional Networks (DCNs) greatly enhance the performance of conventional Convolutional Neural Networks (CNNs) on vision recognition tasks by allowing flexible input sampling at inference time. DCNs introduce an additional convolutional layer for adaptive sampling offset generation, followed by a bilinear interpolation (BLI) algorithm to integerize the generated non-integer offset values. Finally, a regular convolution is performed on the loaded input pixels. Compared with conventional CNNs, DCNs exhibit significantly increased computational complexity and irregular, input-dependent memory access patterns, which makes deploying DCNs onto edge devices for real-time computer vision tasks a great challenge. In this work, we propose RECOIN, a processing-in-memory (PIM) architecture that supports DCN inference on resistive memory (ReRAM) crossbars, thus making the first DCN inference accelerator possible. We present a novel BLI processing engine that leverages both row- and column-oriented computation for in-situ BLI calculation. A mapping scheme and an address converter are specifically designed to accommodate the intensive computation and irregular data access. We implement the DCN inference in a 4-stage pipeline and evaluate the effectiveness of RECOIN on six DCN models. Experimental results show that RECOIN achieves 225× and 17.4× improvements in energy efficiency compared to a general-purpose CPU and GPU, respectively. Compared to two state-of-the-art ASIC accelerators, RECOIN achieves 26.8× and 20.4× speedups, respectively.
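The three DCN steps named in the abstract (offset generation, bilinear interpolation, regular convolution) can be illustrated with a minimal software sketch. This is a plain reference computation for one output pixel of a single-channel 3×3 deformable convolution, not RECOIN's in-ReRAM BLI engine; the function names `bilinear_sample` and `deformable_conv_at`, the zero-padding rule, and the choice to pass offsets in as inputs are all assumptions made for illustration.

```python
import math


def bilinear_sample(fmap, y, x):
    """Sample a 2-D feature map at fractional (y, x).

    Assumption: pixels outside the map read as 0 (zero padding).
    """
    h, w = len(fmap), len(fmap[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0  # fractional parts used as interpolation weights

    def pixel(r, c):
        return fmap[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    # Weighted sum of the four integer-grid neighbours.
    return ((1 - dy) * (1 - dx) * pixel(y0, x0)
            + (1 - dy) * dx * pixel(y0, x0 + 1)
            + dy * (1 - dx) * pixel(y0 + 1, x0)
            + dy * dx * pixel(y0 + 1, x0 + 1))


def deformable_conv_at(fmap, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    offsets[i][j] is the (dy, dx) pair for kernel tap (i, j). In a DCN these
    come from the extra offset-generating convolution; here they are inputs.
    """
    acc = 0.0
    for i in range(3):
        for j in range(3):
            off_y, off_x = offsets[i][j]
            sample_y = cy + (i - 1) + off_y  # regular grid position plus offset
            sample_x = cx + (j - 1) + off_x
            acc += kernel[i][j] * bilinear_sample(fmap, sample_y, sample_x)
    return acc


if __name__ == "__main__":
    fmap = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    centre_only = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    shifted = [[(0.0, 0.0)] * 3 for _ in range(3)]
    shifted[1][1] = (0.5, 0.5)  # move the centre tap half a pixel down-right
    print(deformable_conv_at(fmap, centre_only, shifted, 1, 1))
```

Because every sampling position depends on offsets computed from the input itself, the four neighbour reads per tap cannot be scheduled ahead of time, which is the irregular, input-dependent access pattern the abstract refers to.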

Supplemental Material

RECOIN_A Low-Power Processing-in-ReRAM Architecture for Deformable Convolution.mp4 (mp4, 193.8 MB)

      • Published in

        GLSVLSI '21: Proceedings of the 2021 on Great Lakes Symposium on VLSI
        June 2021
        504 pages
        ISBN:9781450383936
        DOI:10.1145/3453688

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        Overall Acceptance Rate: 312 of 1,156 submissions, 27%
