DOI: 10.1145/3489517.3530446
Open access

Processing-in-SRAM acceleration for ultra-low power visual 3D perception

Published: 23 August 2022

Abstract

Real-time ego-motion tracking and 3D structural estimation are fundamental tasks for ubiquitous cyber-physical systems, and they can be carried out with the state-of-the-art Edge-Based Visual Odometry (EBVO) algorithm. However, the intrinsically data-intensive processing of EBVO imposes a memory-wall hurdle on practical deployment in conventional von Neumann-style computing systems. In this work, we leverage an SRAM-based processing-in-memory (PIM) technique to alleviate this memory-wall bottleneck and optimize EBVO systematically at both the algorithm layer and the physical layer. In the algorithm layer, we first investigate the data reuse patterns of the essential computing kernels required for the feature detection and pose estimation steps in EBVO, and propose a PIM-friendly data layout and computing scheme for each kernel accordingly. We then distill the basic logical and arithmetic operations required in the algorithm layer and, in the physical layer, propose a novel bit-parallel and reconfigurable SRAM-PIM architecture to realize these operations with high computing precision and throughput. Our experimental results show that the proposed multi-layer optimization achieves high tracking accuracy for EBVO while improving processing speed by 11x and reducing energy consumption by 20x compared to the CPU implementation.
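
The step of distilling arithmetic into basic logical operations can be illustrated with a small sketch. It is not the architecture proposed in the paper; it only assumes, hypothetically, that a memory array can apply bitwise AND and XOR across all bit positions of two stored words at once and shift a word left by one bit. The function name `pim_add` and the `width` parameter are made up for this illustration.

```python
def pim_add(a: int, b: int, width: int = 8) -> int:
    """Add two unsigned `width`-bit words using only AND, XOR and shift.

    Illustrative assumption: each loop iteration models one array-wide
    (bit-parallel) logic step; it is not the paper's actual circuit.
    """
    mask = (1 << width) - 1          # keep results inside the word width
    a &= mask
    b &= mask
    while b:
        carry = ((a & b) << 1) & mask  # AND finds carries, shift aligns them
        a = a ^ b                      # XOR gives the carry-less partial sum
        b = carry                      # feed the carries back in
    return a


if __name__ == "__main__":
    print(pim_add(100, 27))       # fits in 8 bits
    print(pim_add(200, 100))      # wraps around modulo 2**8
```

The loop finishes in at most `width` iterations, because every pass shifts the remaining carries one position further left until they fall off the word. Results wrap modulo `2**width`, as they would in a fixed-width memory word.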

Cited By

  • (2024) ISOAcc: In-situ Shift Operation-based Accelerator For Efficient in-SRAM Multiplication. ACM Transactions on Design Automation of Electronic Systems 30:2 (1-24). DOI: 10.1145/3707205. Online publication date: 5-Dec-2024
  • (2024) CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (185-200). DOI: 10.1145/3620665.3640359. Online publication date: 27-Apr-2024
  • (2024) SLAM-CIM: A Visual SLAM Backend Processor With Dynamic-Range-Driven-Skipping Linear-Solving FP-CIM Macros. IEEE Journal of Solid-State Circuits 59:11 (3853-3865). DOI: 10.1109/JSSC.2024.3402808. Online publication date: Nov-2024

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
July 2022
1462 pages
ISBN:9781450391429
DOI:10.1145/3489517
This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

DAC '22
DAC '22: 59th ACM/IEEE Design Automation Conference
July 10 - 14, 2022
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Article Metrics

  • Downloads (Last 12 months): 232
  • Downloads (Last 6 weeks): 26
Reflects downloads up to 15 Feb 2025
