DOI: 10.1145/3508352.3549385
research-article
Public Access

Tunable Precision Control for Approximate Image Filtering in an In-Memory Architecture with Embedded Neurons

Published: 22 December 2022

Abstract

This paper presents a novel hardware-software co-design built on a Processing in-Memory (PiM) architecture with embedded, highly reconfigurable neural processing elements (NPEs). The PiM platform and the proposed approximation strategies are applied to several image filtering applications and give the user fine-grained, dynamic control over energy efficiency, precision, and throughput (EPT). The co-design can vary the Peak Signal to Noise Ratio (PSNR), the output quality metric for image filtering applications, from 25 dB to 50 dB, the acceptable range for these applications, without incurring any extra cost in energy or latency. Switching from the accurate to the approximate mode of computation in the proposed co-design improves energy efficiency and throughput by up to 2X. Compared with a MAC-based PE array with the proposed memory platform, the gains in energy efficiency are 3X-6X and the corresponding improvements in throughput are 2.26X-4.52X.
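To make the quality metric concrete, the following is a minimal Python sketch of how PSNR could be measured between an exact and an approximate filter output. It is not the paper's NPE-based approximation: the 3x3 mean filter, the `dropped_bits` knob (which zeroes low-order bits of each input pixel), and the synthetic test image are all assumptions chosen only to illustrate a precision setting that trades output quality for cheaper arithmetic.

```python
import math


def psnr(reference, test, max_value=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    if len(ref_flat) != len(test_flat):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def box_filter_3x3(image, dropped_bits=0):
    """3x3 mean filter over interior pixels; border pixels are copied as-is.

    dropped_bits is a hypothetical precision knob: that many low-order bits
    of every input pixel are zeroed before accumulation.
    """
    mask = ~((1 << dropped_bits) - 1)
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = sum(
                image[y + dy][x + dx] & mask
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            out[y][x] = total // 9
    return out


# Synthetic 16x16 8-bit test pattern (an assumption, not data from the paper).
image = [[(17 * x + 31 * y) % 256 for x in range(16)] for y in range(16)]
exact = box_filter_3x3(image, dropped_bits=0)

for bits in (1, 2, 4):
    approx = box_filter_3x3(image, dropped_bits=bits)
    print(f"dropped_bits={bits}: PSNR = {psnr(exact, approx):.2f} dB")
```

Sweeping `dropped_bits` plays the role of a precision setting in this sketch: dropping more bits lowers the PSNR measured against the exact output.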


Cited By

  • ReApprox-PIM: Reconfigurable Approximate Lookup-Table (LUT)-Based Processing-in-Memory (PIM) Machine Learning Accelerator. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:8 (2024), pp. 2288-2300. DOI: 10.1109/TCAD.2024.3367822. Online publication date: 1-Aug-2024.


            Published In

            ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design
            October 2022
            1467 pages
            ISBN:9781450392174
            DOI:10.1145/3508352
            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Sponsors

            In-Cooperation

            • IEEE-EDS: Electronic Devices Society
            • IEEE CAS
            • IEEE CEDA

            Publisher

            Association for Computing Machinery

            New York, NY, United States


            Author Tags

            1. energy efficiency, precision, and throughput (EPT)
            2. neural processing elements (NPE)
            3. peak signal to noise ratio (PSNR)
            4. processing in-memory (PiM)

            Qualifiers

            • Research-article

            Conference

            ICCAD '22: IEEE/ACM International Conference on Computer-Aided Design
            October 30 - November 3, 2022
            San Diego, California

            Acceptance Rates

            Overall Acceptance Rate 457 of 1,762 submissions, 26%


