research-article

IMAC: A Pre-Multiplier And Integrated Reduction Based Multiply-And-Accumulate Unit

Authors:
Bindu G. Gowda, IIIT-Bangalore, Bangalore, India (ORCID 0000-0003-2797-2363)
Prashanth H C, IIIT-Bangalore, Bangalore, India (ORCID 0000-0002-9650-3731)
Madhav Rao, IIIT-Bangalore, Bangalore, India (ORCID 0000-0003-2278-9148)

GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023, June 2023, Pages 503–508
https://doi.org/10.1145/3583781.3590265

Published: 05 June 2023

ABSTRACT

Multiply-and-accumulate (MAC) units are used primarily for convolution operations in signal and image processing workloads. In a conventional design, compressors are applied at the partial-product reduction stages to produce the multiplier output bits, which are then accumulated using a separate adder unit. This paper proposes an integrated approach in which the third operand of the MAC unit, the value to be accumulated, is fed directly into the partial-product matrix (PPM) before the product bits are evaluated. This Integrated Multiplier-and-Accumulate (IMAC) approach eliminates the additional adder unit and instead extends the compressors that are already used to reduce the partial-product bits of the multiplier. Exact and approximate compressor-based IMAC architectures were designed and evaluated through ASIC and FPGA flows. Five versions of the inexact IMAC design were independently compared with traditional one-level and two-level approximation in MAC designs. The proposed design is found to be hardware efficient compared with state-of-the-art MAC units. The error metrics of the IMAC design were comparable to or better than those of separately designed approximate multipliers followed by exact or approximate adder units. An image blending application was used to measure the quality metrics. The proposed IMAC design files are made freely available for further use by the research and development community.
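The integrated idea described in the abstract can be illustrated with a small behavioral model. The sketch below is not the authors' design: it assumes unsigned n-bit operands, an addend of at most 2n bits, and plain exact 3:2 compressors (full adders) in a simple column-wise reduction. The function names `partial_product_columns`, `reduce_columns`, `heap_value`, and `imac` are hypothetical and chosen only for illustration. It shows that placing the addend bits into the partial-product columns before reduction yields a*b + c without a separate accumulate adder.

```python
def partial_product_columns(a, b, n):
    """Build the n x n unsigned partial-product matrix as a list of columns.

    Column k holds every bit of weight 2**k (assumed unsigned operands).
    """
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols


def reduce_columns(cols):
    """Reduce every column to at most two bits using exact 3:2 compressors."""
    cols = [list(col) for col in cols]
    while any(len(col) > 2 for col in cols):
        nxt = [[] for _ in cols]
        for k, col in enumerate(cols):
            i = 0
            while len(col) - i >= 3:
                x, y, z = col[i:i + 3]
                i += 3
                nxt[k].append(x ^ y ^ z)              # sum bit stays in column k
                carry = (x & y) | (y & z) | (x & z)   # carry moves to column k + 1
                if k + 1 == len(nxt):
                    nxt.append([])                    # grow the heap if needed
                nxt[k + 1].append(carry)
            nxt[k].extend(col[i:])                    # pass leftover bits through
        cols = nxt
    return cols


def heap_value(cols):
    """Stand-in for the final carry-propagate adder: sum all weighted bits."""
    return sum(bit << k for k, col in enumerate(cols) for bit in col)


def imac(a, b, c, n=8):
    """Behavioral a*b + c with the addend c injected into the bit heap.

    Assumes 0 <= a, b < 2**n and 0 <= c < 2**(2*n).
    """
    cols = partial_product_columns(a, b, n)
    for k in range(2 * n):
        cols[k].append((c >> k) & 1)   # addend bits join the matching columns
    cols = reduce_columns(cols)
    return heap_value(cols)


print(imac(13, 11, 7, n=4))  # 13 * 11 + 7
```

Because each full adder preserves the weighted sum of its three inputs, the reduced heap always represents the same value as the initial one. An approximate variant would swap the exact compressor inside `reduce_columns` for an inexact one, which is where the error metrics mentioned in the abstract would come into play.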

Index Terms

1. Hardware
  1. Integrated circuits
    1. Logic circuits
      1. Arithmetic and datapath circuits
      2. Combinational circuits
    2. Reconfigurable logic and FPGAs
  2. Very large scale integration design
    1. Application-specific VLSI designs
      1. Application specific integrated circuits

Published in
GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023
June 2023
731 pages
ISBN: 9798400701252
DOI: 10.1145/3583781
General Chairs: Himanshu Thapliyal (University of Tennessee, Knoxville, USA), Ronald DeMara (University of Central Florida, USA)
Program Chairs: Inna Partin-Vaisband (University of Illinois Chicago, USA), Srinivas Katkoori (University of South Florida, USA)
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
adders
approximate computing
approximate mac
image blending
image processing
multipliers
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 312 of 1,156 submissions, 27%