DOI: 10.1145/3665314.3670808
Research article | Open access

iSPADE: End-to-end Sparse Architecture for Dense DNN Acceleration via Inverted-bit Representation

Published: 09 September 2024

Abstract

While recent cutting-edge deep neural network (DNN) models, such as large language models (LLMs), demonstrate remarkable capabilities, their inherently dense data characteristics limit the performance and energy gains achievable through sparse acceleration. In this paper, we introduce the iSPADE architecture, which sparsifies the end-to-end execution of dense DNNs so that they directly benefit from sparse acceleration without accuracy-sensitive techniques such as pruning. First, we propose an inverted-bit representation that eliminates the repeated sign bits of 2's complement representation. Because the inverted-bit representation produces a large number of zero bits, we further propose data packing and computation skipping techniques that reduce both redundant data movement and redundant computation. Finally, we present the iSPADE bit-slice hardware architecture, which efficiently supports and accelerates the proposed sparse dataflow. We evaluate performance on general DNN workloads using eight popular DNNs: iSPADE achieves 4.1X higher energy efficiency and a 4.5X speedup over previous state-of-the-art bit-slice accelerators, along with a 1.7X reduction in memory footprint.
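
As a rough intuition for the inverted-bit idea (the exact encoding and dataflow are defined in the paper itself), the short Python sketch below shows how small-magnitude values in 2's complement carry long runs of repeated sign bits, and how flipping the non-sign bits of negative values turns those runs into zeros. The helper name inverted_bit and the exact transform are illustrative assumptions, not the paper's scheme; the point is only that the high-order bit-slices of the transformed values become mostly zero, which is the kind of slice-level sparsity a bit-slice accelerator can skip.

```python
# Minimal sketch (assumed encoding, not the paper's exact inverted-bit format):
# show how sign extension in 2's complement wastes high-order bits, and how
# flipping the non-sign bits of negative values exposes zero bit-slices.

def to_bits(v, width=8):
    """2's-complement bit string of v, MSB first."""
    return format(v & ((1 << width) - 1), f"0{width}b")

def inverted_bit(v, width=8):
    """Hypothetical transform: keep the sign bit, store the one's complement
    of the remaining bits for negative values, so small magnitudes yield
    mostly-zero high-order bits regardless of sign."""
    raw = v & ((1 << width) - 1)
    if v < 0:
        raw ^= (1 << (width - 1)) - 1  # flip every bit except the sign bit
    return format(raw, f"0{width}b")

# Small-magnitude values are typical of DNN activations and weights.
for v in [3, -3, 5, -1]:
    print(f"{v:3d}  2's comp: {to_bits(v)}  inverted: {inverted_bit(v)}")
# e.g. -3: 11111101 -> 10000010, i.e. the repeated sign bits in positions
# 6..2 become zeros, so those bit-slices can be packed away or skipped.
```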

    Information & Contributors

    Information

    Published In

    ISLPED '24: Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design
    August 2024
    384 pages
    ISBN:9798400706882
    DOI:10.1145/3665314
    This work is licensed under a Creative Commons Attribution 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 September 2024

    Author Tags

    1. deep neural network
    2. binary representation
    3. sparse acceleration

    Qualifiers

    • Research-article

    Conference

    ISLPED '24

    Acceptance Rates

    Overall Acceptance Rate 398 of 1,159 submissions, 34%
