Abstract:
DRAM-based near-memory architectures are recognized for their ability to deliver substantial energy efficiency and throughput on data-intensive tasks. However, DRAM's inherent area, power, and timing constraints permit the integration of only primitive processing elements with limited operations and application support. This paper introduces a DRAM-based near-memory processing architecture featuring a novel computing unit termed the neuron processing element (NPE). NPEs can perform multiple arithmetic, logical, and predicate operations. With a well-defined instruction set, NPEs can be programmed to support the standard floating-point and fixed-point data formats used in AI/ML and signal-processing applications, and they can be dynamically reconfigured to switch operations at runtime without increasing overall latency or power consumption. NPEs have a small area and power footprint compared to conventional MAC units and other functionally equivalent implementations, making them suitable for integration with DRAM without compromising its organization or timing constraints. Furthermore, this paper demonstrates substantial improvements in latency and energy consumption over prior in-memory architectures and shows the efficacy of the proposed architecture for accelerating neural network inference.
Published in: 2024 IFIP/IEEE 32nd International Conference on Very Large Scale Integration (VLSI-SoC)
Date of Conference: 06-09 October 2024
Date Added to IEEE Xplore: 03 December 2024