Abstract:
Large-scale SNNs that employ advanced network architectures such as Transformers have shown performance matching that of their ANN counterparts [1]. The inherent high sparsity and the use of accumulate (AC) operations in SNNs promise energy-efficient computing. However, the energy efficiency of SNNs is highly dependent on the data sparsity level and can deteriorate, falling even below that of resource-intensive ANNs when sparsity is low. Previous work [2] mitigates this problem with heterogeneous SNN and CNN cores but incurs increased area and power consumption due to the costly MAC array implementation. A spiking-only accelerator with enhanced energy efficiency across all sparsity levels is therefore highly desirable. To achieve this goal, three critical challenges must be addressed, as shown in Fig. 1: 1) Redundant memory accesses of weights and partial sums across time steps waste significant power and memory space. 2) Throughput degrades when exploiting unstructured spike sparsity: fetching irregularly distributed non-zero spikes and their corresponding weights one by one incurs long latency, compromising the benefit of the high sparsity in large-scale SNNs. 3) One-size-fits-all scheduling yields imbalanced efficiency across operators: a homogeneous scheduler cannot serve all operators at maximal efficiency because their computational characteristics differ.
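The AC-based, event-driven dataflow and the first two challenges can be made concrete with a small software sketch. This is a hypothetical illustration, not the paper's architecture; all names (W, spikes, v), the dimensions, and the 10% spike rate are illustrative assumptions. Note how the weight matrix and partial sums are touched again at every time step (challenge 1) and how non-zero spikes must be gathered from irregular positions one at a time (challenge 2), while the inner update itself is a multiply-free accumulation.

```python
# Minimal sketch of event-driven SNN inference (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
T, N_in, N_out = 4, 256, 128           # time steps, input/output neurons
W = rng.standard_normal((N_in, N_out)) # synaptic weights
spikes = rng.random((T, N_in)) < 0.1   # ~90%-sparse binary spike trains

v = np.zeros(N_out)                    # membrane potentials (partial sums)
for t in range(T):                     # challenge 1: W and v revisited each step
    active = np.flatnonzero(spikes[t]) # challenge 2: irregular non-zero spikes
    for i in active:                   # one weight-row fetch per spike
        v += W[i]                      # accumulate-only (AC), no multiply
    # (thresholding / membrane reset omitted for brevity)
```

With low sparsity, `active` approaches the full input range and the per-spike row fetches lose their advantage over a dense MAC, which is the sparsity-dependent efficiency cliff the abstract describes.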
Published in: 2024 IEEE Custom Integrated Circuits Conference (CICC)
Date of Conference: 21-24 April 2024
Date Added to IEEE Xplore: 15 May 2024