ISOSceles: Accelerating Sparse CNNs through Inter-Layer Pipelining


Abstract:

Sparse CNNs dramatically reduce computation and storage costs over dense ones. But sparsity also makes CNNs more data-intensive, as each value is reused fewer times. Thus, current sparse CNN accelerators, which process one layer at a time, are bottlenecked by memory traffic.

We present ISOSceles, a new sparse CNN accelerator that dramatically reduces data movement through inter-layer pipelining: overlapping the execution of consecutive layers so that a layer’s output activations are quickly consumed by the next layer without spilling them off-chip. Pipelining greatly increases reuse, but it is challenging to implement with existing approaches, which are limited to dense CNNs. ISOSceles relies on a novel input-stationary output-stationary (IS-OS) dataflow that consumes inputs and produces outputs in the same order, greatly reducing intermediate sizes over existing dataflows. ISOSceles implements IS-OS efficiently and leverages time-multiplexing and dynamic scheduling to pipeline multiple layers despite the large variations in work that sparsity induces.

On a wide range of sparse CNNs, ISOSceles outperforms a state-of-the-art accelerator by gmean 4.3× (up to 6.7×), and reduces traffic by 4.7× (up to 8.5×) while using less area.
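
As a rough illustration of why consuming inputs and producing outputs in the same order keeps intermediates small, the toy sketch below chains two 1D convolutions in two ways: one layer at a time, and as a streaming pipeline. It is not the ISOSceles design or its IS-OS dataflow. The function names, the 1D setting, and the dense arithmetic over a sparse toy vector are all assumptions made for illustration, and the sketch does not model sparsity skipping, time-multiplexing, or dynamic scheduling.

```python
from collections import deque

def conv1d_full(xs, w):
    """Layer-at-a-time: materialize the entire output before returning."""
    k = len(w)
    return [sum(w[j] * xs[i + j] for j in range(k))
            for i in range(len(xs) - k + 1)]

def conv1d_stream(xs, w, stats):
    """Streaming: consume inputs in order and emit each output as soon
    as its window is full, so outputs appear in input order."""
    window = deque(maxlen=len(w))
    for x in xs:
        window.append(x)
        stats["peak"] = max(stats["peak"], len(window))
        if len(window) == len(w):
            yield sum(wj * xj for wj, xj in zip(w, window))

def run_layer_at_a_time(xs, w1, w2):
    mid = conv1d_full(xs, w1)  # the whole intermediate is live at once
    return conv1d_full(mid, w2), len(mid)

def run_pipelined(xs, w1, w2):
    stats = {"peak": 0}  # intermediates buffered by the second layer
    mid = conv1d_stream(xs, w1, {"peak": 0})
    out = list(conv1d_stream(mid, w2, stats))
    return out, stats["peak"]

# Hypothetical toy data: a mostly-zero input and two 3-tap filters.
xs = [0, 3, 0, 0, 1, 0, 2, 0, 0, 4]
w1 = [1, 0, -1]
w2 = [2, 1, 0]

out_a, peak_a = run_layer_at_a_time(xs, w1, w2)
out_b, peak_b = run_pipelined(xs, w1, w2)
print(out_a == out_b, peak_a, peak_b)
```

In this toy setting both schedules compute the same result, but the layer-at-a-time version holds the full intermediate vector, while the pipelined version only buffers one filter window of intermediates at a time.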
Date of Conference: 25 February 2023 - 01 March 2023
Date Added to IEEE Xplore: 24 March 2023
Conference Location: Montreal, QC, Canada
