CINOC: Computing in Network-On-Chip With Tiled Many-Core Architectures for Large-Scale General Matrix Multiplications | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Large-scale general matrix multiplications (LMMs) are key bottlenecks in many computation domains, such as Transformer applications. However, performing LMMs efficiently on traditional multi/many-core processor systems is challenging because of the large volume of memory accesses and the tight dependence of data transmission. Based on an analysis of these problems, we propose a computing-in-network-on-chip paradigm that performs LMMs while mitigating the performance losses caused by limited on-chip cache resources and memory bandwidth. Specifically, we propose a co-design method for a computable network-on-chip and the last-level cache in tiled many-core architectures, which reconstructs redundant cache capacity as a computable input buffer to balance the computing, storage, and communication demands of running LMM applications. Furthermore, a data-aware thread execution mechanism is proposed to maximize the computational efficiency of thread streams in the computable network. At the software level, a memory-friendly matrix partitioning strategy, a hybrid routing method, and a programming model are designed to bridge the gap between application demands and mismatched hardware/software interfaces. Experimental evaluations demonstrate that the proposed work reduces computational latency by 45% compared with the state-of-the-art GPU architecture and improves the inference performance of the GPT network by 2×.
Page(s): 1256 - 1268
Date of Publication: 01 October 2024
