Abstract:
Matching nucleic acid sequences has long been the performance bottleneck in genome assembly, which aims to connect an enormous number of partial genome reads without prior knowledge of the reference sequence. Querying sequences with the widely adopted FM-Index data structure involves intensive, random data accesses, which lead to inefficient use of the memory system and long runtimes. Existing software FM-Index tools are limited by algorithmic inefficiency and poor processing parallelism. Solutions on GPU, FPGA and ASIC focus mainly on computational acceleration and remain bottlenecked by the memory-bound nature of FM-Index querying. This paper proposes DSIM, a scalable approach to FM-Index querying on near-DRAM accelerators. DSIM supports highly parallel multi-step query processing by distributing partial FM-Index tables to different DRAM chips. Each genome sequence is partitioned into shorter queries, and each query is dispatched to the corresponding DRAM chip for string lookup. The optimized data layout and execution control on DRAM enable high row-data reuse and minimize CPU-DRAM data transfers. A lightweight mapping scheme on the host CPU facilitates effective query distribution to DRAM chips and further supports scalability to multiple DIMMs (Dual-Inline Memory Modules). An in-DRAM arbiter is implemented to control intra-chip data processing without affecting the external memory controller or the DDR protocol. Experiments on a 128-chip DRAM system show that DSIM achieves up to 231× and 8.9× overall speedup compared to the software FM-Index tool and the state-of-the-art near-DRAM solution, respectively.
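To illustrate why FM-Index querying is memory-bound, the following is a minimal Python sketch of the textbook backward-search procedure on a toy index. It is not DSIM's design and does not model the DRAM-side partitioning described above; the function names `build_fm_index` and `count_occurrences`, the naive suffix-array construction, and the uncompressed occurrence table are all assumptions made for illustration. Each query character triggers two lookups into the occurrence table at positions that depend on the previous step, which is the data-dependent, random access pattern the abstract refers to.

```python
from collections import Counter


def build_fm_index(text):
    """Build a toy FM-Index (C table and full Occ table) for `text`.

    Illustrative only: the suffix array is built naively and the Occ
    table is stored uncompressed, which real tools avoid.
    """
    text += "$"  # unique terminator, smaller than A/C/G/T in ASCII
    # Naive suffix array; fine for a short string, too slow for genomes.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (index -1 wraps to '$').
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))

    # c_table[ch] = number of characters in text strictly smaller than ch
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in alphabet:
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i] = number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in alphabet}
    for i, b in enumerate(bwt):
        for ch in alphabet:
            occ[ch][i + 1] = occ[ch][i] + (1 if b == ch else 0)
    return c_table, occ, len(bwt)


def count_occurrences(pattern, c_table, occ, n):
    """Count occurrences of `pattern` using backward search."""
    lo, hi = 0, n  # half-open interval of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in c_table:
            return 0
        # Two Occ lookups per character, at data-dependent positions.
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

For example, after `c_table, occ, n = build_fm_index("GATTACAGATTACA")`, calling `count_occurrences("ATTA", c_table, occ, n)` counts the matches of the query `ATTA` in the indexed string.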
Published in: IEEE Journal on Emerging and Selected Topics in Circuits and Systems (Volume: 12, Issue: 2, June 2022)