A Coarse-Grain Hierarchical Technique for 2-Dimensional FFT on Configurable Parallel Computers
Xizhen XU, Sotirios G. ZIAVRAS
Publication: IEICE TRANSACTIONS on Information and Systems
Vol.E89-D No.2 pp.639-646
Publication Date: 2006/02/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
DOI: 10.1093/ietisy/e89-d.2.639
Type of Manuscript: Special Section PAPER (Special Section on Parallel/Distributed Computing and Networking)
Category: Parallel/Distributed Algorithms
Keyword: configurable computing, FPGA, SIMD, parallel processing, memory switching, FFT, hardware-software codesign
Summary:
FPGAs (Field-Programmable Gate Arrays) have been widely used as coprocessors to boost the performance of data-intensive applications [1],[2]. However, several challenges stand in the way of further boosting FPGA performance: the communication overhead between the host workstation and the FPGAs can be substantial; large-scale applications cannot fit in a single FPGA because of its limited capacity; and mapping an application algorithm to FPGAs remains a daunting job in configurable system design. To circumvent these problems, this paper proposes the FPGA-based Hierarchical-SIMD (H-SIMD) machine, codesigned with the Pyramidal Instruction Set Architecture (PISA). PISA comprises high-level instructions, implemented as FPGA functions for coarse-grain SIMD (Single-Instruction, Multiple-Data) tasks, that facilitate ease of program development, code portability across different H-SIMD implementations, and high performance. We assume a multi-FPGA board where each FPGA is configured as a separate SIMD machine. If needed, multiple FPGA chips can work in unison at a higher SIMD level, controlled by the host. Additionally, by using a memory switching scheme and the high-level PISA to partition applications into coarse-grain tasks, host-FPGA communication overheads can be hidden. We use the two-dimensional Fast Fourier Transform (2D FFT) to test the effectiveness of H-SIMD. The test results show sustained high performance for this problem, and the H-SIMD machine even outperforms a Xeon processor on it.
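
As a rough software illustration of what partitioning a 2D FFT into coarse-grain tasks can look like, the sketch below uses the standard row-column decomposition: 1D FFTs over all rows, then over all columns, with each pass split into blocks of rows. The summary does not state which decomposition the paper uses, so this is an assumption, and the names `fft1d`, `coarse_task`, `fft2d`, and `num_units` are hypothetical; none of them are PISA instructions or part of the H-SIMD design. The blocks run sequentially here, whereas on a parallel machine each block could be handed to a separate processing unit.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def coarse_task(block):
    """One coarse-grain task (hypothetical): 1D FFTs over a whole block of rows."""
    return [fft1d(row) for row in block]


def transpose(m):
    return [list(col) for col in zip(*m)]


def fft2d(matrix, num_units=4):
    """2D FFT by row-column decomposition (assumed, not taken from the paper).

    Each pass is split into row blocks, one block per hypothetical unit.
    """
    def one_pass(m):
        rows_per_unit = max(1, len(m) // num_units)
        result = []
        for start in range(0, len(m), rows_per_unit):
            # In a parallel setting, each block would go to a different unit.
            result.extend(coarse_task(m[start:start + rows_per_unit]))
        return result

    after_rows = one_pass(matrix)                 # 1D FFT of every row
    after_cols = one_pass(transpose(after_rows))  # 1D FFT of every column
    return transpose(after_cols)
```

Because every row block is independent within a pass, the only data exchange between tasks is the transpose between the two passes, which is why this kind of partitioning lends itself to coarse-grain parallel execution.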