Abstract:
Contemporary configurable architectures have dedicated internal functional units such as multipliers, high-capacity storage RAM, and even CAM blocks. These RAM blocks allow implementations to cache data that will be reused in the near future, thereby avoiding the latency of external memory accesses. We present a data allocation algorithm that utilizes the RAM blocks in the presence of a limited number of hardware registers. This algorithm, based on a compiler data reuse analysis, determines which data should be cached in the internal RAM blocks and when. Preliminary results for a set of image/signal processing kernels targeting a Xilinx Virtex™ FPGA device reveal that, despite the increased latency of accessing data in RAM blocks, designs that use them require fewer configurable resources than designs that exclusively use registers, while attaining comparable and in some cases even better performance.
Published in: Proceedings. 2004 IEEE International Conference on Field-Programmable Technology (IEEE Cat. No.04EX921)
Date of Conference: 06-08 December 2004
Date Added to IEEE Xplore: 14 February 2005
Print ISBN: 0-7803-8651-5