Abstract:
Many computation kernels that analyze large data streams can be accelerated by converting their recurrences to parallel systolic arrays. Application domains such as bioinformatics seek to minimize the total time to analyze a large set of discrete small inputs. While traditional methods for array synthesis produce a single most efficient array design, modern computational platforms support fast runtime reconfiguration that can choose among a collection of arrays optimized for different input characteristics, such as input size. In this work, we give dynamic programming algorithms to efficiently select a few array implementations from a large set of candidates so as to minimize total execution time on a dataset with a known distribution of input sizes. We apply our methods to accelerate the Nussinov RNA folding algorithm on a Xilinx Virtex 4 FPGA. Using runtime reconfiguration among five array instantiations, we are able to process a database of 2.7 billion RNA bases in 72 seconds, which is 48% faster than using a single array and 252 times faster than comparable software. We demonstrate substantial efficiency benefits even when the input length distribution is biased toward low-throughput arrays, when reconfiguration time is as large as half a second, and when only a small number of distinct arrays may be used.
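The selection problem described in the abstract can be illustrated with a small dynamic program. The sketch below is not the paper's formulation; it rests on a simplified, hypothetical cost model: each candidate array has a maximum input length (`capacities`, sorted ascending) and a fixed per-input processing time (`time_per_input`), every input runs on the smallest selected array that fits it, inputs are batched by size, and each additional selected array costs one reconfiguration (`reconfig_time`). The function name `select_arrays` and all parameter names are illustrative.

```python
from math import inf


def select_arrays(capacities, time_per_input, size_counts, k, reconfig_time=0.0):
    """Choose at most k candidate arrays minimizing total time (assumed model).

    capacities     -- ascending maximum input length per candidate (assumption)
    time_per_input -- time for candidate j to process one input (assumption)
    size_counts    -- {input_length: number_of_inputs}, the known distribution
    Returns (total_time, indices_of_chosen_candidates).
    """
    n = len(capacities)
    total = sum(size_counts.values())
    # covered[j]: how many inputs are short enough to fit on candidate j
    covered = [sum(c for length, c in size_counts.items() if length <= cap)
               for cap in capacities]

    # best[m][j]: minimum time to process every input that fits on candidate j
    # using exactly m arrays, of which j is the largest
    best = [[inf] * n for _ in range(k + 1)]
    prev = [[None] * n for _ in range(k + 1)]
    for j in range(n):
        best[1][j] = time_per_input[j] * covered[j]
    for m in range(2, k + 1):
        for j in range(n):
            for i in range(j):
                # inputs too long for i but fitting on j run on candidate j
                cost = (best[m - 1][i] + reconfig_time
                        + time_per_input[j] * (covered[j] - covered[i]))
                if cost < best[m][j]:
                    best[m][j] = cost
                    prev[m][j] = i

    # the largest chosen array must be able to hold every input
    end = min(((best[m][j], m, j)
               for m in range(1, k + 1) for j in range(n)
               if covered[j] == total), default=None)
    if end is None:
        raise ValueError("no candidate array fits the largest input")
    cost, m, j = end

    # walk the predecessor links back to recover the chosen set
    chosen = []
    while j is not None:
        chosen.append(j)
        j, m = prev[m][j], m - 1
    return cost, chosen[::-1]
```

Under this model, a call such as `select_arrays([10, 20, 40], [1, 3, 10], {5: 100, 15: 10, 35: 1}, k=3, reconfig_time=100)` weighs the time saved by running short inputs on a small, fast array against the cost of each extra reconfiguration. The triple loop takes O(k n²) time for n candidates, which is a property of this sketch and not a claim about the paper's algorithms.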
Date of Conference: 31 August 2009 - 02 September 2009
Date Added to IEEE Xplore: 29 September 2009
CD: 978-1-4244-3892-1