Abstract:
Stride permutation is widely used in various digital signal processing algorithms when implemented on FPGAs. Permuting a long data sequence through hardware wiring leads to high area consumption and routing complexity. A preferable approach is to build a hardware structure to permute streaming data inputs. In this paper, we present an energy-efficient architecture to perform stride permutation on streaming data. The supported problem size and stride are powers of two. A three-stage structure, composed of two stages of interconnection networks and one stage of data buffers, is used as a baseline architecture. To improve the energy efficiency, we develop a data remapping technique which reduces the required memory by 50% at the expense of a small amount of extra logic. We also present a multiplexer-based cyclic shift interconnection network. Our proposed architecture is evaluated using two performance metrics: composite Energy × Area × Time (EAT) and energy efficiency (defined as points/Joule). The experimental results show that the proposed data remapping technique reduces dynamic power consumption by up to 40% compared with the baseline architecture. The proposed architecture achieves a high energy efficiency of up to 75.3 giga points/Joule, and has an EAT ratio of 0.31 to 0.35 relative to the baseline architecture for various streaming widths w (2 ≤ w ≤ 32).
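For readers unfamiliar with the operation, the following is a minimal software sketch of what a stride permutation computes. It assumes the common textbook definition (read the input at stride s, starting from offsets 0, 1, ..., s-1), which the abstract does not spell out. The function names stride_permute and stride_permute_bits are hypothetical, and the sketch does not model the paper's streaming architecture, buffers, or interconnection networks.

```python
def stride_permute(x, s):
    """Stride-s permutation of x (assumed definition): take every s-th
    element starting at offset 0, then at offset 1, ..., then offset s-1."""
    n = len(x)
    if s <= 0 or n % s != 0:
        raise ValueError("stride must be positive and divide the length")
    return [x[i] for r in range(s) for i in range(r, n, s)]


def stride_permute_bits(x, s):
    """Same permutation for power-of-two length and stride, expressed as a
    left rotation of the output index bits by log2(s) positions."""
    n = len(x)
    if n == 0 or n & (n - 1) or s <= 0 or s & (s - 1) or s > n:
        raise ValueError("length and stride must be powers of two, s <= n")
    n_bits = n.bit_length() - 1   # log2(n)
    k = s.bit_length() - 1        # log2(s)
    mask = n - 1
    # Output position j reads the input at index rotl(j, k) over n_bits bits.
    return [x[((j << k) | (j >> (n_bits - k))) & mask] for j in range(n)]


if __name__ == "__main__":
    print(stride_permute(list(range(8)), 2))  # [0, 2, 4, 6, 1, 3, 5, 7]
```

When the problem size and stride are both powers of two, as the abstract requires, the permutation reduces to a rotation of index bits; the second function illustrates that view and should agree with the first for all valid inputs.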
Date of Conference: 09-11 December 2013
Date Added to IEEE Xplore: 06 February 2014
ISBN Information:
Print ISSN: 2325-6532