ABSTRACT
In this paper, we present techniques for energy-efficient design at the algorithm level using FPGAs. We then use these techniques to create energy-efficient designs for two signal processing kernels: fast Fourier transform (FFT) and matrix multiplication. We evaluate the performance of FPGAs on these tasks in terms of both latency and energy efficiency. Using a Xilinx Virtex-II as the target FPGA, we compare our designs with those from the Xilinx library and with conventional algorithms run on the PowerPC core embedded in the Virtex-II Pro and on the Texas Instruments TMS320C6415. We carry out the evaluations both through estimation based on energy and latency equations and through low-level simulation. For FFT, our designs dissipated, on average, 60% less energy than the design from the Xilinx library and 56% less than the DSP. Our designs showed a factor of 10 improvement over the embedded processor. These results provide concrete evidence that FPGAs can outperform DSPs and embedded processors in signal processing, and that they can do so while dissipating less energy than either of the other two types of devices.