Abstract
This paper describes a processor architecture tailored for radix-4 and mixed-radix FFT computations. The processor natively supports power-of-two transform sizes. Several optimizations improve its energy efficiency, and experiments show that a programmable solution can achieve energy efficiency comparable to that of fixed-function ASICs.
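To illustrate why power-of-two sizes call for both radix-4 and mixed-radix computation, the following is a minimal software sketch, not the processor's actual dataflow or instruction schedule. It assumes a recursive decimation-in-time formulation; the function name `fft_mixed_radix` is hypothetical. Radix-4 splits apply whenever the length is divisible by four, and a single radix-2 butterfly remains when the length is an odd power of two (for example, 8 or 32).

```python
import cmath


def fft_mixed_radix(x):
    """Recursive decimation-in-time FFT for power-of-two lengths.

    Illustrative assumption: radix-4 splits are used while the length is
    divisible by 4, and a radix-2 butterfly handles a remaining length of 2.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a nonzero power of two")
    if n == 1:
        return [complex(x[0])]
    if n == 2:
        # Radix-2 butterfly, reached only when log2 of the original
        # length is odd.
        return [x[0] + x[1], x[0] - x[1]]

    # Radix-4 split: four interleaved subsequences of length n/4.
    q = n // 4
    subs = [fft_mixed_radix(x[r::4]) for r in range(4)]
    out = [0j] * n
    for k in range(q):
        w1 = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_n^k
        a = subs[0][k]
        b = w1 * subs[1][k]
        c = w1 * w1 * subs[2][k]
        d = w1 * w1 * w1 * subs[3][k]
        # Radix-4 butterfly: multiplications by +/-1 and +/-j only.
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d
    return out
```

In this sketch a 16-point transform uses two radix-4 levels, whereas an 8-point transform uses one radix-4 level on top of radix-2 butterflies, which is the mixed-radix case.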
Acknowledgement
This work has been supported in part by the Academy of Finland under funding decision 205743.
Pitkänen, T.O., Takala, J. Low-Power Application-Specific Processor for FFT Computations. J Sign Process Syst 63, 165–176 (2011). https://doi.org/10.1007/s11265-010-0528-z
DOI: https://doi.org/10.1007/s11265-010-0528-z