Abstract
The fast Fourier transform (FFT) is widely used in modern wireless communication and digital signal processing. Because memory access is a major cause of power dissipation in long-length FFT architectures, this paper explores in detail the design space spanned by FFT size and radix, and presents a novel low-memory-access, length-adaptive architecture for computing any long-length 2\(^n\)-point FFT. Three features distinguish the proposed hardware solution from existing designs. First, memory is identified as the major source of energy dissipation in an FFT processor, and memory access is reduced by decreasing the number of FFT butterfly stages. Second, the design concept of programmable processors is adopted so that the hardware can be configured dynamically for variable-length FFTs without sacrificing hardware utilization, in contrast to feed-forward architectures. Third, a 16-bank memory organization is proposed to achieve conflict-free FFT operations for various radixes. Such a low-memory-access, length-adaptive architecture can reduce memory access by almost 70 %, or power consumption by 30 %, for FFT computation. Implemented in 1P6M TSMC 0.18-\(\upmu \)m CMOS technology, the design occupies a core area of only 4.49 mm\(^{2}\) and meets the real-time FFT performance requirements of DVB-T2 systems when operated at 20 MHz. It consumes only 1.44 nJ of energy per sample for computing FFTs. By combining the proposed low-memory-access algorithm, the flexible length-adaptive architecture, and the efficient 16-bank memory organization, 56 % of the power dissipation of the whole FFT chip can be saved.
Acknowledgments
The author would like to thank the National Chip Implementation Center (CIC) of Taiwan for its help with chip fabrication and measurement.
Appendix: Fetching Data for Radix-2, Radix-4, and Radix-8 FFTs
This appendix gives the addressing modes used to fetch data from the memory into the data-path for FFT operations based on radix-2, radix-4, and radix-8, respectively. The first mode is for the radix-2 FFT, in which each \(x(n)\) is fetched from address Addr[\(x(n)\)] inside memory bank Bank[\(x(n)\)] and sent to the FFT data-path.
The partial timing diagram is shown in Fig. 11. The bank numbers of the active pair of banks stay unchanged for 512 cycles, while the address increments every clock cycle. Once all the data in the two active banks have been fetched, the bank numbers increment and fetching continues from the next pair of banks.
The second mode is for the radix-4 FFT, in which \(x(n)\) are fetched from memory to the FFT data-path for computation.
The partial timing diagram of data fetching for the radix-4 algorithm is shown in Fig. 12. Two pairs of banks form a basic unit, and the data inside the unit are fetched in order over 1,024 cycles, so the address increments every two clock cycles. Once all the data in the active unit have been fetched, the bank numbers increment and fetching continues from the next unit.
The third mode is for the radix-8 FFT, in which \(x(n)\) are fetched from the memory locations shown in Fig. 13 and sent to the FFT data-path for computation. Four pairs of banks form a basic unit, and the data inside the unit are fetched in order over 2,048 cycles, so the address increments every four clock cycles. Once all the data in the active unit have been fetched, the bank numbers increment and fetching continues from the next unit.
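The three fetch patterns described in this appendix can be illustrated with a small behavioral model. The following Python sketch is not the paper's addressing equations; it only reproduces the cycle counts stated in the text. It rests on several assumptions that do not come from the source: banks 2k and 2k+1 are taken to form pair k, the pairs inside a unit are visited round-robin (one pair per clock cycle), and each bank is swept over 512 words, a depth inferred from the 512-cycle radix-2 sweep. The names `fetch_schedule`, `WORDS_PER_BANK`, and `NUM_BANKS` are hypothetical.

```python
from typing import Iterator, Tuple

NUM_BANKS = 16        # 16-bank organization stated in the abstract
WORDS_PER_BANK = 512  # assumption: inferred from the 512-cycle radix-2 sweep


def fetch_schedule(
    radix: int, words_per_bank: int = WORDS_PER_BANK
) -> Iterator[Tuple[int, Tuple[int, int], int]]:
    """Yield (cycle, (bank_a, bank_b), address) for one pass over all banks.

    Assumptions (not from the paper): banks 2k and 2k+1 form pair k, and the
    pairs inside a unit are visited round-robin, one pair per clock cycle.
    """
    if radix not in (2, 4, 8):
        raise ValueError("radix must be 2, 4, or 8")

    pairs_per_unit = radix // 2            # 1, 2, or 4 pairs form a unit
    cycles_per_unit = words_per_bank * pairs_per_unit
    num_units = NUM_BANKS // (2 * pairs_per_unit)

    for cycle in range(num_units * cycles_per_unit):
        unit, offset = divmod(cycle, cycles_per_unit)
        # The address advances only after every pair in the unit was visited.
        address, pair_in_unit = divmod(offset, pairs_per_unit)
        pair = unit * pairs_per_unit + pair_in_unit
        yield cycle, (2 * pair, 2 * pair + 1), address
```

Under these assumptions, `fetch_schedule(2)` keeps one bank pair active for 512 cycles, `fetch_schedule(4)` sweeps a two-pair unit in 1,024 cycles with the address advancing every two cycles, and `fetch_schedule(8)` sweeps a four-pair unit in 2,048 cycles with the address advancing every four cycles, matching the behavior described for Figs. 11, 12, and 13.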
Cite this article
Chen, KH. A Low-Memory-Access Length-Adaptive Architecture for 2\(^n\)-Point FFT. Circuits Syst Signal Process 34, 459–482 (2015). https://doi.org/10.1007/s00034-014-9862-x
DOI: https://doi.org/10.1007/s00034-014-9862-x