Abstract
The Fast Fourier Transform (FFT) is a fundamental kernel in many computationally intensive scientific applications. In this paper we investigate its performance on the Sony-Toshiba-IBM Cell Broadband Engine, a heterogeneous multicore chip architected for intensive gaming applications and high performance computing. The Cell processor consists of a traditional microprocessor (called the PPE) that controls eight SIMD co-processing units called synergistic processor elements (SPEs). We exploit the architectural features of the Cell processor to design an efficient parallel implementation of the FFT. While there have been several attempts to develop a fast implementation of the FFT on the Cell, none has achieved high performance for input series with several thousand complex points. We use an iterative out-of-place approach to design our parallel implementation of the FFT for 1K to 16K complex input samples and attain a single-precision performance of 18.6 GFLOP/s on the Cell. Our implementation beats FFTW on the Cell by several GFLOP/s for these input sizes and outperforms the Intel Duo Core (Woodcrest) for inputs of more than 2K samples. To our knowledge we have the fastest FFT for this range of complex inputs.
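To make the phrase "iterative out-of-place approach" concrete, the following is a minimal serial sketch in Python of one well-known algorithm of that kind, the radix-2 Stockham autosort FFT. It is an illustration only and is not the authors' Cell implementation: the function name `fft_stockham` is hypothetical, the choice of the Stockham variant is an assumption, and nothing here models the SPEs, SIMD vectorization, or data partitioning that the paper relies on.

```python
import cmath


def fft_stockham(x):
    """Iterative out-of-place radix-2 FFT (Stockham autosort variant).

    Illustrative sketch only; assumes len(x) is a power of two.
    The input sequence is left unmodified.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    src = [complex(v) for v in x]  # read buffer for the current stage
    dst = [0j] * n                 # write buffer for the current stage

    half = n // 2    # half the length of each sub-transform at this stage
    stride = 1       # number of interleaved sub-transforms
    while half >= 1:
        for j in range(half):
            # Twiddle factor for butterfly j of a sub-transform of length 2*half.
            w = cmath.exp(-2j * cmath.pi * j / (2 * half))
            for k in range(stride):
                a = src[k + stride * j]
                b = src[k + stride * (j + half)]
                dst[k + 2 * stride * j] = a + b
                dst[k + 2 * stride * j + stride] = (a - b) * w
        # Swap buffers instead of writing back in place.
        src, dst = dst, src
        half //= 2
        stride *= 2
    return src
```

Each stage reads from one buffer and writes to the other, so the result comes out in natural order with no separate bit-reversal pass. That property is one common reason to prefer an out-of-place formulation, at the cost of a second buffer of the same size.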
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bader, D.A., Agarwal, V. (2007). FFTC: Fastest Fourier Transform for the IBM Cell Broadband Engine. In: Aluru, S., Parashar, M., Badrinath, R., Prasanna, V.K. (eds) High Performance Computing – HiPC 2007. HiPC 2007. Lecture Notes in Computer Science, vol 4873. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77220-0_19
DOI: https://doi.org/10.1007/978-3-540-77220-0_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77219-4
Online ISBN: 978-3-540-77220-0
eBook Packages: Computer Science (R0)