Abstract
The fast Fourier transform (FFT) is an efficient algorithm for computing the discrete Fourier transform (DFT). The FFT is widely used in engineering, science, and mathematics. This chapter introduces the fundamentals of the FFT and its implementation on parallel computers. Parallel computation is becoming indispensable for solving the large-scale problems that arise in a wide variety of applications. The chapter gives a detailed explanation of the FFT for parallel computers. The algorithms are presented in pseudocode, together with a complexity analysis. The chapter also covers up-to-date computational techniques for the FFT on state-of-the-art processors.
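To make the relation between the DFT and the FFT concrete, the following is a minimal Python sketch, not the chapter's own pseudocode. It assumes the common forward-transform sign convention (a negative exponent) and an input length that is a power of two; the function names `dft` and `fft` are illustrative. The direct evaluation takes O(n^2) operations, while the recursive radix-2 decimation-in-time form takes O(n log n).

```python
import cmath


def dft(x):
    """Direct O(n^2) evaluation of the DFT (assumed convention: exp(-2*pi*i*j*k/n))."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of the even-indexed samples
    odd = fft(x[1::2])   # transform of the odd-indexed samples
    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor applied to the odd half, then one butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 0, 0, 0, 0]
    slow, fast = dft(samples), fft(samples)
    # Both routes should agree up to floating-point rounding.
    print(max(abs(a - b) for a, b in zip(slow, fast)) < 1e-9)
```

The recursion is chosen here for readability only; production libraries typically use iterative, cache-aware, mixed-radix variants.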
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Takahashi, D. (2019). Fast Fourier Transform in Large-Scale Systems. In: Geshi, M. (eds) The Art of High Performance Computing for Computational Science, Vol. 1. Springer, Singapore. https://doi.org/10.1007/978-981-13-6194-4_8
DOI: https://doi.org/10.1007/978-981-13-6194-4_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6193-7
Online ISBN: 978-981-13-6194-4
eBook Packages: Computer Science (R0)