Abstract
In this paper, we propose a blocking algorithm for parallel one-dimensional fast Fourier transforms (FFTs) on shared-memory parallel computers. The proposed algorithm is based on the six-step FFT algorithm. The resulting block six-step FFT algorithm improves performance by making effective use of cache memory. We report performance results for one-dimensional FFTs on the SGI Onyx 3400 and the Sun Enterprise 6000. We achieved about 1929 MFLOPS on the SGI Onyx 3400 (MIPS R12000 400 MHz, 16 CPUs) and about 520 MFLOPS on the Sun Enterprise 6000 (UltraSPARC 168 MHz, 16 CPUs).
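For orientation, the conventional six-step FFT that the abstract builds on can be sketched as follows. This is a minimal serial illustration, not the paper's blocked or parallel implementation: the abstract gives no details of the blocking scheme, so none is reproduced here. The function names (`dft`, `transpose`, `six_step_fft`) and the factorization parameters `n1` and `n2` are assumptions chosen for this example, and a naive DFT stands in for the short row FFTs to keep the sketch self-contained.

```python
import cmath


def dft(v):
    """Naive O(m^2) DFT, used here as a stand-in for the short row FFTs."""
    m = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / m) for j in range(m))
        for k in range(m)
    ]


def transpose(rows):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*rows)]


def six_step_fft(x, n1, n2):
    """Length-(n1*n2) DFT of x via the six-step decomposition."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")

    # View x as an n2-by-n1 matrix: a[j2][j1] = x[j1 + j2 * n1].
    a = [list(x[j2 * n1:(j2 + 1) * n1]) for j2 in range(n2)]

    b = transpose(a)                      # Step 1: transpose to n1 rows of length n2
    c = [dft(row) for row in b]           # Step 2: n1 transforms of length n2

    # Step 3: multiply element (j1, k2) by the twiddle factor exp(-2*pi*i*j1*k2/n).
    for j1 in range(n1):
        for k2 in range(n2):
            c[j1][k2] *= cmath.exp(-2j * cmath.pi * j1 * k2 / n)

    d = transpose(c)                      # Step 4: transpose to n2 rows of length n1
    e = [dft(row) for row in d]           # Step 5: n2 transforms of length n1
    f = transpose(e)                      # Step 6: transpose so f[k1][k2] = y[k2 + k1 * n2]

    return [value for row in f for value in row]
```

The three transposes in steps 1, 4, and 6 each sweep the whole array, which is the memory traffic a cache-blocked variant aims to reduce. In this sketch the row transforms in steps 2 and 5 are independent of one another, so they are the natural candidates for distribution across processors.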
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takahashi, D. (2002). A Blocking Algorithm for Parallel 1-D FFT on Shared-Memory Parallel Computers. In: Fagerholm, J., Haataja, J., Järvinen, J., Lyly, M., Råback, P., Savolainen, V. (eds) Applied Parallel Computing. PARA 2002. Lecture Notes in Computer Science, vol 2367. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48051-X_38
DOI: https://doi.org/10.1007/3-540-48051-X_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43786-4
Online ISBN: 978-3-540-48051-8
eBook Packages: Springer Book Archive