Abstract
In this paper, we propose a high-performance parallel three-dimensional fast Fourier transform (FFT) algorithm for clusters of vector symmetric multiprocessor (SMP) nodes. The three-dimensional FFT can be reformulated as a multirow FFT, which lengthens the innermost loop, and we use this multirow formulation to implement the parallel algorithm. We report performance results for three-dimensional power-of-two FFTs on a cluster of (pseudo) vector SMP nodes, the Hitachi SR8000, where we obtained about 40 GFLOPS on 16 nodes.
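The multirow idea can be illustrated with a small serial sketch. It is not the paper's implementation: the function names `multirow_fft` and `fft3d_multirow`, the recursive radix-2 decimation-in-time scheme, and the use of NumPy are all assumptions made for illustration. The point it shows is that each of the three passes flattens the two non-transformed dimensions into one long axis, so every butterfly operates on vectors of length n2*n3 (or n1*n3, or n1*n2) rather than on a single short row.

```python
import numpy as np


def multirow_fft(rows):
    """Radix-2 FFT along axis 0 of an (n, m) array (hypothetical helper).

    All m transforms are carried out together, so every arithmetic
    operation below acts on vectors of length m (the "multirow" length).
    """
    n = rows.shape[0]
    if n == 1:
        return rows.copy()
    if n % 2:
        raise ValueError("transform length must be a power of two")
    even = multirow_fft(rows[0::2])   # transforms of even-indexed samples
    odd = multirow_fft(rows[1::2])    # transforms of odd-indexed samples
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)[:, None]
    t = twiddle * odd                 # broadcast over the m simultaneous rows
    return np.concatenate([even + t, even - t], axis=0)


def fft3d_multirow(a):
    """3-D FFT as three multirow passes, one per axis (illustrative sketch)."""
    a = np.asarray(a, dtype=complex)
    for axis in range(3):
        moved = np.moveaxis(a, axis, 0)           # transform axis to the front
        shape = moved.shape
        flat = moved.reshape(shape[0], -1)        # merge the other two axes
        a = np.moveaxis(multirow_fft(flat).reshape(shape), 0, axis)
    return a


# Small check against NumPy's reference implementation.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 16)) + 1j * rng.standard_normal((8, 4, 16))
print(np.allclose(fft3d_multirow(x), np.fft.fftn(x)))
```

On a vector processor the long flattened axis is what the innermost loop would run over; in this sketch NumPy's broadcasting plays that role. The distribution of data across nodes, which the parallel algorithm requires, is not modeled here.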
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takahashi, D. (2001). A Parallel 3-D FFT Algorithm on Clusters of Vector SMPs. In: Sørevik, T., Manne, F., Gebremedhin, A.H., Moe, R. (eds) Applied Parallel Computing. New Paradigms for HPC in Industry and Academia. PARA 2000. Lecture Notes in Computer Science, vol 1947. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-70734-4_37
DOI: https://doi.org/10.1007/3-540-70734-4_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41729-3
Online ISBN: 978-3-540-70734-9
eBook Packages: Springer Book Archive