
FFTs in external or hierarchical memory

The Journal of Supercomputing

Abstract

Conventional algorithms for computing large one-dimensional fast Fourier transforms (FFTs), even those recently developed for vector and parallel computers, are largely unsuitable for systems with external or hierarchical memory. The principal reason is that most FFT algorithms require at least m complete passes through the data set to compute a 2^m-point FFT. This paper describes some advanced techniques for computing an ordered FFT on a computer with external or hierarchical memory. These algorithms (1) require as few as two passes through the external data set, (2) employ strictly unit stride, long vector transfers between main memory and external storage, (3) require only a modest amount of scratch space in main memory, and (4) are well suited for vector and parallel computation.

Performance figures are included for implementations of some of these algorithms on Cray supercomputers. Notably, a main memory version outperforms the current Cray library FFT routines on the CRAY-2, the CRAY X-MP, and the CRAY Y-MP systems. Using all eight processors on the CRAY Y-MP, this main memory routine runs at nearly two gigaflops.
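As a rough illustration of how a large transform can be organized into two sweeps over the data, the sketch below uses the standard "four-step" factorization of an n = n1*n2 point DFT: short transforms along one dimension, a twiddle-factor multiplication, then short transforms along the other dimension. It is not taken from the paper itself. The names `dft` and `four_step_fft`, the naive O(n^2) kernel standing in for an in-memory FFT, and the Python lists standing in for external storage are all assumptions made for illustration. The sketch also uses strided indexing for brevity, so it does not model the unit stride transfers the abstract describes.

```python
import cmath


def dft(v):
    """Naive DFT; a stand-in for whatever in-memory FFT kernel is available."""
    n = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """Ordered DFT of x (length n1*n2) using two sweeps over the data.

    Input index is split as j = j1 + n1*j2, output index as k = k2 + n2*k1.
    """
    n = n1 * n2
    assert len(x) == n

    # Sweep 1: for each j1, transform the n2 samples x[j1 + n1*j2] over j2,
    # then scale entry k2 by the twiddle factor exp(-2*pi*i*j1*k2/n).
    rows = []
    for j1 in range(n1):
        row = dft([x[j1 + n1 * j2] for j2 in range(n2)])
        rows.append(
            [row[k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n) for k2 in range(n2)]
        )

    # Sweep 2: for each k2, transform the n1 intermediate values over j1
    # and store the results in natural (ordered) output positions.
    out = [0j] * n
    for k2 in range(n2):
        col = dft([rows[j1][k2] for j1 in range(n1)])
        for k1 in range(n1):
            out[k2 + n2 * k1] = col[k1]
    return out
```

For example, `four_step_fft(x, 3, 4)` on a 12-element list should agree with `dft(x)` up to rounding error. In an out-of-core setting, each sweep would correspond to one pass over the external data set, with only a block of short transforms resident in main memory at a time.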




Additional information

A condensed version of this paper previously appeared in the Proceedings of Supercomputing '89.


About this article

Cite this article

Bailey, D.H. FFTs in external or hierarchical memory. J Supercomput 4, 23–35 (1990). https://doi.org/10.1007/BF00162341


  • DOI: https://doi.org/10.1007/BF00162341
