Abstract
This paper presents a new approach for overlapping all communication-intensive steps of the four-step FFT algorithm (initial data distribution, matrix transpose, and final data collection) with computation. The method is based on a Kronecker product factorization of the four-step FFT algorithm.
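For orientation, the sketch below shows the standard, non-overlapped four-step decomposition that the abstract refers to, for a transform of length n = n1 * n2. It is not the authors' overlapped method, which the abstract does not spell out. The names `dft` and `four_step_fft` are hypothetical, and the naive `dft` helper is an assumption that stands in for any short FFT routine. In a distributed setting the transpose in step 3 is the communication-intensive part.

```python
import cmath


def dft(v):
    """Naive O(n^2) DFT; a stand-in for any short FFT routine (assumption)."""
    n = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def four_step_fft(x, n1, n2):
    """Length-(n1*n2) DFT of x via the textbook four-step scheme."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")

    # View x as an n1-by-n2 matrix with a[j1][j2] = x[j1 + n1*j2].
    a = [[x[j1 + n1 * j2] for j2 in range(n2)] for j1 in range(n1)]

    # Step 1: n1 independent transforms of length n2 (one per row).
    b = [dft(row) for row in a]

    # Step 2: multiply entry (j1, k2) by the twiddle factor exp(-2*pi*i*j1*k2/n).
    c = [
        [b[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n) for k2 in range(n2)]
        for j1 in range(n1)
    ]

    # Step 3: transpose to an n2-by-n1 matrix.
    ct = [[c[j1][k2] for j1 in range(n1)] for k2 in range(n2)]

    # Step 4: n2 independent transforms of length n1 (one per row).
    d = [dft(row) for row in ct]

    # Output ordering: d[k2][k1] holds X[k2 + n2*k1].
    return [d[k % n2][k // n2] for k in range(n)]
```

Comparing `four_step_fft(x, n1, n2)` against `dft(x)` on a short input is a quick way to check the index mapping; the two should agree up to rounding error.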
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karner, H., Ueberhuber, C.W. (1999). Overlapped Four-Step FFT Computation. In: Zinterhof, P., Vajteršic, M., Uhl, A. (eds) Parallel Computation. ACPC 1999. Lecture Notes in Computer Science, vol 1557. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49164-3_64
DOI: https://doi.org/10.1007/3-540-49164-3_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65641-8
Online ISBN: 978-3-540-49164-4
eBook Packages: Springer Book Archive