Skip to main content

An OpenMP Implementation of Parallel FFT and Its Performance on IA-64 Processors

  • Conference paper
  • First Online:
OpenMP Shared Memory Parallel Programming (WOMPAT 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2716))

Included in the following conference series:

Abstract

In this paper, we propose an OpenMP implementation of a recursive algorithm for parallel fast Fourier transform (FFT) on shared-memory parallel computers. A recursive three-step FFT algorithm improves performance by effectively utilizing the cache memory. Performance results of one-dimensional FFTs on the DELL PowerEdge 7150 and the hp workstation zx6000 are reported. We successfully achieved performance of about 757MFLOPS on the DELL PowerEdge 7150 (Itanium 800MHz, 4CPUs) and about 871MFLOPS on the hp workstation zx6000 (Itanium2 1GHz, 2CPUs) for 224-point FFT.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19 (1965) 297–301

    Article  MATH  MathSciNet  Google Scholar 

  2. Swarztrauber, P.N.: Multiprocessor FFTs. Parallel Computing 5 (1987) 197–210

    Article  MATH  MathSciNet  Google Scholar 

  3. Bailey, D.H.: FFTs in external or hierarchical memory. The Journal of Supercomputing 4 (1990) 23–35

    Article  Google Scholar 

  4. Van Loan, C.: Computational Frameworks for the Fast Fourier Transform. SIAM Press, Philadelphia, PA (1992)

    MATH  Google Scholar 

  5. Wadleigh, K.R.: High performance FFT algorithms for cache-coherent multiprocessors. The International Journal of High Performance Computing Applications 13 (1999) 163–171

    Article  Google Scholar 

  6. Takahashi, D.: A blocking algorithm for parallel 1-D FFT on shared-memory parallel computers. In: Proc. 6th International Conference on Applied Parallel Computing (PARA 2002). Volume 2367 of Lecture Notes in Computer Science., Springer-Verlag (2002) 380–389

    Google Scholar 

  7. Hegland, M.: A self-sorting in-place fast Fourier transform algorithm suitable for vector and parallel processing. Numerische Mathematik 68 (1994) 507–547

    Article  MATH  MathSciNet  Google Scholar 

  8. Frigo, M., Johnson, S.G.: FFTW: An adaptive software architecture for the FFT. In: Proc. 1998 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP98). (1998) 1381–1384

    Google Scholar 

  9. Panda, P.R., Nakamura, H., Dutt, N.D., Nicolau, A.: Augmenting loop tiling with data alignment for improved cache performance. IEEE Transactions on Computers 48 (1999) 142–149

    Article  Google Scholar 

  10. Swarztrauber, P.N.: FFT algorithms for vector computers. Parallel Computing 1 (1984) 45–63

    Article  MATH  Google Scholar 

  11. Tanaka, Y., Taura, K., Sato, M., Yonezawa, A.: Performance evaluation of OpenMP applications with nested parallelism. In: Proc. 5th Workshop on Languages, Compilers, and Run-Time Systems for Scalable Computers (LCR 2000). Volume 1915 of Lecture Notes in Computer Science., Springer-Verlag (2000) 100–112

    Chapter  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Takahashi, D., Sato, M., Boku, T. (2003). An OpenMP Implementation of Parallel FFT and Its Performance on IA-64 Processors. In: Voss, M.J. (eds) OpenMP Shared Memory Parallel Programming. WOMPAT 2003. Lecture Notes in Computer Science, vol 2716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45009-2_8

Download citation

  • DOI: https://doi.org/10.1007/3-540-45009-2_8

  • Published:

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40435-4

  • Online ISBN: 978-3-540-45009-2

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics