Implementation of Parallel 3-D Real FFT with 2-D Decomposition on Intel Xeon Phi Clusters

  • Conference paper
Parallel Processing and Applied Mathematics (PPAM 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12043)

Abstract

In this paper, we propose an implementation of a parallel 3-D real fast Fourier transform (FFT) with 2-D decomposition on Intel Xeon Phi clusters. The implementation is based on the conjugate symmetry property of the discrete Fourier transform (DFT) and the row-column FFT algorithm. We vectorized the FFT kernels using Intel Advanced Vector Extensions 512 (Intel AVX-512) instructions. We report performance results for parallel 3-D real FFTs on Intel Xeon Phi clusters. We achieved over 10 TFlops on 2048 nodes of a Fujitsu PRIMERGY CX1640 M1 cluster for an \(8192^3\)-point FFT.
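The two ideas named in the abstract can be illustrated with a small serial sketch. It is not the paper's implementation: it uses a naive \(O(n^2)\) DFT in place of vectorized FFT kernels, runs on a single process with no 2-D decomposition or communication, and all function names (`dft`, `real_dft_half`, `expand_half`, `rfft3d_row_column`) are hypothetical. It shows only that a real input of length \(n\) needs just \(\lfloor n/2 \rfloor + 1\) outputs along one axis, because \(X[n-k] = \overline{X[k]}\), and that a 3-D transform can be built from 1-D transforms applied along each axis in turn.

```python
import cmath


def dft(x, m=None):
    """Naive O(n^2) 1-D DFT; a stand-in for an optimized FFT kernel.

    Returns only the first m outputs (all n outputs when m is None).
    """
    n = len(x)
    if m is None:
        m = n
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(m)
    ]


def real_dft_half(x):
    """DFT of a real sequence, keeping the n//2 + 1 non-redundant outputs."""
    return dft(x, len(x) // 2 + 1)


def expand_half(half, n):
    """Rebuild the full spectrum using conjugate symmetry X[n-k] = conj(X[k])."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full


def rfft3d_row_column(a):
    """Row-column 3-D real DFT of a nested list a[i][j][k] (n1 x n2 x n3).

    The result has shape n1 x n2 x (n3//2 + 1).
    """
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    h = n3 // 2 + 1
    # Step 1: real-to-complex transforms along the last axis.
    b = [[real_dft_half(a[i][j]) for j in range(n2)] for i in range(n1)]
    # Step 2: complex transforms along the middle axis.
    for i in range(n1):
        for k in range(h):
            col = dft([b[i][j][k] for j in range(n2)])
            for j in range(n2):
                b[i][j][k] = col[j]
    # Step 3: complex transforms along the first axis.
    for j in range(n2):
        for k in range(h):
            col = dft([b[i][j][k] for i in range(n1)])
            for i in range(n1):
                b[i][j][k] = col[i]
    return b
```

Calling `rfft3d_row_column(a)` on an `n1 x n2 x n3` nested list of floats returns roughly half as many complex values as a full complex 3-D transform would, and `expand_half` recovers the omitted outputs along the last axis when they are needed. In a distributed version, data would typically have to be redistributed between the three steps so that each process holds complete lines along the axis being transformed; that part is omitted here.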

Acknowledgments

This research used computational resources of the Oakforest-PACS provided by the Multidisciplinary Cooperative Research Program in Center for Computational Sciences, University of Tsukuba. This research was partially supported by JSPS KAKENHI Grant Number JP19K11989.

Author information

Corresponding author

Correspondence to Daisuke Takahashi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Takahashi, D. (2020). Implementation of Parallel 3-D Real FFT with 2-D Decomposition on Intel Xeon Phi Clusters. In: Wyrzykowski, R., Deelman, E., Dongarra, J., Karczewski, K. (eds) Parallel Processing and Applied Mathematics. PPAM 2019. Lecture Notes in Computer Science, vol 12043. Springer, Cham. https://doi.org/10.1007/978-3-030-43229-4_14

  • DOI: https://doi.org/10.1007/978-3-030-43229-4_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-43228-7

  • Online ISBN: 978-3-030-43229-4

  • eBook Packages: Computer Science, Computer Science (R0)
