
DFT Performance Prediction in FFTW

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5898)

Abstract

The Fastest Fourier Transform in the West (FFTW) is an adaptive FFT library that generates highly efficient Discrete Fourier Transform (DFT) implementations. It is one of the fastest FFT libraries available, and it outperforms many adaptive or hand-tuned DFT libraries. Its success relies largely on the huge search space spanned by several FFT algorithms and a set of compiler-generated C routines (called codelets) for small DFT sizes. FFTW finds the best algorithm empirically by measuring the performance of different algorithm combinations. Although empirical search works very well for FFTW, the search process does not explain why the best plan found performs best, and the search overhead grows polynomially as the DFT size increases. The alternative to empirical search is model-driven optimization. However, it is widely believed that model-driven optimization is inferior to empirical search and is particularly powerless to solve problems as complex as the optimization of DFT.
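To make the idea of empirical search concrete, the following is a minimal sketch, not FFTW's actual planner. It assumes a toy notion of "plan": an ordered factorization of the DFT size into small radices, where each radix stands in for a codelet. The names `CODELET_SIZES`, `enumerate_plans`, `execute_plan`, and `empirical_search` are hypothetical and chosen only for illustration. Every candidate plan is executed and timed, and the fastest one wins.

```python
import cmath
import time

# Hypothetical set of small DFT sizes for which a "codelet" exists.
CODELET_SIZES = (2, 3, 4, 5, 7, 8, 16)


def naive_dft(x):
    """Direct O(n^2) DFT; stands in for a leaf codelet."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def enumerate_plans(n):
    """Yield every ordered factorization of n into codelet sizes."""
    if n == 1:
        yield ()
        return
    for r in CODELET_SIZES:
        if n % r == 0:
            for rest in enumerate_plans(n // r):
                yield (r,) + rest


def execute_plan(x, plan):
    """Cooley-Tukey (decimation in time) following the radices in plan."""
    n = len(x)
    if len(plan) == 1:
        return naive_dft(x)  # leaf: plan[0] == n
    r = plan[0]
    m = n // r
    # r sub-DFTs of size m on the strided sub-sequences.
    subs = [execute_plan(x[s::r], plan[1:]) for s in range(r)]
    out = [0j] * n
    for k in range(m):
        # Apply twiddle factors, then a size-r DFT across the sub-results.
        t = [subs[s][k] * cmath.exp(-2j * cmath.pi * s * k / n) for s in range(r)]
        for q in range(r):
            out[k + q * m] = sum(
                t[s] * cmath.exp(-2j * cmath.pi * s * q / r) for s in range(r)
            )
    return out


def empirical_search(n, repeats=3):
    """Time every plan and return (fastest_plan, seconds), or None if no plan exists."""
    x = [complex(i % 7, -(i % 3)) for i in range(n)]
    best = None
    for plan in enumerate_plans(n):
        elapsed = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            execute_plan(x, plan)
            elapsed = min(elapsed, time.perf_counter() - start)
        if best is None or elapsed < best[1]:
            best = (plan, elapsed)
    return best
```

Calling `empirical_search(64)` returns whichever factorization ran fastest on the current machine. The sketch shows both drawbacks named above: every candidate must be run before a winner is known, and the timing alone says nothing about why that plan won.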

In this paper, we propose a model-driven DFT performance predictor that can replace the empirical search engine in FFTW. Our technique adapts to different architectures and automatically predicts the performance of DFT algorithms and codelets (including SIMD codelets). Our experiments show that this technique yields DFT implementations that achieve more than 95% of the performance of the original FFTW while using less than 5% of its search overhead on four test platforms. More importantly, our models give insight into why different combinations of DFT algorithms perform differently on a processor given its architectural features.
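The abstract does not spell out the paper's model, so the following is only a hedged sketch of what model-driven plan selection can look like in general. It assumes a toy plan representation (an ordered factorization of the DFT size into radices) and a made-up cost table: `CODELET_CYCLES` and `TWIDDLE_CYCLES` are hypothetical placeholder numbers, not measurements and not the authors' model. The point is that each plan is scored by a formula instead of being executed.

```python
# Hypothetical per-call cost of each small-DFT kernel, in cycles.
# Placeholder values; a real predictor would derive these per architecture.
CODELET_CYCLES = {2: 8, 4: 30, 8: 100, 16: 320}

# Hypothetical cost of one twiddle-factor multiplication, in cycles.
TWIDDLE_CYCLES = 6


def enumerate_plans(n):
    """Yield every ordered factorization of n into available codelet sizes."""
    if n == 1:
        yield ()
        return
    for r in sorted(CODELET_CYCLES):
        if n % r == 0:
            for rest in enumerate_plans(n // r):
                yield (r,) + rest


def predicted_cycles(n, plan):
    """Predict the cost of a plan without running it.

    Each stage with radix r performs n // r kernel calls of size r, and
    every stage except the last multiplies all n elements by twiddles.
    """
    kernel_cost = sum((n // r) * CODELET_CYCLES[r] for r in plan)
    twiddle_cost = (len(plan) - 1) * n * TWIDDLE_CYCLES
    return kernel_cost + twiddle_cost


def predict_best(n):
    """Return (plan, predicted_cycles) with the lowest predicted cost, or None."""
    plans = list(enumerate_plans(n))
    if not plans:
        return None
    best = min(plans, key=lambda p: predicted_cycles(n, p))
    return best, predicted_cycles(n, best)
```

Under these placeholder costs, `predict_best(64)` picks the plan `(8, 8)`. Unlike a timing run, the score can be broken down into its kernel and twiddle terms, which is the kind of explanation an empirical search cannot give.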




© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gu, L., Li, X. (2010). DFT Performance Prediction in FFTW. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_10


  • DOI: https://doi.org/10.1007/978-3-642-13374-9_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13373-2

  • Online ISBN: 978-3-642-13374-9

  • eBook Packages: Computer Science, Computer Science (R0)