Abstract
In this paper, we propose an implementation of a parallel two-dimensional fast Fourier transform (FFT) using Intel Advanced Vector Extensions (AVX) instructions on multi-core processors. We vectorize the FFT kernels with AVX instructions and show that combining vectorization with a block two-dimensional FFT algorithm effectively improves performance. We report performance results for two-dimensional FFTs on multi-core processors: over 61 GFlops on an Intel Xeon E5-2670 (2.6 GHz, two CPUs, 16 cores) and over 24 GFlops on an Intel Core i7-3930K (3.2 GHz, one CPU, six cores) for a 2^12 × 2^12-point FFT.
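The abstract does not spell out the block algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a row-column two-dimensional FFT in which the column pass gathers a small block of columns into a contiguous work array before transforming them. The names `fft1d`, `fft2d_blocked`, and the `block` parameter are hypothetical and are not taken from the paper, and plain Python performs no AVX vectorization; the sketch only illustrates the data movement that blocking implies.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft2d_blocked(a, block=4):
    """2-D FFT of a list of rows; both dimensions must be powers of two.

    Hypothetical sketch: `block` is the number of columns gathered into
    the work array at a time (an assumed tuning parameter).
    """
    n1 = len(a)
    n2 = len(a[0])
    # Pass 1: transform every row (rows are already contiguous).
    out = [fft1d(row) for row in a]
    # Pass 2: transform the columns, one block of columns at a time.
    for j0 in range(0, n2, block):
        j1 = min(j0 + block, n2)
        # Gather the block, transposed, so each column becomes contiguous.
        work = [[out[i][j] for i in range(n1)] for j in range(j0, j1)]
        work = [fft1d(col) for col in work]
        # Scatter the transformed columns back into place.
        for jj, col in enumerate(work):
            for i in range(n1):
                out[i][j0 + jj] = col[i]
    return out
```

In a compiled implementation the work array would be sized to fit in cache, and the one-dimensional kernels would be the part that is vectorized; neither aspect is modeled here.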
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takahashi, D. (2012). An Implementation of Parallel 2-D FFT Using Intel AVX Instructions on Multi-core Processors. In: Xiang, Y., Stojmenovic, I., Apduhan, B.O., Wang, G., Nakano, K., Zomaya, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2012. Lecture Notes in Computer Science, vol 7440. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33065-0_21
DOI: https://doi.org/10.1007/978-3-642-33065-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33064-3
Online ISBN: 978-3-642-33065-0
eBook Packages: Computer Science, Computer Science (R0)