ABSTRACT
The Fast Fourier Transform (FFT) is a widely used algorithm in digital signal processing. The FFT computes the discrete Fourier transform (DFT) of a sequence, converting it from the time or spatial domain to the frequency domain. The DFT is useful in many signal processing applications, but computing it directly from the definition is too slow to be practical for large inputs. An FFT algorithm reduces the complexity from O(N^2) to O(N log N), where N is the number of input points.
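To make the O(N^2) cost concrete, the following is a minimal Python sketch of the DFT evaluated straight from its definition. It is an illustration only and not code from this work: the function name `dft_direct` is hypothetical, and the forward transform is assumed to use the common negative-exponent sign convention.

```python
import cmath


def dft_direct(x):
    """Evaluate X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N) from the definition.

    Assumption: forward transform with a negative exponent and no scaling.
    The two nested loops over N points give the O(N^2) cost.
    """
    n_points = len(x)
    result = []
    for k in range(n_points):          # one output bin per k
        acc = 0j
        for n in range(n_points):      # every input sample contributes to every bin
            acc += x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
        result.append(acc)
    return result


if __name__ == "__main__":
    # A unit impulse has a flat spectrum: every bin is (approximately) 1.
    print(dft_direct([1, 0, 0, 0]))
```

For N = 1024 the inner statement runs N^2 = 1,048,576 times, whereas N log2 N is only 10,240, which is the gap an FFT exploits.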
This work describes how to accelerate the FFT algorithm for Qualcomm’s Adreno graphics processing unit (GPU) using OpenCL. We discuss one-dimensional FFT implementations, including Cooley-Tukey, higher-radix, and mixed-radix variants.
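As a point of reference for the Cooley-Tukey approach named above, here is a minimal recursive radix-2 sketch in Python. It is not the OpenCL implementation described in this work: the function name `fft_radix2` is hypothetical, the input length is assumed to be a power of two, and the forward transform is assumed to use a negative exponent. Higher-radix and mixed-radix variants split the input into more than two sub-sequences per stage and are not shown.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time Cooley-Tukey FFT (illustrative sketch).

    Assumptions: len(x) is a power of two; forward transform, no scaling.
    """
    n = len(x)
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("input length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    # Split into even- and odd-indexed samples and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])

    half = n // 2
    out = [0j] * n
    for k in range(half):
        # Twiddle factor applied to the odd half, then one butterfly.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


if __name__ == "__main__":
    # A constant signal puts all of its energy in bin 0.
    print(fft_radix2([1, 1, 1, 1]))
```

Each level of recursion does O(N) butterfly work and there are log2 N levels, which is where the O(N log N) bound comes from. A GPU implementation would typically replace the recursion with iterative passes over the data, but that restructuring is outside what this sketch shows.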