Generating Fast FFT Kernels on CPUs via FFT-Specific Intrinsics
Information & Contributors
Information
Published In
- General Chair: Maryam Mehri Dehnavi
- Program Chairs: Milind Kulkarni, Sriram Krishnamoorthy
Sponsors
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
Author Tags
Qualifiers
- Poster
Funding Sources
- National Natural Science Foundation of China
Bibliometrics & Citations
Bibliometrics
Article Metrics
- Total Citations: 0
- Total Downloads: 226
- Downloads (Last 12 months): 77
- Downloads (Last 6 weeks): 4