DOI: 10.1145/2159430.2159437
research-article

High performance 3-D FFT using multiple CUDA GPUs

Published: 03 March 2012

ABSTRACT

The fast Fourier transform (FFT) is one of the most important computations and is used in many kinds of applications. Although there are several works on single-GPU FFT, large-scale transforms whose data exceed the capacity of a single device's memory require multiple GPUs. We present a high-performance 3-D FFT that uses multiple GPU devices, both on a single node and across multiple nodes. By optimizing the data transfer between GPUs, our multi-GPU FFT outperforms the single-GPU FFT.
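The abstract does not say how the volume is divided among devices. As a purely illustrative sketch, the Python below assumes a slab decomposition, one common way to distribute a 3-D transform: each simulated device transforms its z-slab along x and y, the devices exchange data all-to-all, and each then transforms its share along z. The function names, the `[z][y][x]` layout, and the naive DFT standing in for a tuned GPU kernel are all assumptions made for this example, not the authors' implementation.

```python
import cmath

def dft_1d(values):
    """Naive O(n^2) DFT; a stand-in for a tuned per-device FFT kernel."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * cmath.pi * j * k / n)
                for j in range(n))
            for k in range(n)]

def fft_xy_on_slab(slab):
    """Transform a slab indexed [z][y][x] along x, then y; z is untouched."""
    out = []
    for plane in slab:
        rows = [dft_1d(row) for row in plane]               # along x
        cols = [dft_1d(list(col)) for col in zip(*rows)]    # along y, [x][y]
        out.append([list(row) for row in zip(*cols)])       # back to [y][x]
    return out

def multi_device_fft3d(volume, num_devices):
    """3-D DFT of volume[z][y][x] using a simulated slab decomposition."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    assert nz % num_devices == 0 and ny % num_devices == 0
    zs, ys = nz // num_devices, ny // num_devices

    # Step 1: device d owns z in [d*zs, (d+1)*zs) and transforms along x, y.
    slabs = [fft_xy_on_slab(volume[d * zs:(d + 1) * zs])
             for d in range(num_devices)]

    # Steps 2 and 3: device d gathers y in [d*ys, (d+1)*ys) from every slab
    # (the inter-device all-to-all exchange), then transforms along z.
    result = [[[0j] * nx for _ in range(ny)] for _ in range(nz)]
    for d in range(num_devices):
        for y in range(d * ys, (d + 1) * ys):
            for x in range(nx):
                pencil = [slabs[z // zs][z % zs][y][x] for z in range(nz)]
                for kz, value in enumerate(dft_1d(pencil)):
                    result[kz][y][x] = value
    return result

def dft_3d_reference(volume):
    """Direct triple-sum 3-D DFT, used only to check the decomposition."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    out = [[[0j] * nx for _ in range(ny)] for _ in range(nz)]
    for kz in range(nz):
        for ky in range(ny):
            for kx in range(nx):
                out[kz][ky][kx] = sum(
                    volume[z][y][x] * cmath.exp(
                        -2j * cmath.pi * (kz * z / nz + ky * y / ny + kx * x / nx))
                    for z in range(nz) for y in range(ny) for x in range(nx))
    return out
```

In this sketch the only step that reads data owned by another device is the gather of each z-pencil, which is where inter-GPU transfer cost would concentrate in a real implementation. Comparing `multi_device_fft3d(volume, 2)` against `dft_3d_reference(volume)` on a small volume (for example 4×4×4) is a quick way to confirm the decomposition reproduces the full transform.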


  • Published in

    GPGPU-5: Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units
    March 2012
    122 pages
    ISBN:9781450312332
    DOI:10.1145/2159430

    Copyright © 2012 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States




    Acceptance Rates

    Overall Acceptance Rate: 57 of 129 submissions, 44%
