DOI: 10.1145/2063384.2063479
research-article

Large scale plane wave pseudopotential density functional theory calculations on GPU clusters

Published: 12 November 2011

ABSTRACT

In this work, we present our implementation of density functional theory (DFT) plane wave pseudopotential (PWP) calculations on GPU clusters. This GPU version is based on a CPU DFT-PWP code, PEtot, which can calculate systems of up to a thousand atoms on thousands of processors. Our tests indicate that the GPU version can achieve a ~10 times speed-up over the CPU version. A detailed analysis of the speed-up and of the scaling with the number of CPU/GPU computing units (up to 256) is presented. The success of our speed-up relies on the adoption of a hybrid reciprocal-space and band-index parallelization scheme. As far as we know, this is the first GPU DFT-PWP code scalable to a large number of CPU/GPU computing units. We also outline future work and what is needed to further increase the computational speed by another factor of 10.

      • Published in

        SC '11: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
        November 2011
        866 pages
        ISBN: 9781450307710
        DOI: 10.1145/2063384

        Copyright © 2011 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        SC '11 Paper Acceptance Rate: 74 of 352 submissions, 21%
        Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%
