A massively-parallel electronic-structure calculations based on real-space density functional theory
Introduction
First-principles calculations based on density functional theory (DFT) [1], [2] have been performed on a variety of materials and have provided important microscopic information on physical properties based on quantum theory [3], [4], [5]. Thus, among various theoretical methodologies, DFT is at present a top choice for clarifying and predicting phenomena in condensed matter. The popularity of DFT is due to its relatively low computational cost and its reasonable accuracy, which often favor it over elaborate and highly accurate quantum chemistry approaches such as the configuration-interaction [6] or diffusion Monte Carlo [7] approaches. Although there are a few reports of exceptionally large-scale DFT calculations [8], [9], the target systems to which DFT can be applied easily are still limited to medium-sized systems consisting of hundreds to one thousand atoms. Recent research interests in condensed matter and the material sciences require DFT calculations for much larger systems such as nanoscale systems. For instance, in semiconductor science, the typical size of metal–oxide–semiconductor field-effect transistors is on the nanometer scale. For such systems, quantum mechanical simulations are important for understanding device characteristics [10], [11]. Furthermore, in the life sciences, correlations between atomic structures and the bio-functions of in vivo proteins are hard to clarify without the help of simulations based on quantum theory [12], [13]. Since nanoscale systems consist of 10,000–100,000 atoms or more, addressing these subjects requires drastically larger DFT calculations, which cannot be carried out with traditional computational programs. Thus, several computational approaches have been investigated intensively.
One promising approach is the order-N method [14], in which the mathematical problems are reformulated so as to exploit the possibly localized nature of unitary-transformed wave functions or density matrices. Another approach, however, adopts the conventional order-$N^3$ exact formulation and achieves drastic improvements in the performance of the computer codes through restructuring and tuning for state-of-the-art computer architectures.
For the latter approach, the real-space finite-difference pseudopotential method proves to be a key ingredient because it is well suited to parallel computation. The real-space method for first-principles electronic-structure calculations was first proposed by Chelikowsky et al. in 1994 [15], and since then various developments and applications have been reported by many researchers [8], [16], [17], [18], [19], [20], [21]. Recently, the real-space method has been applied to unprecedentedly large Si nanocrystals by Zhou et al. [8]. In the real-space method, singular ionic potentials are replaced by smoother pseudopotentials [22], the Schrödinger-type quantum mechanical equations are discretized on a three-dimensional spatial grid, and the solutions are obtained by treating them as finite-difference (FD) equations. In principle, the matrix of the real-space formulation is sparse, and the fast Fourier transform (FFT) is unnecessary for Hamiltonian matrix operations. This FFT-free character contrasts with conventional plane-wave methods [23], [24] and provides a great advantage by easing the communication burden in parallel computations.
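The sparsity and FFT-free character described above can be illustrated with a minimal sketch. It applies the kinetic operator $-\tfrac{1}{2}\nabla^2$ to a function stored as a flat vector of grid values on a periodic cubic grid. The second-order seven-point stencil, the function name `apply_kinetic_fd`, and the grid size and box length are all assumptions made for illustration; they are not taken from the RSDFT code, and real-space codes commonly use higher-order stencils.

```python
import math


def apply_kinetic_fd(phi, n, h):
    """Apply -1/2 * Laplacian to phi on a periodic n x n x n grid of spacing h.

    phi is a flat list of grid values indexed as i = (ix * n + iy) * n + iz.
    A second-order seven-point stencil is used (illustrative assumption).
    """
    def idx(ix, iy, iz):
        # Periodic boundary conditions: wrap indices back into the cell.
        return ((ix % n) * n + (iy % n)) * n + (iz % n)

    out = [0.0] * (n ** 3)
    for ix in range(n):
        for iy in range(n):
            for iz in range(n):
                # Only the point itself and its six nearest neighbours
                # contribute, so each matrix row has seven non-zero entries.
                lap = (phi[idx(ix + 1, iy, iz)] + phi[idx(ix - 1, iy, iz)]
                       + phi[idx(ix, iy + 1, iz)] + phi[idx(ix, iy - 1, iz)]
                       + phi[idx(ix, iy, iz + 1)] + phi[idx(ix, iy, iz - 1)]
                       - 6.0 * phi[idx(ix, iy, iz)])
                out[idx(ix, iy, iz)] = -0.5 * lap / h ** 2
    return out


# Hypothetical test function: cos(k x) in a periodic box (atomic units).
n = 16                       # grid points per axis (assumed)
box = 10.0                   # box length (assumed)
h = box / n
k = 2.0 * math.pi / box      # smallest wave vector that fits the box
phi = [math.cos(k * ix * h)
       for ix in range(n) for iy in range(n) for iz in range(n)]

t_phi = apply_kinetic_fd(phi, n, h)
# Analytic result: -1/2 d^2/dx^2 cos(kx) = (k^2 / 2) cos(kx).
max_err = max(abs(t - 0.5 * k ** 2 * p) for t, p in zip(t_phi, phi))
```

Because every output value depends only on neighbouring grid points, a domain-decomposed parallel version would need to exchange only the boundary layers between neighbouring subdomains, whereas an FFT involves data from the whole grid.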
To date, the real-space method has handled systems of thousands of atoms within the order-$N^3$ exact formulation [8], [9]. The real-space method is also promising for much larger systems containing 10,000–100,000 atoms. It is therefore important to understand the computational details of the state-of-the-art real-space method, including its algorithms, implementation, and performance, for further development of first-principles calculations on next-generation supercomputers. In this paper, we present a detailed description of our real-space DFT (RSDFT) code, developed recently to overcome the size limitation of our computing system by utilizing the power of massively-parallel computing. We also investigate in detail the computational costs of the RSDFT code to clarify how to study large systems using next-generation supercomputers.
The algorithm employed in our code is rather conventional, so the computational costs scale as $O(N^3)$. However, there are several benefits to developing the code on the basis of the conventional algorithm. First, we already know its applicability, accuracy, and suitable choices of computational parameters such as cut-off energies and k-point sampling. Second, we are able to concentrate on particular problems in large-scale calculations, such as numerical precision, convergence behavior, and the reliability of the total-energy calculation itself. The present study focuses on the computational aspects of large-scale real-space calculations. Thus, the programming techniques developed for RSDFT are also applicable to other methods, e.g. order-N methods.
In Section 2, we briefly review the basics and fundamental equations of DFT. In Section 3, we present a formulation similar to that of Section 2, but in a discrete space for real-space FD calculations. Although Sections 2 and 3 are somewhat repetitious for DFT specialists, we include them for the following reason: future development of first-principles calculations will need the help of specialists in computer and computational sciences to bring out the best performance of supercomputers, and we aim to lower the barrier to entry into DFT calculations for non-DFT specialists. In Section 4, we summarize the computational parameters and several initial configurations for the parallel computation. In Section 5, we describe the overall algorithm of our RSDFT code and the details of several main subroutines. We introduce a new algorithm to accelerate operations in Gram–Schmidt orthogonalization and in subspace diagonalization. In Section 6, we present performance tests of our code and analyze the costs of computation and communication. In Section 7, we show several practical applications dealing with Si crystal, nanometer-scale Si quantum dots, and Si nanowires. Finally, we present a summary and conclusion in Section 8. Acronyms used in this paper are summarized in Table 1.
Density functional theory
The ultimate purpose of DFT calculations is to minimize an energy functional $E[\rho]$ with respect to the electron density $\rho(\mathbf{r})$. Following the standard Kohn–Sham DFT formalism [2], we introduce the orbitals $\phi_n(\mathbf{r})$, and assume that the electron density is expressed as a sum of the absolute squares of the orbitals,

$$\rho(\mathbf{r}) = \sum_{n=1}^{M} f_n \, |\phi_n(\mathbf{r})|^2, \tag{1}$$

and the minimization is performed with respect to the orbitals. In Eq. (1), $M$ is the total number of orbitals and $f_n$ is the occupation number for the $n$th orbital. The explicit
DFT on three-dimensional grid space
To numerically minimize the energy functional, we introduce a three-dimensional spatial grid and consider the minimization problem within the discrete space of grid points. Then the orbitals, density, and potentials are expressed as column vectors whose elements are the values at each grid point. For example, an orbital $\phi_n$ becomes a column vector $\boldsymbol{\phi}_n$, where the $i$th element is the value at the grid point $\mathbf{r}_i$:

$$(\boldsymbol{\phi}_n)_i = \phi_n(\mathbf{r}_i).$$
In this discrete scheme, the energy functional can be written as
Computational set up and parallelization
The systems for which we chose to perform DFT calculations may be classified into two categories: (1) a system with periodic boundary conditions for orbitals and (2) a system with decaying boundary conditions. Crystalline solids or supercell geometries are typical of the former, and molecules or clusters belong to the latter. For both systems, we first must set the number N, the position, and the species of each atom in the unit cell. Since we are usually interested in the behavior of
Details of algorithms
Minimization of the energy functional is two-fold. One minimization is with respect to the electron density or the orbitals, and the other minimization is with respect to the atomic coordinates. In our computational code, the minimization with respect to atomic coordinates contains the minimization with respect to the orbitals for fixed atomic coordinates as an internal procedure. This corresponds to the Born–Oppenheimer approximation or the adiabatic approximation. The most time-consuming
Performance tests
The predominant portion of the total computational time in RSDFT calculations results from the SCF iterations. Therefore, in this section we investigate in detail the costs of computations and communications for SCF iterations. To this end, we present several test calculations performed by the RSDFT code for periodic systems. The test systems are cubic Si crystals in the diamond structure, with sizes of 512, 1000, 1728, 2744, and 4096 atoms. Only the gamma point is sampled for the Brillouin
Practical applications
We begin this section by showing an example of the application of our RSDFT code to a crystalline Si system with 4096 atoms. The crystalline lattice has the diamond structure, with a lattice constant of 43.4 Å. The grid spacing is chosen to be 0.45 Å, which corresponds to the lattice constant divided into 96 segments. Thus, the total number of grid points is $96^3 = 884{,}736$. For the ionic potentials, we employ the norm-conserving pseudopotential [22] in the separable approximate form [27], and we
Summary
We have developed an RSDFT code suitable for massively-parallel computers, and have demonstrated that the code, which is based on the usual formulation, can treat large systems. The code was developed to perform the computations using matrix products to the extent possible, which substantially improves the performance of parts of the computations. We apply the code to systems of 10,000 Si atoms, and demonstrate that the self-consistent electronic structure can be obtained within a few
Acknowledgments
We acknowledge the members of the COMAS-DFT meeting at University of Tsukuba, especially Prof. M. Sato, Prof. T. Sakurai, and Prof. A. Ukawa, for fruitful discussions and comments on the development of the RSDFT code. Numerical calculations for the present work have been carried out under the Interdisciplinary Computational Science Program at the Center for Computational Sciences, University of Tsukuba.
References (50)
- HARES: an efficient method for first-principles electronic structure calculations of complex systems. Comput. Phys. Commun. (2001)
- Convergence acceleration of iterative sequences. The case of SCF iteration. Chem. Phys. Lett. (1980)
- State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems. J. Comput. Phys. (2008)
- Inhomogeneous electron gas. Phys. Rev. (1964)
- Self-consistent equations including exchange and correlation effects. Phys. Rev. (1965)
- The density functional formalism, its applications and prospects. Rev. Mod. Phys. (1989)
- Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients. Rev. Mod. Phys. (1992)
- Nobel lecture: electronic structure of matter – wave functions and density functionals. Rev. Mod. Phys. (1999)
- Canonical configurational interaction procedure. Rev. Mod. Phys. (1960)
- Quantum Monte Carlo simulations of solids. Rev. Mod. Phys. (2001)
- Parallel self-consistent calculations via Chebyshev-filtered subspace acceleration. Phys. Rev. E
- Size limits on doping phosphorus into silicon nanocrystals. Nano Lett.
- J. Appl. Phys.
- Chemical controllability of charge states of nitrogen-related defects in HfOxNy: first-principles calculations. Phys. Rev. B
- DFT modeling of biological systems. Phys. Status Solidi b
- Possible mechanism of proton transfer through peptide groups in the H-pathway of the bovine cytochrome c oxidase. J. Am. Chem. Soc.
- Linear scaling electronic structure methods. Rev. Mod. Phys.
- Higher-order finite-difference pseudopotential method – an application to diatomic-molecules. Phys. Rev. B
- Time-dependent local-density approximation in real time. Phys. Rev. B
- Real-space multigrid-based approach to large-scale electronic structure calculations. Phys. Rev. B
- Real-space mesh techniques in density functional theory. Rev. Mod. Phys.
- First-Principles Calculation in Real-Space Formalism, Electronic Configurations and Transport Properties of Nanostructures
- Large-scale density functional calculations on silicon divacancies. Phys. Rev. B
- Efficient pseudopotentials for plane-wave calculations. Phys. Rev. B
- First-principles study on energetics of c-BN(0 0 1) reconstructed surfaces. Phys. Rev. B