Error and timing analysis of multiple time-step integration methods for molecular dynamics

https://doi.org/10.1016/j.cpc.2006.10.005Get rights and content

Abstract

Molecular dynamics simulations of biomolecules performed using multiple time-step integration methods are hampered by resonance instabilities. We analyze the properties of a simple 1D linear system integrated with the symplectic reference system propagator MTS (r-RESPA) technique following earlier work by others. A closed form expression for the time step dependent Hamiltonian which corresponds to r-RESPA integration of the model is derived. This permits us to present an analytic formula for the dependence of the integration accuracy on short-range force cutoff range. A detailed analysis of the force decomposition for the standard Ewald summation method is then given as the Ewald method is a good candidate to achieve high scaling on modern massively parallel machines. We test the new analysis on a realistic system, a protein in water. Under Langevin dynamics with a weak friction coefficient (ζ=1ps−1) to maintain temperature control and using the SHAKE algorithm to freeze out high frequency vibrations, we show that the 5 fs resonance barrier present when all degrees of freedom are unconstrained is postponed to 12fs. An iso-error boundary with respect to the short-range cutoff range and multiple time step size agrees well with the analytical results which are valid due to dominance of the high frequency modes in determining integrator accuracy. Using r-RESPA to treat the long range interactions results in a 6× increase in efficiency for the decomposition described in the text.

Introduction

Classical molecular dynamics (MD) simulation studies have provided invaluable insight into the properties of biomolecular systems. Unfortunately, biomolecular simulations are computationally intensive due to the large number of particles and the complex interactions present. In a MD calculation, the trajectory of a group of interacting atoms is determined through the approximate solution of a Hamiltonian set of equations of motion based on an empirical potential function or force field. A typical force field contains bonded interactions through bond, angle, and torsion potentials, and non-bonded interactions through van der Waals and the electrostatic potential terms. The first four terms are short-range in nature, while the last term is long-range in nature.

The efficient computation of the long range interactions is recognized to be a challenging task in atomistic computer simulation studies. Intuitively, the long-range interactions can be calculated for each pair of particles in the system. However, the computing time complexity would then be on the order of O(N2). This naïve method becomes impossibly computationally intensive when periodic boundary conditions are applied and/or systems are large. Historically, long-range forces were spherically truncated in macromolecular simulations because of insufficient computer power. However, such a scheme has been shown to produce severe artifacts in the results of the simulation [39], [47], [48]. Thus the need for a correct and efficient treatment of long-range electrostatic forces for biomolecular systems was recognized although the required methodology was long applied to the long range force problem in other communities [33].

The various techniques developed to treat long-range electrostatic interactions [14], [47], can be broadly classified into three categories: standard Ewald summation [11], [25], [42], mesh-based Ewald methods [9], [19], and fast multipole methods [15]. The relative efficiency of the standard Ewald [O(N3/2)], the mesh-based Ewald [O(NlogN)] and the fast multipole [O(N)] approaches has been the subject of some discussion. Pollock and Glosli [26] concluded that both particle–particle particle–mesh (P3M) and fast multipole method (FMM) are more efficient than the standard Ewald summation for macroscopic systems but the P3M approach is significantly more efficient than FMM for any conceivable system size, despite the superior asymptotic scaling of the latter. On the other hand, parallel implementations of standard Ewald scale much better than P3M or PME, due to the difficulties in parallelizing the 3D-FFT beyond a few thousand processors. Hence, research [12] suggests that the Ewald summation might be superior to PME when the number of processors is greater than 1000, which is easily achieved using massively parallel computers such as Blue Gene/L and QCDOC. This paper will concentrate on analyzing Ewald summation due to our interest in large scale parallel architectures.

In order to perform MD simulation studies of proteins, it is necessary to generate trajectories by iterating the single time step numerical solver many times. In order to maintain stability at long times, solvers that preserve the symplectic property of Hamiltonian systems are employed. Nonetheless, the solvers require a time step small enough to describe the fast motions accurately if they are to achieve accurate integration of the equations of motion. Each integration step requires the re-evaluation of all the forces. Since the evaluation of the long-range forces dominates the computer time required to perform MD simulations, developing well defined ways to reduce the number of long range force evaluations is critical. The multiple time step (MTS) method was introduced to help eliminate unnecessary long range force evaluations. In such a scheme, the interactions to be evaluated are separated into two classes: those which are short-range and those which are long-range. The short-range interactions must be evaluated at each small time-step size, because they are strong; on the other hand, the long-range interactions can be computed at larger time-steps since they are relatively weak. The MTS integration method was proposed first for use in astronomy [18] and later for use in molecular dynamics simulations [40].

Extensive research has been performed on the MTS scheme [3], [13], [16], [17], [20], [21], [27], [28], [29], [36], [41], [43], [44], [45], [46]. There are basically two different types of MTS algorithms, i.e. the naïve (or constant force) MTS algorithm [1], [13] in which the long-range forces are held constant for certain time steps, which does not preserve the symplectic property of Hamiltonian mechanics, and the Verlet-I/r-RESPA algorithm [16], [20], [41], [43], [44], [49] in which the long-range forces are applied only at certain time steps, which does preserve it. Indeed, the Verlet-I/r-RESPA algorithm is widely applied because it does preserve the symplectic property which guarantees, in turn, that an alternative time step dependent Hamiltonian [35] is exactly preserved by the numeric integration. Thus, Verlet-I/r-RESPA samples a well defined phase space. Unfortunately, resonance phenomena limit the time step size under Verlet-I/r-RESPA. At certain time-steps correlated to internal motions (e.g., 5 fs, around half the period of the fastest bond stretch), significant inaccuracies or instabilities occur [3], [6], [7]. SHAKE [2], [24], [31], can be used to freeze out fast bond stretches and bends which allows the resonance point to be pushed out to about 12fs. More sophisticated algorithms such as MOLLY [38], Backward Euler Langevin dynamics [4], [5], [34], [37], [38] and the isokinetic Nosé–Hoover chain method [23] allow the stability limit to be further postponed or bypassed altogether.

In this paper, the properties of the r-RESPA MTS method are studied in detail. First, the resonance instabilities inherent in the MTS method are examined by solving, analytically, a linear system following earlier work by others. A formula that expresses the maxima number of “Jump steps” consistent with a given “simulation accuracy” as a function of “short range force accuracy” is presented. Here, the term Jump step is defined to be the ratio of the largest time step to the smallest time step employed in the multiple time step procedure; the term short range force accuracy simply refers to the degree to which the short range force mimics the total force; the term simulation accuracy refers to the ability of the simulation to yield correct results with a given error tolerance. Next, we estimate the work required to compute the long- and short-range forces under Ewald summation for two different force decompositions, reflecting our interest in the use of Ewald summation on massively parallel computers. Analytical formulae for the optimal short range force cutoff as a function of Jump steps are presented for the two decompositions of interest. Numerical computations on a protein in water are then used to determine the accuracy of the short force as a function of cutoff for the two decompositions. Comparing the results allows the relationship between the optimal cutoff at a fixed number of Jump steps and the maximum number of Jump steps consistent with good simulation accuracy to be assessed. That is, the highest frequency mode dominates accuracy and the 1d theory provides a good approximation of the complex many-body system. Numerical tests of these predictions/assumptions on a realistic system, a protein in water, are provided using SHAKE to freeze out high frequency motions and a weak Langevin coupling (friction coefficient of ζ=1ps−1) to maintain temperature control. The maximum number of Jump steps capable of maintaining simulation accuracy is presented as a function of short range force accuracy and an iso-error boundary for the method thereby defined. The MTS simulation efficiency is then given as a function of Jump step size for the protein along the iso-error boundary.

The paper is organized as follows: Section 2 reviews the r-RESPA/Verlet-I MTS method. Section 3 performs the stability analysis for the r-RESPA MTS scheme. Section 4 gives a case study of the MTS method under the Ewald summation technique reflecting our interest in large parallel supercomputers. In Section 5, we calibrate the iso-error boundary and speed-up of the MTS technique with respect to the short-range cutoff and the jump steps based on simulations of the HIV-1 Protease in water using Ewald summation and the force decompositions described within. Last, some conclusions are presented.

Section snippets

Multiple time-step methods

Symplectic MTS MD methods are important approaches that are commonly applied to simulate accurately and efficiently biomolecular systems. MTS methods are applied because of varying strength of the inter-particle forces inherent in these problems. Under MTS integration, the weaker long-range forces can be integrated with a fairly large time steps, while the stronger short-range forces can be integrated with smaller time steps. Since the long range forces are computationally intensive to evaluate

Stability analysis of MTS algorithms

In this section, the reversible reference system propagator algorithm (r-RESPA) [43], [44] is presented, followed by a stability analysis based on the usual assumption that the high frequency mode in complex systems dominates integrator accuracy.

The r-RESPA [43], [44] is based upon the Trotter expansion of the classical Liouville propagator. The Liouville operator L for a system of N degrees of freedom in Cartesian coordinates is defined asiL=[,H]=[x˙ixi+Fi(x)pi], where [,] is the

Multiple time-step integration with the Ewald summation: A case study

In this section, the use of the Ewald method in MTS integration is thoroughly analyzed. First, the Ewald technique which divides the Coulomb energy into real-space and reciprocal space parts controlled by the introduction of a parameter, αewd, is reviewed. Next, criteria for truncating the real space part of the sum and the reciprocal space part of the sum at constant precession/accuracy are given. These criteria allow the reciprocal space cutoff and αewd to be expressed in terms of the real

Multiple time-step numerical experiments and results

In order to test the timing and error analysis of the previous sections, we have performed biomolecular simulation experiments on the aqueous HIV-1 Protease Dimer using the MTS method. The Amber 8 simulation package and a δt=1fs basic step size was employed in the computations. A variety of real space cutoffs, R, and Jump steps, J, were tested under the simple splitting scheme of Section 4.3 that is implemented in Amber 8. The Ewald parameter, αewd, and k-space cutoff, K, was determined based

Conclusions

In this paper, we develop a simple expression for the dependence of trajectory accuracy on the accuracy of the short-range force under the r-RESPA MTS integration technique using an analytically solvable model. The analysis shows that although it may seem that computational efficiency may be enhanced by using very large MTS jumping steps, a surprisingly accurate short range force (large α and large cutoff) is required to ensure accurate and stable trajectories. This reduces efficiency even away

Acknowledgements

This work was supported by Brookhaven National Laboratory LDRD grants for molecular dynamics on ultrascalable supercomputers.

References (49)

  • E. Barth et al.

    Extrapolation versus impulse in multiple-timestepping schemes. II. Linear analysis and applications to Newtonian and Langevin dynamics

    J. Chem. Phys.

    (1998)
  • T.C. Bishop et al.

    Difficulties with multiple time stepping and fast multipole algorithm in molecular dynamics

    J. Comp. Chem.

    (1997)
  • S.A. Chin

    Dynamical multiple-time stepping methods for overcoming resonance instabilities

    J. Chem. Phys.

    (2004)
  • T.A. Darden et al.

    Particle mesh Ewald: An Nlog(N) method for Ewald sums in large systems

    J. Chem. Phys.

    (1993)
  • Y. Deng et al.

    Performance models on QCDOC for molecular dynamics with Coulomb potentials

    J. High Perform. Comp. Appl.

    (2004)
  • P.P. Ewald

    Die Berechnung optischer und elektrostatischer Gitterpotentiale

    Ann. Phys.

    (1921)
  • T. Forester et al.

    On multiple time-step algorithms and the Ewald sum

    Mol. Sim.

    (1994)
  • P. Gibbon et al.

    Long-range interactions in many-particle simulation

  • L. Greengard

    The Rapid Evaluation of Potential Fields in Particle Systems

    (1988)
  • H. Grubmüller et al.

    Generalized Verlet algorithm for efficient molecular dynamics simulations with long-range interactions

    Mol. Sim.

    (1991)
  • H. Grubmüller et al.

    Multiple time step algorithms for MD simulations of proteins: How good are they?

    J. Comp. Chem.

    (1998)
  • A. Hayli

    Le problème des N corps dans un champ extérieur application a l'évolution dynamique des amas ouverts—I

    Bull. Astronomique

    (1967)
  • R.W. Hockney et al.

    Computer Simulations Using Particles

    (1988)
  • D.D. Humphreys et al.

    A multiple-time-step molecular dynamics algorithm for macromolecules

    J. Phys. Chem.

    (1994)
  • Cited by (0)

    View full text