Paper The following article is Open access

In operando active learning of interatomic interaction during large-scale simulations

and

Published 17 September 2020 © 2020 The Author(s). Published by IOP Publishing Ltd
, , Citation M Hodapp and A Shapeev 2020 Mach. Learn.: Sci. Technol. 1 045005 DOI 10.1088/2632-2153/aba373

2632-2153/1/4/045005

Abstract

A well-known drawback of state-of-the-art machine-learning interatomic potentials is their poor ability to extrapolate beyond the training domain. For small-scale problems with tens to hundreds of atoms this can be solved by using active learning which is able to select atomic configurations on which a potential attempts extrapolation and add them to the ab initio-computed training set. In this sense an active learning algorithm can be viewed as an on-the-fly interpolation of an ab initio model. For large-scale problems, possibly involving tens of thousands of atoms, this is not feasible because one cannot afford even a single density functional theory (DFT) computation with such a large number of atoms.

This work marks a new milestone toward fully automatic ab initio-accurate large-scale atomistic simulations. We develop an active learning algorithm that identifies local subregions of the simulation region where the potential extrapolates. Then the algorithm constructs periodic configurations out of these local, non-periodic subregions, sufficiently small to be computable with plane-wave DFT codes, in order to obtain accurate ab initio energies. We benchmark our algorithm on the problem of screw dislocation motion in bcc tungsten and show that our algorithm reaches ab initio accuracy, down to typical magnitudes of numerical noise in DFT codes. We show that our algorithm reproduces material properties such as core structure, Peierls barrier, and Peierls stress. This unleashes new capabilities for computational materials science toward applications which have currently been out of scope if approached solely by ab initio methods.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

It is now understood that the design of novel materials with exotic, unprecedented mechanical properties, specifically metallic alloys, requires a computer-assisted approach since many of the underlying mechanisms cannot be directly quantified by real experiments and are often not even observed. The state-of-the-art computing hardware and algorithms can simulate hundreds of atoms by direct atomistic simulations with density functional theory (DFT) which allows for an accurate prediction of ideal crystalline properties, such as phase stability or elastic moduli. However, on the macroscopic scale many critical material properties, such as yield strength or fracture toughness, are determined by the microstructure and, hence, require a multiscale approach. Presently, multiscale simulations are prohibitively expensive with DFT and, in lieu thereof, carried out with empirical interatomic potentials. However, existing empirical potentials can, at their best, only suggest a qualitative mechanism but remain inadequate for a quantitative understanding of multicomponent systems and chemical interactions.

A popular multiscale approach in the materials science community are quantum mechanics/molecular mechanics [15, QM/MM] methods, where only a small subset of the computational domain, usually in the vicinity of crystalline defects, is treated quantum-mechanically while the remainder is approximated by interatomic potentials. A newer class of methods are classical atomistic simulations using machine-learning interatomic potentials [6, MLIPs] which—in comparison with empirical potentials—admit a flexible and generic functional form allowing to solve any problem with arbitrary accuracy, at least in theory (cf [7]). This makes them a promising candidate for multiscale simulations since using an interatomic potential everywhere is orders of magnitudes faster than retaining a quantum mechanical model in parts of the computational domain.

Based on these premises, different MLIPs have been proposed in recent years, mainly differing in their representation. Recent review papers [8, 9] compare a number of interatomic potentials, namely the neural network potential [10, NNP], the Gaussian approximation potential [11, 12, GAP], the spectral neighbor analysis potential [13, SNAP] and the moment tensor potential [7, MTP]. Differences in the potential representation mainly influence the computational cost but all the above-mentioned MLIPs are able to produce (qualitatively) similar results. Several other representations and their variants have been proposed to date [1420], including related approaches in molecular modeling [21]. Hence, MLIPs can now be considered as sufficiently mature and, consequently, applications are rapidly emerging. The first notable contributions toward applications in computational metallurgy are the GAPs for bcc tungsten and iron by Szlachta [22], Dragoni [23], and the Al-Mg-Si NNP by Kobayashi et al [24]. It was shown by the authors that these potentials can reproduce crucial properties that govern the mechanical and thermodynamical behavior, e.g. vacancy formation energies, stacking fault energies or phonon spectra.

On the other hand, a major drawback of state-of-the-art MLIPs is their poor extrapolation beyond the known training configurations. Therefore, the most time-consuming step in developing these MLIPs is the construction of the training set. It takes months, or sometimes years, to manually construct all the needed configurations that would be representative of all configurations appearing in a simulation; this is done based on physical intuition and best practices (see the PhD theses [25, 26]). This approach might work well for simple problems, e.g. single-component materials, in which atomistic mechanisms of many phenomena are well-understood. However, a manual, 'human-developed' procedure will be too involved to effectively address more complex scenarios. For example, the GAP for iron [25] has indeed been shown to successfully reproduce crucial properties of screw [27] but not of edge dislocations [28].

A different approach is active learning [29]. In contrast to passive learning, where the training set is finalized before running a simulation, any configuration appearing in a simulation is allowed to become a part of the training set. In a nutshell, the idea is to estimate the magnitude of the error of the MLIP, its extrapolation grade, during the simulation and, should it exceed some threshold, recompute energies and forces (and stresses, if necessary) with the underlying ab initio model and retrain the potential including the newly acquired data. In the field of atomistic modeling, active learning has been initially proposed in the context of QM/MM methods by De Vita and Car, Csányi et al [3, 30]—to learn force fields of recurring atomic neighborhoods on-the-fly during the simulation—and continues to develop to date [16, 31, 32]. For MLIPs, an active learning method has been proposed by Behler [33] using a query-by-committee strategy (cf [29]). Another method based on D-optimality was proposed by Podryabinkin and Shapeev [34] who estimate the extrapolation grade by means of the growth in the determinant of the system matrix associated with the loss function. Their method has been successfully applied to predict molecular properties [35], diffusion processes in metals [36] and phase stability of (random) alloys [37, 38], alongside with a reduction of the necessary amount of DFT calculations by several orders of magnitude. However, despite these recent successes, applying active learning to larger simulations with thousand or more atoms requires a novel approach since DFT codes cannot efficiently handle thousands of atoms. Such 'larger simulations' are necessary to solve problems which involve extended defects with important properties affecting the material microstructure, e.g. dislocations, cracks or grain boundaries; typical properties are their mobility, possibly altered by interactions of the defect with impurities (or other types of defects), and their long-range behavior. In particular, the number of atoms rapidly exceeds the tens of thousands when considering intrinsically three-dimensional extended defects, such as kinked or jogged dislocations.

Building on [34], the main aim of the present work is thus to develop an active learning algorithm for large-scale problems containing extended defects, such as dislocations. The idea we pursue is as follows: from the large-scale computational domain we extract local configurations of 100–200 atoms, small enough to be feasible to compute with DFT, and measure the extrapolation grade therein. Should the extrapolation grade in one of them become too large, we automatically convert it to a periodic configuration maintaining the atomic neighborhoods from the large-scale problem, run a DFT calculation on this configuration and subsequently extend the training set.

While the algorithm, described in the previous paragraph, appears similar to the 'learning-on-the-fly QM/MM', it is novel in our understanding and does not share some of the drawbacks of QM/MM as will be made clear in the following. The distinguishing feature of our algorithm is the use of a MLIP in the entire computational domain—as opposed to the direct coupling between DFT and interatomic potentials in QM/MM—which allows us to compute energetic quantities directly, such as energy/activation barriers or defect formation energies. In QM/MM, energetic quantities are usually computed using indirect methods (e.g. [39]) since the quantum mechanical energy is a global quantity and cannot be computed for clusters of atoms. Nevertheless, it is still necessary that the periodic training configurations are not 'too' artificial to avoid degrading the accuracy of the MLIP. In section 3 we will present a new method to construct such periodic training configurations by symmetrizing the atomic positions so that atoms at the domain boundaries enjoy a neighborhood close those in the large-scale problem.

We test our methodology by predicting properties characterizing screw dislocation motion in bcc tungsten, i.e. we perform simulations in order to calculate the core structure, Peierls barrier and Peierls stress. Thereby, the focus is set on demonstrating that our training procedure ensures that the MLIP is able to (i) learn the right training configurations during these simulations and (ii) reliably predicts energy differences and forces.

2. Active learning algorithm

Let $\{{r}_i\}$ be a configuration of atoms subject to general (usually non-periodic) boundary conditions. We assume that the atomistic model is local in the sense that the motion of each atom $\boldsymbol{r}_i$ depends on its relative position $\boldsymbol{r}_{ij} = \boldsymbol{r}_j - \boldsymbol{r}_i$ with respect to all other atoms $\boldsymbol{r}_j$ within its neighborhood which usually extends to a few lattice spacings, i.e. up to third- or fourth nearest neighbors. The total energy $\boldsymbol{\Pi}$ then admits a partitioning into per-atom energies $\boldsymbol{\epsilon} = \boldsymbol{\epsilon}(\{{r}_{ij}\})$ leading to

Equation (1)

In this work we model the per-atom energies using MTPs [7, MTP] such that

Equation (2)

where the $\theta_\alpha$'s are the fitting parameters and $B_\alpha(\{{r}_{ij}\})$ the basis functions. For further details on the potential representation we refer to Appendix A.1.

In order to judge on the accuracy of the MTP, there exist a number of active learning algorithms [16, 20, 33, 40, 41] that are able to automatically assess whether $\{{r}_i\}$ is well-represented in the training set. If not, one could add $\{{r}_i\}$ to the training set after computing its DFT energy and gradients. However, if $\{{r}_i\}$ consists of 1000 or more atoms then this strategy is not feasible: DFT is very time-consuming and scales cubically with the total number of atoms.

In this work we develop an algorithm to learn interatomic interaction, as given by DFT, by automatically extracting small, representative configurations of 100–200 atoms with active learning and composing our training set from these small configurations. Our algorithm has two components:

  • (i)  
    detection of extrapolative neighborhoods,
  • (ii)  
    completion of such neighborhoods to small periodic configurations efficiently computable with plane-wave DFT codes.

Our algorithm of extracting neighborhoods is based on the D-optimality criterion [29] developed for MTPs in [34, 37]. In particular, we adapt the algorithm for detecting per-neighborhood extrapolation. In such an algorithm we determine the extrapolation grade $\gamma = \gamma(\{{r}_{ij}^\ast\})$ for a neighborhood $\{{r}_{ij}^\ast\}$ of the ith atom in a configuration based on how far the descriptors, $B_\alpha(\{{r}_{ij}^\ast\})$, are from those of the training set. That is, we imagine the existence of per-atom energies $\boldsymbol{\epsilon}^{\rm qm}$ and fitting the parameters $\underline{\theta}$ to them:

Equation (3)

Since equation (3) would be overdetermined, we select an m×m submatrix $\underline{\underline{A}}$ of the system matrix associated with equation (3). The m rows in $\underline{\underline{A}}$, representing a neighborhood $\{{r}_{ij}\}$ from the training set, are chosen to maximize $|\textrm{det}\underline{\underline{A}}|$ using the maxvol algorithm [42]. The extrapolation grade is then computed by solving the linear system (cf [34], section 3.3)

Equation (4)

To be more precise, we calculate the extrapolation grade for nonlinearly parameterized potentials; this calculation is a natural generalization of the linear case (the one with basis functions $B_\alpha(\{{r}_{ij}\})$) developed in [37] and outlined in Appendix A.2. This grade allows us to formulate a practically universal criterion by using a threshold for permissible extrapolation.

If the size of our atomistic configuration $\{{r}_i\}$ was ≈ 100–200 atoms, then our job would be done—we add this configuration to the training set together with the DFT-computed energy, forces, and stresses and continue (or restart) the simulation. Again, this is not feasible with larger systems because collecting DFT data scales cubically with the system size. Nevertheless, if the extrapolative neighborhoods $\{{r}_{ij}^\ast\}$ appear in subsets ${\{{r}_i^\ast\}} \subseteq \{{r}_i\}$ of 100–200 atoms which are sufficiently far from each other, we can construct several training configurations which, together, indeed capture all extrapolative neighborhoods. The justification of this approach is evident if the boundary atoms of a configuration ${\{{r}_i^\ast\}}$ have neighborhoods which are close to the ones they would have in the corresponding training configuration obtained by periodically replicating ${\{{r}_i^\ast\}}$. This is of course not the case if ${\{{r}_i^\ast\}}$ contains extended defects, such as dislocations, as illustrated in figure 1. Therefore, running a DFT calculation on ${\{{r}_i^\ast\}}$ after naively applying periodic boundary conditions would critically degrade the accuracy of the MTP by training it on many artificial neighborhoods that occur at the boundary (guide the eye to the coloring of the atoms).

Figure 1.

Figure 1. Schematic illustration of the individual steps during an atomistic simulation for screw dislocation motion using the active learning algorithm. [S1] The simulation runs until active learning detects inadmissible extrapolation in ${\{{r}_i^\ast\}}$ containing the dislocation core atoms (assuming that the MLIP has been already trained sufficiently well on the ideal crystal structure). [S2] Extraction of the extrapolative ${\{{r}_i^\ast\}}$. [S3] Construction of periodic configurations which capture the extrapolative neighborhoods of ${\{{r}_i^\ast\}}$ but removes the artificial ones at the boundary in order to accurately compute the dislocation core energy. [S4] The potential is retrained on the updated training set. The coloring of the atoms is due to the centrosymmetry parameter [43, CSP] in order visualize their deviation from the bulk crystal configuration.

Standard image High-resolution image

Our novel contribution here is the extension of active learning to precisely such cases by combining the extrapolation detection with the second component, the construction of periodic training configurations which capture the extrapolative neighborhoods $\{{r}_{ij}^\ast\}$ appearing in ${\{{r}_i^\ast\}}$. From figure 1 it can be seen that this is a more involved endeavor since the number of atoms as well as the shape of this periodic configuration differ from ${\{{r}_i^\ast\}}$. We postpone the details of this construction to the next section and first describe the full active learning algorithm.

During an atomistic simulation we may frequently encounter extrapolative configurations. Instead of retraining the potential immediately after detecting any extrapolation, we propose a greedy strategy that runs the simulation for a certain number of iterations 1 and subsequently selects a minimal set of configurations which must be added to the training set in order to ensure that γ remains below some threshold. In this way we allow for training configurations further outside the training set which potentially interpolate on otherwise selected configurations using an iteration-wise scheme. The individual steps of the full algorithm, which is schematically depicted in figure 1, are described below:


Figure 2.

Figure 2. Schematic illustration of the individual steps (1)–(4) for constructing periodic training configurations for a screw dislocation (a) in the initial easy-core position and (b) shifted along the glide plane by a few lattice spacings with respect to (a). (1) Infinite configuration containing a single dislocation. (2) Periodic configuration after applying the displacement equation (5) to the reference lattice. (3) & (4) Extraction of the dipole-like training configuration. Coloring of the atoms is due to the centrosymmetry parameter [43, CSP].

Standard image High-resolution image

The steps [S1][S4] are then repeated until convergence or the maximum number of iterations is attained.

3. Construction of periodic training configurations

3.1. Existing methods within the scope of QM/MM

We now turn to the question of how to construct an appropriate training set from configurations ${\{{r}_i^\ast\}}$ in which the extrapolation grade exceeds a certain threshold. One of the currently most accurate and efficient ways to compute quantum mechanical quantities, such as energies, forces and stresses, appears to be plane-wave, i.e. periodic, DFT (cf e.g. [44]). However, if we simply apply periodic boundary conditions to ${\{{r}_i^\ast\}}$, the neighborhoods of atoms at the periodic domain boundaries would differ from those in the large-scale problem. This is in particular severe for 'higher-than-zero-dimensional defects', e.g. line (dislocations), surface (grain boundaries) or volume (cracks, voids) defects. The problem is thus similar to hybrid quantum/classical mechanics (QM/MM) methods, where a small, but arbitrarily-shaped, cluster of atoms, treated with an ab initio model, is embedded in a much larger domain computed with interatomic potentials. To consider such clusters of atoms using plane-wave DFT, Woodward and Rao [5] developed two methods in the context of screw dislocations:

  • (i)  
    In the first method, a vacuum region was introduced to separate the QM cluster from its periodic images. To prevent spurious free surface effects on cluster atoms, an additional buffer region replicating their true neighborhood was used in-between.
  • (ii)  
    In the second method, the buffer region from method (i) was extended to the periodic domain boundaries to avoid the vacuum region.

Method (ii) allows for a smaller cell size in comparison with method (i)—at the expense of a larger buffer region. However, neither method seems to be preferable over the other in terms of accuracy and computational cost since both have been equally successfully employed in many QM/MM studies (e.g. [1, 28, 45, 46]). In principle, they therefore also apply to our algorithm, however, none of them allows one to compute the quantum mechanical energy (of the cluster) which is a global quantity corresponding to the entire periodic cell 2 .

3.2. Symmetrization of extrapolative configurations

Fortunately, in the case of MLIPs, we do not stringently require exact total energies—it suffices that the MLIP is able to interpolate on the large-scale configurational space. Nevertheless, it is still necessary to prevent free surfaces and/or other types of spurious boundary effects to avoid training on artificial neighborhoods irrelevant to the large-scale problem. We therefore present an improved version of method (ii) which symmetrizes the atomic displacements at the periodic domain boundaries to ensure that the neighborhood of the atoms outside the dislocation core is close to the one they would 'feel' in the large-scale problem.

The idea is to extract the displacements $\boldsymbol{\tilde{u}}_i(\boldsymbol{r}_{0,i}) = \boldsymbol{r}_i - \boldsymbol{r}_{0,i}$, where $\boldsymbol{r}_{0,i}$ is the ideal lattice site of the ith atom, around the dislocation core and apply it to a periodic configuration, small enough to be computable by DFT but without introducing atomic neighborhoods different from those in the large-scale configuration. To that end, we proceed as illustrated in figure 2 (a)–(1) and define the rectangular region of size $l_{\rm x} \times l_{\rm y}$, containing the dislocation core atoms ${\{{r}_i^\ast\}}$ which we check on extrapolation. Next, we define three replicas of this rectangular region and translate them according to figure 2 (a)–(2) by $l_{\rm x}$ with respect to the x-axis and/or $l_{\rm y}$ with respect to the y-axis.

Assuming that the dislocation line is parallel to the z-axis, we then define the mirrored displacement field

Equation (5)

where $\{{r}_{0,i}^1 \}$$\{{r}_{0,i}^3 \}$ are the ideal lattice sites in the translated regions. Imposing this displacement gives the configurations $\{{r}_i^1 \}$$\{{r}_i^3 \}$ shown in figure 2 (a)-(2). The union of ${\{{r}_i^\ast\}}$ and $\{{r}_i^1 \}$$\{{r}_i^3 \}$ is clearly periodic and from the figure it can be seen that this procedure creates minimal spurious effects in the sense that the atomic neighborhoods are not 'too far' from the large-scale problem, indicated by the centrosymmetry parameter which is less than 0.04 in the vicinity of the periodic domain boundaries—which is more than two orders of magnitude smaller than for the core atoms. This configuration appears similar to the well-known quadrupolar cell (cf e.g. [47, 48]), yet it is emphasized that our configuration fully includes the elastic far-field contribution due to external boundary conditions intrinsic to the large-scale problem.

We may further reduce this cell to the dipole-like configuration framed by the continuous line in figure 2 (a)-(3) along similar lines as for the quadrupolar cells if the displacement is point symmetric with respect to the dislocation positions. To see why this choice is indeed justified we refer to our analysis in Appendix B.

It should be noted that this construction applies independently of the position of the dislocation core as shown in figure 2 (b). That is, if the dislocation starts moving, e.g. due to some far-field applied stress, we detect the dislocation core position and proceed in the same way as above by shifting the rectangular region accordingly.

Even though the proposed symmetrization technique ensures that no artificial neighborhoods occur in the training configuration, we would like to point out that there is potential freedom in modifying the positions of the boundary atoms: if, e.g. the symmetry of the displacement field breaks due to additional long-range effects, one may introduce an intermediate step which minimizes the centrosymmetry parameter, or, alternatively, the extrapolation grade, of atoms residing in some boundary layer.

Important remark. We find it important to emphasize that, despite the fact that the dislocation lines are extremely (some may say 'unphysically') close to each other in the training set, the fitted model is still expected to (and does in fact, as we will see in next section) give accurate predictions. The reason for such training sets being representative of configurations with normal, 'physical' distance between dislocations rests on the main assumption behind the success of machine-learning potentials: that the perturbations to electronic degrees of freedom decrease fast and therefore the electronic interaction can be well-described by the relative positions of atoms in a small neighborhood (like 5 Å). Hence, if the atomic neighborhoods in the training set are representative of those in a large-scale simulation, then the potential is expected to give accurate predictions. The difference in the long-range elastic energy between training and actual configurations will be resolved by the atomistic deformation which is explicitly represented in the problem of interest.

In particular, total energies and forces of the small periodic training configurations are not used to compute any physical quantities (defect formation energies, energy barriers etc). On the contrary, they are computed in the large-scale simulation in which we are free to choose any boundary condition (e.g. flexible boundary conditions [5, 4951]). Hence, there is no need for the application of correction methods (e.g. [52, 53]) employed when operating solely on small, fully periodic DFT cells due to the interaction of defects with their periodic images through elastic fields . This is another advantage of our method.

4. Computational results

We now test our active learning algorithm for $1/2\,\langle 111 \rangle$ screw dislocation motion in bcc tungsten in order to predict the core structure, the Peierls barrier and the Peierls stress. Our large-scale problem is a cylindrical configuration of atoms and we choose the region where we check extrapolation of the MTP as the collection of atoms in the rectangular region with dimensions $l_{\rm x} = 5 \sqrt{3}\boldsymbol{a_0}/\sqrt{2}$ and $l_{\rm x} = 9 \boldsymbol{a_0}/\sqrt{2}$, where $\boldsymbol{a_0}$ denotes the lattice constant, centered at the dislocation core (cf figure 2 (1)). From this extrapolation region we construct the training set, as proposed in the previous section, leading to a total number of 135 atoms in the periodic configurations (i.e. in $\{{r}_i^{\rm tr} \}$, cf figure 2 (4)). As an ab initio model, we employ DFT by means of the Vienna ab initio simulation package [54, VASP,] which uses plane-wave basis sets and the projector-augmented wave (PAW) pseudopotential method [55, 56]. For all DFT calculations we have used the simulation parameters presented in C.

For our large-scale atomistic simulations we use non-linear MTPs as described in Appendix A.2. Since we focus on applications involving geometry optimization and barrier calculations we fit the MTPs to energies and forces according to the procedure described in Appendix A.3. The parameter N, that is, the number of iterations we perform before we retrain the MTPs (see the algorithm in section 2), is of the order of 30 for all our simulations.

The size of the cylindrical configuration is ≈ 5700 atoms for the core relaxation and the Peierls stress calculation in sections 4.1 and 4.3, respectively. For the barrier calculations in section 4.2 the size is ≈ 1400 atoms. We compare the results obtained by using the cylindrical configuration to the results obtained by using the classical 135-atom dipole configuration (e.g. [57]). In the following we therefore refer to them as the cylinder and the periodic problem, respectively.

For validation purposes, we have also trained the MTP with respect to two empirical interatomic potentials based on the embedded atom method (EAM), namely the EAM3 and EAM4 potentials developed in [58] 3 . For clarity, we therefore refer to the MTP trained with respect to DFT as MTP-DFT and the MTPs trained with respect to the EAM potentials as MTP-EAM3 and MTP-EAM4, respectively. For compactness, we only present a detailed discussion of the results for the large-scale cylinder problem using the MTP-DFT, which is our main contribution, and refer to the results of the validation (such as training errors, etc) for the MTP-EAM3 and the MTP-EAM4, and also the MTP-DFT for the periodic problems, to Appendix D.

In our test problems we consider solely structural relaxation using the fast inertial relaxation engine [59, FIRE] as implemented in the atomic simulation environment [60, ASE] library. The active learning algorithm is terminated when the maximum force on an atom is less than 0.001 eV/Å; except for the barrier calculations for which we use 0.01 eV/Å.

4.1. Core relaxation

For the dislocation core relaxation we set the bounds $\gamma_{\rm {min}}$ and $\gamma_{\rm \textrm{max}}$, within which we consider configurations as potential candidates for the training set, to 1.1 and 10, respectively. This choice is close to the optimum (cf [34]) and should therefore yield the best possible accuracy that one could achieve with an MTP in practice. Using this parameter set, the active learning algorithm converged after 63 iterations with a training set size of 40 configurations. The training errors reported in table 1 are in the range of the numerical approximation of DFT codes themselves. Therefore, the MTP essentially reproduces the training data without introducing further errors.

Table 1. Training errors corresponding to the core relaxation for the MTP-DFT. The errors are of the same order than the numerical precision of our (already conservative) DFT calculation (cf Appendix C) and, consequently, the MTP-DFT is able to reproduce the DFT model practically without error.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference8.17 × 10−4 6.052 × 10−6 0.028 4
Average absolute difference4.069 × 10−4 3.015 × 10−6 0.011 2
RMS4.739 × 10−4 3.51 × 10−6 0.012 1
Max(ForceDiff) / Max(Force)  0.028 4
RMS(ForceDiff) / RMS(Force)  0.049 4

The differential displacement map for the relaxed core structure computed with the MTP is shown in figure 3. It can immediately be seen that our algorithm predicts the compact core structure from previous established DFT studies (cf e.g. [48, 57]). To quantify this result, we show the energy difference between the initial linear elastic and the final relaxed configuration in table 2. This is of course not feasible to compute with a reference DFT calculation so that we compare our result to the 135-atom periodic configuration. The energy difference is ≈ 2× larger for the periodic problem, which is expected since there are two dislocations in this system. In addition, we have computed the energy difference for the screw-cylinder configuration using the EAM4 potential as a reference model—where we can compute the exact solution. Here, the error of the MTP-EAM4 is ≈ 2% which confirms the excellent agreement between both models and provides further evidence that our algorithm reliably extracts the core energy; note that an error of 1–2 % also occurs for the periodic problem (compare the EAM4 and MTP-EAM4 values in table 2).

Figure 3.

Figure 3. Differential displacement map around the dislocation core after relaxation for the MTP-DFT which displays the compact structure in agreement with DFT.

Standard image High-resolution image

Table 2. Energy differences (in eV per Burgers vector) between the initial and final (relaxed) configurations. The MTP-DFT exactly reproduces the DFT value for the periodic problem while the small discrepancy between the MTP-EAM4 and the EAM4 is likely due to the nonsmooth potential energy surface (cf figure 4. Since this discrepancy is the same for both the periodic and the cylinder problem, and, moreover, the fitting errors are in the range of DFT precision (cf table 1), the MTP-DFT presumably also exactly reproduces the (infeasible) cylinder DFT value.

 DFTMTP-DFTEAM4MTP-EAM4
Periodic−0.298−0.298−0.822−0.813
Cylindern/a−0.166−0.415−0.407

It is interesting to note that it was 'easier' for the MTP to reproduce DFT than EAM4—the difference between MTP and DFT is less than 1 meV per Burgers vector, while the difference between MTP and EAM4 is of the order of 10 meV per Burgers vector. Speculatively, we attribute it to the fact that the MTP has a flexible functional form which makes it easy to reproduce the smooth underlying DFT energy (at least, when the configurational space is not too large), whereas fitting an EAM potential introduces an oscillatory behavior commensurate with atomic shell radii in the pair potential function as shown in figure 4. Such a nonsmooth behavior appears to be more difficult to fit with a smooth MTP radial basis than fitting the DFT energy.

Figure 4.

Figure 4. Potential functions of the EAM4 per-atom energy functional $\boldsymbol{\epsilon}(\{{r}_{ij}\}) = \sum_{\boldsymbol{r}_{ij}} \phi(|\boldsymbol{r}_{ij}|) + F(\sum_{\boldsymbol{r}_{ij}} \rho(|\boldsymbol{r}_{ij}|))$. The ranges of the function arguments are adapted to the atomic neighborhoods occurring during the simulation.

Standard image High-resolution image

4.2. Peierls barrier

Using the relaxed configuration from the previous section, we now compute the Peierls barrier for screw dislocation motion under zero stress by calculating the minimum energy path using the nudged elastic band [62, NEB] method. We create the final image by translating the displacement field of the relaxed core structure by $\sqrt{2/3}\,\boldsymbol{a_0}$ along the $\langle 112 \rangle$ direction, i.e. to the next easy-core position on the {110} plane. To create the initial guess for the intermediate images, we use the same displacements according to their reaction coordinate.

For this problem we have chosen the extrapolation bounds $\gamma_{\rm {min}} = 2$ and $\gamma_{\rm \textrm{max}} = 10$. Using an NEB with 5 images, the active learning algorithm converged after 38 iterations with a training set size of 22 configurations. Therefore, our algorithm only required a full DFT calculation on the 135-atom symmetrized core configuration (cf figure 2 (4)) roughly every 9th iteration. The training errors in table 3 are slightly worse than for the core relaxation, which is expected since the training set is now more diverse due to the intermediate core configurations. However, a maximum error in the total energy of 5.6 meV (per configuration) is still a very small quantity since the Peierls barrier is more than an order of magnitude higher (cf table 4).

Table 3. Training errors corresponding to the NEB calculation for the MTP-DFT. The maximum error in the energy is of the order of a few meV which confirms that the training set consisting of the symmetrized configurations are suitable for representing intermediate core structures (cf figure 2 (b)).

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.005 64.152 × 10−5 0.075 6
Average absolute difference0.00 2331.728 × 10−5 0.017 6
RMS0.00 2742.029 × 10−5 0.019 9
Max(ForceDiff) / Max(Force)  0.078 6
RMS(ForceDiff) / RMS(Force)  0.064 7

Table 4. Peierls energies in eV per Burgers vector corresponding to the energy curves in figure 5 (a). The small difference of ≈ 1 meV per Burgers vector and less between the available MTP-DFT/MTP-EAM4 and DFT/EAM4 values coincides with the training errors (see Appendix D.2) and is thus also expected between the MTP-DFT and DFT for the cylinder problem (cf table 3).

 DFT [63]MTP-DFTEAM4MTP-EAM4
periodic0.092 10.090 50.058 90.057 0
cylindern/a0.088 00.062 70.062 9

The corresponding energy curve is shown in figure 5 (a). Again, since a reference DFT calculation is unfeasible, we have compared our result to the one obtained from a periodic calculation using solely DFT. According to the convergence study by Grigorev et al [63], the difference between a periodic configuration of 135 atoms and an effectively infinite configuration should be in the range of only a few percent. From table 4, it follows that the difference is ≈ 4% and thus in agreement with their analysis. For testing purposes, we have also carried out calculations with the EAM4 potential as our reference model, as in the previous section. Here, the energy difference between the periodic and the cylinder configuration is ≈ 6%. Moreover, the Peierls energies for the EAM4 and the MTP-EAM4 coincide almost perfectly which further validates our training methodology. In addition, the differential displacement map for the relaxed central image is shown in figure 5 (b) which demonstrates that the MTP-DFT also correctly predicts the same intermediate configuration observed in our and recent DFT studies.

Figure 5.

Figure 5. (a) Energy curves for screw dislocation motion. The deviations of all MTP-X curves from the exact values are tiny (of the order of 1 meV per Burgers vector!) and, therefore, DFT as well as EAM4 are very-well reproduced by the MTP. Note that the small differences between the curves for the cylinder and periodic problems are expected since the elastic interaction between the dislocation cores largely cancels out due to the quadrupole arrangement in the periodic case (cf [63]). (b) Differential displacement map around the dislocation core of the central image for the MTP-DFT which shows the favorable intermediate configuration also predicted by DFT.

Standard image High-resolution image

4.3. Peierls stress

Finally, we test our algorithm for the screw-cylinder problem subject to a far-field applied stress in order to predict the Peierls stress, that is, the stress under which the dislocation starts moving. To this end, we apply a deformation corresponding to the shear stress $\boldsymbol{\tau}_{\rm yz}$ to all atoms and subsequently relax the system. During relaxation, we detect the position of the dislocation core and update the extrapolation region accordingly (cf figure 2 (b)). For this problem, we have used $\gamma_{\rm \textrm{min}} = 2$ and $\gamma_{\rm {max}} = 10$ as for the barrier calculations.

We have applied an initial stress of ≈ 1.74 GPa and increased it in 0.1 GPa-steps upon convergence. The dislocation then moved to the next easy-core position when the stress was 2.34 GPa. Hence, we define the obtained Peierls stress as 2.29± 0.05 GPa. This value is in the same range as previous DFT studies as shown in table 5. Nevertheless, we would like to remind the reader that these previous studies use indirect methods; e.g. the method in [64] is based on fitting a Kocks-type function using the Peierls barriers from several periodic calculations subject to different applied stresses as fitting parameters. They are therefore computationally significantly more expensive, require prior knowledge of the slip plane and are also prone to errors due to the analytic fit.

Table 5. Peierls stress (in GPa) for W predicted by DFT and the interatomic potentials. The MTP-DFT value lies well within the range of all considered previous DFT studies. It is remarked that, given the error in the Peierls stress for the MTP-EAM3 (which has similar training errors compared to the MTP-DFT) with respect to the EAM3 potential which is at most 8 %, the anticipated exact DFT value (for our parameter set) lies between 2 and 2.5 GPa.

DFT [65]DFT [48]DFT [64]MTP-DFTEAM3MTP-EAM3
2.5–2.81.7122.29± 0.050.97± 0.0050.904± 0.005

In order to judge on the accuracy of our method, we compute the Peierls stress with the EAM3 potential as the reference model which allows us to perform an exact large-scale calculation 4 . Following the same procedure as described above, except that we now increment the applied stress by 0.01 GPa, the predicted Peierls stress for the MTP-EAM3 agrees well with our large-scale reference calculation showing an error of only 6%–8% (cf table 5). Moreover, this corresponds to the relative fitting errors for the MTP-EAM3 shown in table D10. Given that the training errors for the MTP-EAM3 and the MTP-DFT (table 6) are in the same range, it is likely that the DFT Peierls stress can also be predicted with the MTP-DFT up to an error of 6–8%. Should we need a higher accuracy, we may simply increase the number of basis functions of the MTP but note as well that errors up to 10% are absolutely tolerable when the Peierls stress is used to calibrate a higher-scale model, such as discrete dislocation dynamics.

Table 6. Training errors corresponding to the Peierls stress calculation for the MTP-DFT. The errors are not significantly higher than for the barrier calculations (cf table 3) although the training data now consists of a more diverse set of configurations subject to various applied stresses within the range 1.74–2.34 GPa. It is therefore expected that the MTP-DFT also predicts the core energies for strained configurations without any significant loss of accuracy when compared to the results in sections 4.1 and 4.2.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.023 31.723 × 10−4 0.133
Average absolute difference0.00 8065.969 × 10−5 0.017 7
RMS0.00 9557.073 × 10−5 0.019 8
Max(ForceDiff) / Max(Force)  0.123
RMS(ForceDiff) / RMS(Force)  0.064 4

To validate the correct functioning and demonstrate the capabilities of the active learning algorithm, we show the evolution of the dislocation position (with respect to the origin) and the training set size as a function of the iteration in figure 6 (a). Having crossed the first Peierls barrier after ≈ 550 iterations, the training set contained 20 configurations, increased to 22 while crossing the second Peierls barrier, and then continued gliding without triggering new DFT calculations. This behavior is expected since the dislocation moves by alternating from ① easy-core ② intermediate-core ③ easy-core $\rightarrow$ ... etc as shown in figure 6 (b) and does therefore not encounter any significantly new, i.e. extrapolative, configurations.

Figure 6.

Figure 6. (a) Dislocation position in $d_{\rm p} = \sqrt{2/3}\,a_0$, where $d_{\rm p}$ is the distance between two easy-core positions, and training set size as functions of the iteration for the Peierls stress calculation from section 4.3 using the MTP-DFT. The figure demonstrates that the active learning algorithm works properly by showing that no further configurations are added to the training set after the dislocation has crossed the first Peierls valley after ≈ 700  iterations (and henceforth only encounters 'known' neighborhoods). (b) Visualization of the dislocation core structure using a differential displacement map at selected iterations from (a), i.e. ① before, ② during and ③ after crossing the first Peierls barrier; thereby, the dislocation core structure correctly changes from easy to intermediate, and back again to easy, without admitting any other configuration (which also explains the small training set size of 22 configurations shown in (a)).

Standard image High-resolution image

5. Concluding remarks

5.1. Summary and discussion

We have developed an active learning algorithm for large-scale atomistic simulations using MTPs, a class of MLIPs. Our algorithm allows for a novel way of performing large-scale simulations with DFT accuracy fully automatically by letting active learning find the right training data during the simulations—without the need for any prior (passive) training. We have applied the algorithm to model screw dislocation motion in bcc tungsten and shown that the MTP is able to reproduce known mechanically relevant properties from the literature, such as core structure, transition state barriers and Peierls stress, alongside with a significant reduction in the necessary amount of DFT calculations compared to standard methods.

More specifically, we have shown that active learning is able to reliably detect a minimum set of sufficiently distinct training configurations. This is in particular demonstrated in section 4.3, where the dislocation moves through the effectively infinite crystal: in the beginning, active learning requests new training data during the change of the core structure but stops when the dislocation repetitively encounters the same configurations after crossing the first Peierls valley. Moreover, using our symmetrization strategy to convert extrapolative regions into periodic configurations, the MTP gave accurate predictions for forces as well as energy differences.

Our algorithm can in principle be applied by a user in the same way as a standard large-scale atomistic simulation. Therefore, no additional expert knowledge is required in order to extract physical quantities, such as the Peierls stress, from indirect DFT methods (cf section 4.3). This opens new prospects for using MLIPs in general as a robust tool for DFT-accurate simulations of extended defects.

5.2. Outlook

Our example in section 4.3 further shows that we can include far-field applied stresses which influence the core structure of the dislocation and its the elastic field. We therefore expect that our algorithm is directly applicable to related problems involving interactions between dislocations and different types of defects, e.g. vacancies or other dislocations.

While we have assumed that nonlinearities are confined to the small-sized extrapolation region(s), we have demonstrated that active learning significantly reduces the amount of actual DFT calculations. If the size of the defect core increases to several hundreds of atoms, e.g. in the case of dissociated dislocations in face-centered-cubic lattices, the saved computing capacities may thus be invested in a few larger calculations in order to compute energies; combined with many small cluster calculations in order to compute forces.

Further, we think that our methodology will also be useful for inherently three-dimensional problems, e.g. curved dislocations. For this class of problems, we currently envision two possible approaches. The first possibility is to train a MLIP on several selected straight dislocations with different character angles. The idea is then to switch off active learning and use the MLIP as a standard interatomic potential. The second possibility is to proceed as we do in the present work, i.e. we carve out a collection of atoms in the vicinity of the dislocation core and construct a periodic configuration. This procedure is illustrated in figure 7 for a three-dimensional dislocation in a bcc lattice moving by means of the kink-pair nucleation mechanism (see, e.g. [66]). Both directions will be explored in future work.

Figure 7.

Figure 7. Schematic illustration of a possible method for constructing periodic training configurations for dislocation kinks.

Standard image High-resolution image

It is finally worthwhile to emphasize that our methodology does not exclude multi-component materials involving alloying elements and chemical impurities, e.g. hydrogen or helium, as most MLIPs already include the necessary descriptors (e.g. [10, 37]). The discovery of new materials mainly comes through the search for novel alloys and we think that investigating corresponding mechanisms using MLIPs in combination with active learning will be one of the most auspicious potential applications since the configurational space will likely be too vast to be explored with only passively selected training sets.

Acknowledgments

We thank Dr Evgeny Podryabinkin for assistance during the implementation of the neighborhood selection criterion. Moreover, MH would like to thank Dr Petr Grigorev for helpful discussions regarding the DFT calculations for screw dislocations.

Financial support from the Russian Science Foundation (Grant No. 18-13-00479) and the Fonds National Suisse (FNS), Switzerland, (project 191 680) is highly acknowledged.

Acknowledgements

The figures in this article showing atomistic configurations were created using the visualization software OVITO [69].

Appendix A.: Moment tensor potentials

A.1. Potential representation

The basis functions from equation (2) for MTPs are given by the following form [7]

Equation (A1)

where

Equation (A2)

are the moment tensor descriptors. Here, the $f_{\mu_l}$'s are radial basis functions which vanish for $|\boldsymbol{r}_{ij}| > r_{\rm cut}$, where $r_{\rm cut}$ is a predefined cut-off radius. We characterize the degree of an MTP by its level. That is, an MTP of level d implies that the $B_\alpha$'s comprise all basis functions which satisfy $(2\mu_1 + \nu_1) + (2\mu_2 + \nu_2) + ... \le d$.

For all numerical examples, conducted in this work, we have used an MTP of level d = 16 with a cut-off radius $r_{\rm cut} = 5$ Å.

A.2. Nonlinear MTPs

In the case of non-linear MTPs, as we use them in the current work, the radial basis functions $f_{\mu_l}$ depend on an additional parameter vector $\underline{c}_{\mu_l}$ as follows (cf [37])

Equation (A3)

where Tn is the nth order Chebysev polynomial and the function $(r_{\rm cut} - |\boldsymbol{r}_{ij}|)^2$ ensures a smooth transition to 0 as $|\boldsymbol{r}_{ij}| \rightarrow r_{\rm cut}$.

Hence, the basis functions have a non-linear dependence on $\underline{c}_{\mu_l}$. Therefore, we need to modify the active learning algorithm to take this non-linearity into account. For this purpose, under the assumption that the vector $\underline{\theta}$ now comprises the basis coefficients and the $\underline{c}_{\mu_l}$ parameters, we linearize equation (3) with respect to some initial $\bar{\underline{\theta}}$ to first order such that

Equation (A4)

The system matrix associated with the new loss function L is then the n×m Jacobian matrix

Equation (A5)

where n is the total number of atomic neighborhoods in the training set which is usually significantly larger than the size of the parameter vector m.

We proceed as in section 2 by finding an m×m submatrix $\underline{\underline{A}}$ of $\underline{\underline{J}}$ using the maxvol algorithm. The extrapolation grade γ of a neighborhood ${\{{r}_i^\ast\}}$ is now defined—analogously to equation (4)—as the maximum absolute element of the vector

Equation (A6)

A.3. Training procedure

Since we perform structural relaxation and compute energy barriers, we need to fit the MTP with respect to energies and forces. Therefore, we compute the hyperparameters $\underline{\theta}$ by minimizing the loss functional

Equation (A7)

where $\boldsymbol{f}_{\rm r}$, $\boldsymbol{f}^{\rm qm}_{\boldsymbol{r}}$ are the MTP, respectively, the quantum mechanical forces on an atom $\boldsymbol{r}_i$ corresponding to a specific configuration $\{{r}_i^{\rm tr} \}$ in the training set $\mathscr{T}$. The C-constants are regularization parameters which we set to

Equation (A8)

throughout this work.

Appendix B.: Equivalence of the quadrupole- and dipole-like configurations

In the following we show that the quadrupole- and the dipole-like configurations, enclosed by the dashed and continuous lines in figure 2 (3), respectively, are equivalent when the displacement is the anisotropic linear elastic solution

Equation (B9)

of a screw dislocation. Assuming that the x, y and z-axes correspond to the $\langle 112\rangle$, $\langle 110 \rangle$ and $\langle 111\rangle$ direction, namely the slip direction, the slip plane normal and the dislocation line direction, the real quantities pi , qi , $\boldsymbol{c}_i$ and $\boldsymbol{d}_i$ can be computed for $\boldsymbol{b} = \begin{pmatrix} 0 & 0 & b \end{pmatrix}^{\top}$ using the formalism described in [66p. 444–445].

We now need to show that the displacement equation (5) on the upper boundary of the rectangular region with the $\{{r}_i^\ast\}$-atoms is the same as on the lower boundary of the rectangular region enclosing the $\{{r}_i^1 \}$-atoms, and vice versa. That is, $\forall\,x \in (-l_{\rm x}/2,l_{\rm x}/2] \; \wedge \; \forall\,x^\prime \in (-3l_{\rm x}/2,-l_{\rm x}/2]$, we have $\boldsymbol{u}(x,l_{\rm y}/2,z) = \boldsymbol{u}(-x^\prime-l_{\rm x},-l_{\rm y}/2,z)$ or, equivalently, $\boldsymbol{u}(x,l_{\rm y}/2,z) = \boldsymbol{u}(-x,-l_{\rm y}/2,z)$. The equivalence is therefore obvious for the logarithmic part. For the arctan contribution we write $arctan2{(q_iy/(x+p_i))} = arctan2{(\bar{y}/\bar{x})}$. Assuming that $\bar{y}>0$, it follows for

Equation (B10)

Thus, we have

Equation (B11)

since $\sum_{i = 1}^3 d_{3,i} = \boldsymbol{b}/\pi$ which can be deduced from the definition of the displacement at the branch cut. Therefore, the displacement field differs by the constant $\boldsymbol{b}/2$ which must be taken into account when constructing the training configurations. Hence, the region framed by the continuous line in figure 2 (3) and (4) must be the triclinic region

Equation (B12)

spanned by the lattice vectors

Equation (B13)

The training configuration $\{{r}_i^{\rm tr} \}$ is then constructed by imposing the displacement equation (5) on the ideal lattice sites which are in equation (B12).

If the actual atomistic displacements follow a similar pattern, the training configurations can be constructed analogously. This argument trivially applies if the atomistic solution does not differ too much from linear elasticity outside the dislocation core—which is usually the case for compact core structures.

Appendix C.: DFT simulations

The parameters we have used in all our VASP calculations of bcc W are given in table C1. Electronic relaxation was performed using the preconditioned minimal residual method, as implemented in VASP, and terminated when the energy difference between two subsequent iterations was less than 10−4 eV.

Table C1. VASP parameters.

Exchange-correlationPE generalized gradient approximation [68]
PAW potentialPAW_PBE W 08Apr2002
Energy cut-off334.585 eV
$\boldsymbol{k}$-point mesh2×3×16
Smearing width0.06 eV

With these parameters, we obtained the lattice constant $\boldsymbol{a_0} = 3.168\,\text{\AA}$ and the cubic elastic constants $C_{11} = 551.4\,\mathrm{GPa}$, $C_{12} = 200\,\mathrm{GPa}$ and $C_{44} = 139.9\,\mathrm{GPa}$, using the get_elastic_constants function from Matscipy [67].

Appendix D.: Algorithmic results and training errors for the atomistic simulations using MTPs

D.1. Core relaxation

Table D2. Total number of iterations and training set size for the core relaxation problems in section 4.1.

 PeriodicCylinder
 MTP-DFTMTP-EAM4MTP-DFTMTP-EAM4
Iterations57916372
Training set size44354055

Table D3. Training errors corresponding to the core relaxation problems for the cylindrical problem with the MTP-EAM4.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.00 6424.759 × 10−5 0.047 6
Average absolute difference7.533 × 10−4 5.58 × 10−6 0.008 9
RMS0.00 1158.509 × 10−6 0.010 5
Max(ForceDiff) / Max(Force)  0.016 4
RMS(ForceDiff) / RMS(Force)  0.049

Table D4. Training errors corresponding to the core relaxation problems for the periodic problem with the MTP-DFT.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.00 2932.167·10−5 0.029 9
Average absolute difference5.258 × 10−4 3.895 × 10−6 0.011 1
RMS7.457 × 10−4 5.524 × 10−6 0.012 3
Max(ForceDiff) / Max(Force)  0.037 8
RMS(ForceDiff) / RMS(Force)  0.128

Table D5. Training errors corresponding to the core relaxation problems for the periodic problem with the MTP-EAM4.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.00 6815.048 × 10−5 0.067 1
Average absolute difference0.00 1259.287 × 10−6 0.00 699
RMS0.00 1691.254 × 10−5 0.00 835
Max(ForceDiff) / Max(Force)  0.023
RMS(ForceDiff) / RMS(Force)  0.032 5

D.2. Peierls barrier

Table D6. Total number of iterations and training set size for the Peierls barrier calculations in section 4.2.

 PeriodicCylinder
 MTP-DFTMTP-EAM4MTP-DFTMTP-EAM4
Iterations32573861
Training set size48712230

Table D7. Training errors corresponding to the NEB calculations for the cylindrical problem with the MTP-EAM4.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.017 91.329 × 10−4 0.143
Average absolute difference0.010 47.697×10−5 0.020 6
RMS0.012 29.073 × 10−5 0.0257
Max(ForceDiff) / Max(Force)  0.0448
RMS(ForceDiff) / RMS(Force)  0.0699

Table D8. Training errors corresponding to the NEB calculations for the periodic problem with the MTP-DFT.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.00 4943.662 ×10−5 0.072 1
Average absolute difference0.00 2151.593 × 10−5 0.012 5
RMS0.00 2591.919 × 10−5 0.013 9
Max(ForceDiff) / Max(Force)  0.078 6
RMS(ForceDiff) / RMS(Force)  0.191

Table D9. Training errors corresponding to the NEB calculations for the periodic problem with the MTP-EAM4.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.019 21.423 × 10−4 0.127
Average absolute difference0.00 4073.017 × 10−5 0.014 5
RMS0.00 5223.866 × 10−5 0.020 4
Max(ForceDiff) / Max(Force)  0.032 5
RMS(ForceDiff) / RMS(Force)  0.085 7

D.3. Peierls stress

Table D10. Training errors corresponding to the Peierls stress calculation for the MTP-EAM3 in section 4.3. The simulation was carried out until the dislocation was about to pass the second Peierls valley. At this point the training set contained 51 configurations.

 Error in:
 Energy [eV]Energy per atom [eV]Forces [eV/Å]
Maximal absolute difference0.016 71.238 × 10−4 0.249
Average absolute difference0.005 84.295 × 10−5 0.015 7
RMS0.00 6835.06 × 10−5 0.020 1
Max(ForceDiff) / Max(Force)  0.070 4
RMS(ForceDiff) / RMS(Force)  0.045 6

Footnotes

  • 1  

    Here, iteration either refers to a step of a non-linear solver, in the case of structural relaxation, or a molecular dynamics timestep.

  • 2  

    On the other hand, this may not be a problem if one is interested in purely 'force-based' studies, e.g. computing defect geometries or the Peierls stress of a dislocation, etc.

  • 3  

    Versions of both potentials were obtained from the Knowledgebase of Interatomic Models [61, KIM] repository.

  • 4  

    Here, we have used the EAM3 potential since the EAM4 potential does not predict the {110} glide plane.

Please wait… references are loading.
10.1088/2632-2153/aba373