Bayesian uncertainty quantification in the evaluation of alloy properties with the cluster expansion method

https://doi.org/10.1016/j.cpc.2014.07.013

Abstract

Parametrized surrogate models are used in alloy modeling to quickly obtain otherwise expensive properties, such as quantum mechanical energies, which are thereafter used to optimize, or simply compute, some alloy quantity of interest, e.g., a phase transition, subject to given constraints. Once learned on a data set, the surrogate can compute alloy properties fast, but with an increased uncertainty compared to the computer code. This uncertainty propagates to the quantity of interest, and in this work we seek to quantify it. Furthermore, since the alloy property is expensive to compute, only a limited amount of data is available from which the surrogate can be learned. This scarcity of data further increases the uncertainty in the quantity of interest, and we show how to capture this effect as well. We cannot, and should not, trust the surrogate before we quantify the uncertainties in the application at hand. We therefore develop a fully Bayesian framework for quantifying the uncertainties in alloy quantities of interest that originate from replacing the expensive computer code with the fast surrogate and from learning on limited data. We consider a particular surrogate popular in alloy modeling, the cluster expansion, and aim to quantify how well it captures quantum mechanical energies. Our framework is applicable to other surrogates and alloy properties.

Introduction

In the present work, we aim to develop a Bayesian framework for quantifying the uncertainty in alloy modeling when fast parametrized surrogates are used in place of an expensive computer code. In the most typical setup, the surrogate is learned from some data set, e.g., quantum mechanical energies, and then used to predict some quantity of interest (QI), which could be a ground state line, a phase transition, or some optimal structure (e.g., the structure of lowest thermal conductivity in the case where the data are instead thermal conductivities). Of course, the parametrization we choose for the surrogate depends on what data we obtain from the computer code. Since the code is expensive, only a limited amount of data is available. Furthermore, we require the surrogate to be computationally cheap. This means that, e.g., if the surrogate is represented by a set of basis functions, we are not at liberty to include an arbitrarily large number of such basis functions. The particular surrogate we consider later is such an example. These restrictions on the surrogate mean that, when it is parametrized, we do not know the best parametrization a priori. We have to learn it from a set of multiple candidate parametrizations, a pool of candidates, each of which is, from a Bayesian perspective, consistent with the observed limited amount of data. A single value of the QI is computed from a single surrogate candidate. Since there may be multiple candidates, there may also be multiple values for the same QI. Our uncertainty about the best surrogate candidate has thus propagated to the QI. This is the first source of uncertainty we aim to capture in the present work. From now on, we will simply say parametrization to mean surrogate parametrization/candidate.

Notice also that the effect of limited data enters implicitly through our belief about the best pool of parametrizations to choose. For example, upon seeing data set D1 it might be that the pool of parametrizations t1 is better than another pool t2. But if we observe more data, it could very well be that our opinion is reversed, and we choose t2 over t1. The fewer data points we have, the larger, and generally worse, the pool of parametrizations consistent with the data will be, unless, of course, our prior belief is already sharply tuned to a good pool of parametrizations. This is rarely the case, however, and in the vast majority of cases we benefit from observing data. From this, it should be clear that the limited data plays a role in our knowledge about the best pool of parametrizations to use. The fact that we only see a limited amount of data therefore introduces a second source of uncertainty (not independent of the first one, though) in the QI, and we will be able to capture this as well.
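To make the propagation of these uncertainties concrete, here is a minimal Python sketch under stated assumptions: the "posterior" samples of the parametrization are drawn from a placeholder Laplace distribution rather than from the actual Bayesian posterior developed later, and the design matrix and QI (the lowest predicted energy over a pool of structures) are toy stand-ins. Each posterior sample of the surrogate yields one value of the QI, and the spread of those values is the propagated uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 expansion coefficients, 50 candidate structures.
n_samples, n_coeff, n_structs = 4000, 8, 50

# Placeholder "posterior" samples of the surrogate parametrization; in
# practice these would be drawn from the actual posterior by MCMC.
theta_samples = rng.laplace(scale=0.05, size=(n_samples, n_coeff))

# Placeholder design matrix mapping a parametrization to predicted energies.
X = rng.uniform(-1.0, 1.0, size=(n_structs, n_coeff))

def quantity_of_interest(theta):
    # Toy QI: the lowest predicted energy over the candidate structures.
    return (X @ theta).min()

# Propagate: one QI value per posterior sample of the surrogate.
qi = np.array([quantity_of_interest(t) for t in theta_samples])
print(f"QI mean = {qi.mean():.4f}, QI std = {qi.std():.4f}")
```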

Our developed methods are independent of the particular surrogate employed, but we will focus on a very popular choice in materials science: the cluster expansion [1]. The cluster expansion expands the alloy property in basis functions with associated expansion coefficients called effective cluster interactions (ECI) [2]. It is useful for capturing properties that depend on the particular atomic arrangement on the lattice, this arrangement being called a configuration. It has been used to describe quantum mechanical energies, thermal conductivities, band gaps [3], etc., of a multitude of alloys. The ECI are obtained by fitting the cluster expansion to a data set. The cluster expansion surrogate is uniquely given once the ECI are specified, so we will treat a surrogate parametrization as synonymous with a set of ECI. Although the cluster expansion is exact when untruncated, in practice one needs to make a truncation choice and estimate the ECI from a pool of parametrizations, as discussed earlier. We reiterate that this introduces uncertainties in the QI predicted by the cluster expansion. Although this is an important question to ask, the sizes of these uncertainties have remained unknown until now. We employ a fully Bayesian approach to quantify them.
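As a purely illustrative sketch of the cluster expansion form, not of the paper's actual implementation, the following evaluates a truncated expansion (empty, point, and nearest-neighbour pair clusters) for a periodic one-dimensional binary chain; the cluster families and the ECI values are hypothetical.

```python
import numpy as np

def correlations(spins):
    # Correlation functions for a periodic 1D binary chain with occupation
    # variables sigma_i in {-1, +1}: empty, point, and nearest-neighbour
    # pair clusters.
    return np.array([
        1.0,                                 # empty cluster
        np.mean(spins),                      # 1-pt cluster
        np.mean(spins * np.roll(spins, 1)),  # nearest-neighbour 2-pt cluster
    ])

def ce_energy(spins, eci):
    # Truncated cluster expansion: the energy per site is the dot product
    # of the correlation functions with the ECI (hypothetical values, eV).
    return float(correlations(spins) @ eci)

sigma = np.array([1, -1, 1, 1, -1, -1, 1, -1])  # a toy configuration
print(ce_energy(sigma, eci=np.array([-0.10, 0.02, -0.05])))
```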

We should mention that non-Bayesian methods have been applied in some works to quantify the uncertainty in QIs, but we believe that they should be avoided for uncertainty propagation. In particular, non-Bayesian frameworks offer no rigorous way to propagate uncertainties through parametrized surrogates, such as the cluster expansion, to the QI [4], [5].

To begin our Bayesian approach, we need to be clear about what we mean by probability. We interpret probability as a reasonable degree of belief [6], [7], as opposed to a frequency in some (hypothetical) long-run experiment. The sum and product rules of probability theory then tell us how to manipulate degrees of belief in a rigorous way. From this view of probability, Bayes' theorem follows and can be used to update our knowledge when observing new data in a given problem [8], [6]. This is collectively what is called Bayesian probability theory. We will use a Bayesian approach to introduce a model describing our belief about the best set of ECI, with emphasis on sparsity. We include the sparsity feature because alloy properties are expected to be sparsely representable, based on physical arguments [9]. A very successful sparse regression method from the non-Bayesian literature is the least absolute shrinkage and selection operator (LASSO), an L1-constrained least squares method [10]. It can be shown that LASSO has a Bayesian interpretation: it corresponds to the posterior mode when the parameters to be learned have independent Laplace distributions as priors [11]. We will use this connection to choose Laplace-distributed priors in Section 2.4. The Bayesian posterior distribution (posterior) contains the information needed to rigorously quantify the QI uncertainties. In our case, the posterior attains a shape allowing it to be summarized via the 95% highest posterior density (HPD) interval, the smallest region containing at least 95% of the posterior mass. We will reduce the effects of other uncertainties as much as possible and discuss this as we go along.
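For a one-dimensional, unimodal posterior represented by samples, the HPD interval just described can be found by scanning all windows that contain 95% of the sorted samples and keeping the narrowest one. A minimal sketch, with synthetic samples standing in for actual MCMC output:

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    # Smallest interval containing at least `mass` of the posterior samples
    # (appropriate for a unimodal, one-dimensional posterior).
    x = np.sort(np.asarray(samples))
    n = x.size
    k = int(np.ceil(mass * n))          # samples per candidate window
    widths = x[k - 1:] - x[:n - k + 1]  # width of every k-sample window
    i = int(np.argmin(widths))          # the narrowest window wins
    return x[i], x[i + k - 1]

# Synthetic, skewed "posterior" samples standing in for MCMC output.
rng = np.random.default_rng(1)
samples = rng.gamma(shape=3.0, scale=0.1, size=10_000)
lo, hi = hpd_interval(samples)
print(f"95% HPD: [{lo:.3f}, {hi:.3f}]")
```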

We apply our framework to two real binary alloy systems. First, we consider body-centered-cubic (bcc) magnesium–lithium (Mg–Li) and let the QI be its ground state line. Then, we turn to diamond silicon–germanium (Si–Ge) and present a computationally more involved example where the QI is the transition temperature from the disordered phase to two-phase coexistence at 50% composition.

The paper is organized as follows. We start with a general introduction to uncertainty quantification and present our framework in Sections 2.1 and 2.2. In Section 2.3 we discuss where and how the cluster expansion enters the scene. Then, in Section 2.4, we present a Bayesian method for describing the ECI with emphasis on sparsity. The resulting posterior is not available in closed form for its intended use, so we show how samples are drawn from it using MCMC methods in Section 2.5. Having developed the framework, we turn to case studies, first discussing uncertainty quantification of the Mg–Li ground state line in Section 2.6, followed by uncertainty quantification of an Si–Ge phase transition in Section 2.7. Results from these case studies are presented in Section 4, and a corresponding discussion follows in Section 5. The paper is concluded in Section 6.

Section snippets

Background

In this section we introduce the methods used to quantify the uncertainty in the QI, making no assumptions about the form of the parametrization of the response surface. Then, in the following section, we show how the cluster expansion provides this parametrization. Independent of the choice of surrogate model, we will need data to make the best possible choice of parametrization. Therefore, we first discuss assumptions about the computer code used to obtain the data. Then, we introduce the central…

Data

We used the quantum mechanical energies of both alloys computed via VASP in Ref. [42]. This provided a set of responses. ATAT was then used to generate the design matrix X. We must choose which cluster families (basis functions) can be considered by the Bayesian model; the RJMCMC method will then select a subset of these. For bcc Mg–Li we generated 16 two-point (2-pt) and four three-point (3-pt) cluster families, in addition to the empty and 1-pt cluster families included in both systems. The maximum…
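Concretely, a design matrix of this kind collects one row of correlation functions per structure in the data set. Reusing the toy correlation function from the earlier sketch (still a hypothetical one-dimensional illustration, not ATAT's actual output):

```python
import numpy as np

def correlations(spins):
    # Same toy correlation functions as in the earlier 1D sketch.
    return np.array([1.0, np.mean(spins), np.mean(spins * np.roll(spins, 1))])

# Hypothetical pool of 50 random configurations on an 8-site chain.
rng = np.random.default_rng(3)
configs = np.sign(rng.uniform(-1.0, 1.0, size=(50, 8)))

# Design matrix X: one row of correlation functions per structure.
X = np.vstack([correlations(c) for c in configs])
print(X.shape)  # (50, 3)
```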

Results

For both systems we needed to set up an RJMCMC chain on Eq. (5) and equilibrate it in order to draw posterior samples. In Fig. 2 we provide details of these simulations vs. MCMC step number for both systems. For Mg–Li (Si–Ge) we ran the chain for 9 (8) million steps, discarding the first 4 million samples as burn-in. This took 6 h to run on a single core (a parallelization of the MCMC chain is possible) [44]. Fig. 2(a) shows the model complexity during the run. The sparse nature of the BL-RJMCMC is apparent…
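As an illustration of this post-processing step only, not of the RJMCMC sampler itself, the sketch below discards the burn-in portion of a trace and summarizes the equilibrated remainder; the trace here is synthetic.

```python
import numpy as np

def summarize_chain(trace, burn_in):
    # Discard the equilibration (burn-in) portion of an MCMC trace and
    # summarize the equilibrated remainder.
    post = np.asarray(trace)[burn_in:]
    return post.mean(), post.std()

# Synthetic trace: exponential relaxation toward equilibrium plus noise.
rng = np.random.default_rng(2)
steps = np.arange(100_000)
trace = np.exp(-steps / 5_000.0) + rng.normal(scale=0.05, size=steps.size)

mean, std = summarize_chain(trace, burn_in=40_000)
print(f"equilibrated mean = {mean:.4f}, std = {std:.4f}")
```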

Discussion

Most work in the field so far has focused on obtaining the best surrogate surface itself. Therefore, we provide a comparison of the most likely surrogates predicted by BL-RJMCMC to the (point) surrogates predicted by least squares and LASSO-CV. See Fig. 5. The BL-RJMCMC chain is started with just one cluster family active and an ECI value of 1 eV. From this, it finds a result almost identical to that of LASSO-CV, but even sparser. Interestingly, in Fig. 5(a) BL-RJMCMC predicts a strong signal…
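For context on the point estimates being compared, the following sketch contrasts ordinary least squares with cross-validated LASSO on synthetic data whose true coefficients are sparse; the data, sizes, and coefficient values are hypothetical, and scikit-learn's LassoCV stands in for whatever LASSO-CV implementation the paper used.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(4)

# Synthetic sparse-regression problem: 40 structures, 10 cluster families,
# only two families carry signal.
X = rng.uniform(-1.0, 1.0, size=(40, 10))
true_eci = np.zeros(10)
true_eci[[0, 2]] = [-0.10, -0.05]
y = X @ true_eci + rng.normal(scale=0.005, size=40)

ls = LinearRegression(fit_intercept=False).fit(X, y)
lasso = LassoCV(fit_intercept=False, cv=5).fit(X, y)

print("least squares:", np.round(ls.coef_, 3))    # dense, noisy estimate
print("LASSO-CV:     ", np.round(lasso.coef_, 3))  # sparse estimate
```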

Conclusion

We have presented a rigorous Bayesian framework for quantifying the uncertainty in quantities of interest predicted from the cluster expansion, arising from two sources: lack of knowledge about the best truncation choice and the corresponding ECI, and having observed only a limited amount of data. We presented a framework for carrying out this quantification in general, but considered two particular quantities of interest in two different binary alloys: bcc Mg–Li and diamond Si–Ge. The main…

Acknowledgments

NJZ, as a Royal Society Wolfson Research Merit Award holder, acknowledges support from the Royal Society and the Wolfson Foundation. NJZ also acknowledges strategic grant support from EPSRC to the University of Warwick for establishing the Warwick Centre for Predictive Modelling (grant EP/L027682/1). In addition, NJZ, as a Hans Fischer Senior Fellow, acknowledges the support of the Technische Universität München - Institute for Advanced Study, funded by the German Excellence Initiative and the European…

References (49)

  • G. Kresse et al., Comput. Mater. Sci. (1996)
  • S. Plimpton, J. Comput. Phys. (1995)
  • J. Sanchez et al., Physica A (1984)
  • X. Chen et al., Signal Process. (2011)
  • A. van de Walle et al., CALPHAD (2002)
  • A. van de Walle, CALPHAD (2009)
  • J.M. Sanchez, Phys. Rev. B (1993)
  • M. Asta et al., Phys. Rev. B (1991)
  • A. van de Walle, Nat. Mater. (2008)
  • Z.R. Kenz, H. Banks, R.C. Smith, Comparison of Frequentist and Bayesian Confidence Analysis Methods on a Viscoelastic...
  • R.C. Smith, Uncertainty Quantification: Theory, Implementation, and Applications, vol. 12 (2013)
  • E.T. Jaynes, Probability Theory: The Logic of Science (2003)
  • P.S. de Laplace, Analytical Theory of Probability...
  • H. Jeffreys, Theory of Probability...
  • L.J. Nelson et al., Phys. Rev. B (2013)
  • R. Tibshirani, J. R. Stat. Soc. Ser. B Stat. Methodol. (1996)
  • T. Park et al., J. Amer. Statist. Assoc. (2008)
  • P.E. Blöchl, Phys. Rev. B (1994)
  • G. Kresse et al., Phys. Rev. B (1993)
  • G. Kresse et al., Phys. Rev. B (1994)
  • G. Kresse et al., J. Phys.: Condens. Matter (1994)
  • G. Kresse et al., Phys. Rev. B (1996)
  • J.P. Perdew et al., Phys. Rev. Lett. (1997)
  • H.J. Monkhorst et al., Phys. Rev. B (1976)