1 Introduction

The computational modeling of chemical systems has become a fundamental tool in the development of new chemical compounds and materials. The most important quantity required to study a given system is the potential energy surface (PES). A PES represents the energy of a system, typically as a function of atomic positions. These positions can be absolute, as in a Cartesian coordinate system, or relative, as when the positions of two atoms are expressed as the distance between them.

The PES can be derived exactly from quantum mechanics and physical constants, but a direct solution of the underlying equations is not feasible. Over the years, many approaches have been developed to approximate the PES, all of which balance the computational cost of a calculation against the accuracy of its answer.

These methods can be grouped into three broad categories: 1) ab-initio methods, which introduce some simplifying assumptions to the direct equations; 2) semi-empirical methods, which simplify some aspects of the quantum mechanical equations through small parameterized models; and 3) fully empirical methods, which substitute the quantum mechanical equations entirely with parameterized models. In order to tackle large systems, fully empirical approaches are required, as semi-empirical and ab-initio ones are too expensive.

ReaxFF [1] is an example of a fully empirical model. It includes hundreds of parameters: global ones, as well as parameters for individual chemical elements and for pairs, triplets, and quadruplets of elements; often there is no clear physical interpretation of what these parameters represent. Nevertheless, models of this sort are the current state-of-the-art for simulating chemical reactions computationally on a large scale.

Determining the appropriate value for each parameter involves adjusting them until the differences between model predictions and some set of training data are minimized. In practice, the error functions, which combine all the deviations between training data and model into a single value, have proven to be extremely difficult to optimize. Not only are the problems very high-dimensional, they are also expensive to evaluate because they involve computing the properties of many chemical arrangements. Many such calculations require a geometry optimization of the arrangement of atoms before chemical properties can be extracted. These optimizations are not deterministic and can lead to different arrangements and, hence, different properties; in this way the error function can show stochastic behavior. The complexity of evaluation also hinders the calculation of analytical derivatives of the cost function. Finally, the PES can explode rapidly to very large or non-numerical values if atoms move very close together. As a result, the error function is highly oscillatory and fluctuates over several orders of magnitude.
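To make this concrete, a typical error function has the shape of a weighted least-squares sum over the training set. The following minimal Python sketch illustrates the structure; all names here (`TrainingEntry`, `predict`) are hypothetical stand-ins, not part of any actual reparameterization code:

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class TrainingEntry:       # hypothetical container for one reference datum
    reference: float       # e.g. a reaction energy from an ab-initio calculation
    weight: float          # relative importance of this datum

def error_function(params: Sequence[float],
                   training_set: Sequence[TrainingEntry],
                   predict: Callable) -> float:
    """Weighted sum of squared deviations between model and training data."""
    total = 0.0
    for entry in training_set:
        try:
            # `predict` stands in for the expensive chain of geometry
            # optimization and property extraction; it can be stochastic
            # and can blow up for unphysical parameter combinations.
            prediction = predict(params, entry)
        except (FloatingPointError, OverflowError):
            return float("inf")  # PES exploded, e.g. atoms moved too close
        total += entry.weight * (prediction - entry.reference) ** 2
    return total
```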

Optimization techniques typically used within the community have, to date, not been tremendously sophisticated because the expense, high-dimensionality, and black-box nature of the problems put severe constraints on the types of algorithms which can be applied. For example, SOPPE [2] (Sequential One-Parameter Parabolic Extrapolation) was the default approach for several years and typically required an expert to make manual fine-tuning adjustments.

Recently, more sophisticated algorithms such as CMA-ES, MCFF and OGOLEM have been applied with some success [3]; however, one continues to face issues of stability, convergence to poor local minima, and computational efficiency. One particular hurdle has been the construction of the error function itself. While conceptually simple, it requires interfacing several pieces of software and strong programming skills. This represents a barrier for most researchers in the field.

2 GloMPO Package

To address these challenges, our team aims to create a standardized reparameterization interface which is simple enough to encourage broad adoption. ParAMS [4] is a new tool which is able to construct an error function for several computational chemistry models, including ReaxFF, GFN-xTB, DFTB and others. This deals with the ‘upstream’ work of the reparameterization problem. In this work we introduce GloMPO (Globally Managed Parallel Optimization) to handle the optimization and ‘downstream’ post-processing aspects.

GloMPO is a novel optimization framework which manages several child optimizers in real-time. It aims to ensure that function evaluations are used efficiently (i.e. not wasted exploring bad minima), and that all available computational resources are applied to the optimization task (since some algorithms or codes are not adequately parallelized).

Figure 1 illustrates how GloMPO functions. Given any global minimization task, GloMPO starts several child optimizers, each an instance of any traditional optimization algorithm. GloMPO monitors the progress of each child and compares them against one another. It may terminate children which appear to be converging to poor local minima (usually defined as a value larger than one already seen by another child). Terminated optimizers are replaced with new instances. When one child finds good iterates, GloMPO shares them with the others. If the children can use this information in some way (for example, by sampling the suggested point in subsequent evaluations), this can accelerate convergence.

Fig. 1. Schematic of GloMPO's supervision of an optimization task over time. As child optimizers appear to have converged to values higher than the best seen thus far, they are shut down and replaced with new instances.
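In pseudocode, the management cycle of Fig. 1 can be summarized as follows. This is a minimal illustrative sketch; the child-optimizer methods used here (`step`, `inject`, `is_stagnating`) are hypothetical stand-ins and do not correspond to GloMPO's actual API:

```python
def managed_optimization(task, spawn_optimizer, n_children, budget):
    """Illustrative supervision loop corresponding to Fig. 1."""
    children = [spawn_optimizer(task) for _ in range(n_children)]
    best_value, best_point, used = float("inf"), None, 0
    while used < budget:
        for i, child in enumerate(children):
            value, point, n_evals = child.step()    # one child iteration
            used += n_evals
            if value < best_value:                  # a new overall best
                best_value, best_point = value, point
                for other in children:              # share the good iterate
                    if other is not child:
                        other.inject(point)
            # replace children converging to a worse value than the best seen
            if child.is_stagnating() and child.best_value > best_value:
                children[i] = spawn_optimizer(task)
    return best_value, best_point
```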

By managing several independent optimizer instances, the distribution of sampled points is spread over a wider area. In the context of ReaxFF, this often produces answers which are virtually degenerate in cost function value but far apart in parameter space. Having access to several such solutions is invaluable to computational chemists as they try to identify deficiencies in their training sets and select the solution which works best for their application.

GloMPO is open-source and publicly available (see the end of this paper). It supports running children as threads or processes, and further supports a second layer (of threads or processes) for parallelized function evaluations by optimizers which use multiple evaluations per iteration. GloMPO is compatible with any optimization task and optimization algorithm, and all major decision criteria are customizable. Finally, given its modular design and customizability, GloMPO can be configured to manage complex optimization workflows through a single standardized interface, regardless of the task or the type of optimizer used.
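To make the two parallelism layers concrete, the sketch below shows the general architectural pattern using only the Python standard library; it illustrates the layered design and is not GloMPO code. An outer pool of processes hosts the child optimizers (here a trivial random search), and each child fans its per-iteration function evaluations out to an inner thread pool:

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def rastrigin(x):
    """Cheap stand-in for an expensive error function."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi)
                             for xi in x)

def run_child(seed, n_iters=50, popsize=8, dim=4):
    """One child optimizer (random search, purely for illustration); its
    per-iteration evaluations run in parallel on an inner thread pool."""
    rng = random.Random(seed)
    best = float("inf")
    with ThreadPoolExecutor(max_workers=popsize) as evaluators:  # inner layer
        for _ in range(n_iters):
            pop = [[rng.uniform(-5.12, 5.12) for _ in range(dim)]
                   for _ in range(popsize)]
            best = min(best, min(evaluators.map(rastrigin, pop)))
    return best

if __name__ == "__main__":
    # outer layer: each child optimizer runs in its own process
    with ProcessPoolExecutor(max_workers=4) as children:
        print(min(children.map(run_child, range(4))))
```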

3 Results

To quantify the effect of GloMPO's management, a benchmark test was run on the 66D Rastrigin, 20D Schwefel, 20D Deceptive, and 4D Shubert functions. 'Normal optimization' is defined here as the repeated application of a traditional optimization algorithm (in this case CMA-ES [5]) to a minimization task. We compare GloMPO to repeated optimization because this is the traditional workflow for hard-to-optimize problems like the reparameterization of ReaxFF: researchers, anticipating the difficulties they will face, often repeat optimizations and accept the lowest value of the set. We accordingly define the best minimum of normal optimization as the lowest value found by any optimizer. These runs are compared against GloMPO optimizations which operate on the same task, use the same type of child optimizers, and are limited to the same iteration budget. In other words, the only point of difference between the two approaches is GloMPO's ability to replace optimizers exploring poor minima with new optimizer instances.
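The 'normal optimization' baseline therefore follows the pattern sketched below, assuming the widely used Python `cma` package (its `fmin` helper and built-in `cma.ff.rastrigin` test function); the dimension, budget, and repeat count shown are illustrative, not those of the benchmark:

```python
import cma  # the widely used Python CMA-ES package

# "Normal optimization": repeat CMA-ES on the same task and keep the
# lowest value found across all runs.
best = float("inf")
for seed in range(10):
    res = cma.fmin(cma.ff.rastrigin, 20 * [3.0], 0.5,
                   {"maxfevals": 5000, "seed": seed, "verbose": -9})
    best = min(best, res[1])  # res[1] is the best function value of this run
print("best of repeated runs:", best)
```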

Overall, 4800 tests were conducted in this way. The distribution of answers found by normal repeated optimization and GloMPO are shown in Fig. 2. GloMPO demonstrates a clear improvement in both the mean final value and overall distribution of explored points; this performance proved to be robust across test configurations which changed the number of optimizers, available iteration budget, and other GloMPO settings. Overall, GloMPO identified a better answer than normal optimization (70 ± 13)% of the time. GloMPO configurations which actively shared good evaluations with one another performed particularly well, winning (86 ± 6)% of their comparisons.

Fig. 2. Violin plot of final solutions from normal sequential optimization (solid red) and GloMPO-managed optimization (hatched blue) on four common global optimization test functions. Mean values are shown by the corresponding thick lines. Minima have been normalized to make the four functions visually comparable. Each of the 48 configurations was run 100 times. GloMPO managed the same optimizer type as the normal optimization runs, CMA-ES.

GloMPO was also used to reparameterize two ReaxFF force fields. The first was a 12-parameter model describing cobalt [6] in various configurations, and the second was an 87-parameter model for disulfide compounds [7]. For statistical significance, the optimization of each set was repeated ten times.

Figure 3a shows the averages and standard deviations of the ten minima found by normal and GloMPO optimizations for the cobalt trainings. It again shows an improvement in the quality of the answers obtained by GloMPO. Not only is the average answer better, but the standard deviation is also much tighter; in other words, GloMPO has a higher chance of producing a good minimum than normal optimization does. Of particular note for the cobalt optimizations is the identification of degenerate sets: parameter sets whose error values are relatively close, but which are located far from one another in parameter space. GloMPO found on average 3.4 degenerate sets per optimization, and sometimes as many as 6 in a single run, compared to an average of 2.4 for normal optimizations.
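One plausible way of flagging such degenerate sets is sketched below; the tolerances are arbitrary illustrations, not the criteria used in this work:

```python
import numpy as np

def degenerate_sets(solutions, f_tol=0.01, x_tol=1.0):
    """Keep solutions whose errors lie within a relative tolerance of the
    best error found, but whose parameter vectors are mutually at least
    `x_tol` apart in Euclidean distance.

    `solutions` is a list of (error_value, parameter_vector) pairs; both
    tolerances are illustrative and would be problem-specific in practice.
    """
    best_f = min(f for f, _ in solutions)
    near_best = [(f, np.asarray(x)) for f, x in solutions
                 if f <= best_f * (1.0 + f_tol)]
    kept = []
    for f, x in sorted(near_best, key=lambda fx: fx[0]):
        # keep only points far from every previously kept point
        if all(np.linalg.norm(x - y) >= x_tol for _, y in kept):
            kept.append((f, x))
    return kept
```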

In the case of the disulfide model, the dimensionality was much higher and the error surface poorly conditioned. This proved to be a tougher challenge than the cobalt trainings, and normal optimization approaches failed to consistently identify good solutions. Figure 3b shows the difference between the GloMPO minimum and the normal optimization minimum for each of the ten runs. In this case, GloMPO performed better in eight of the ten trials.

Fig. 3. Comparison between normal and GloMPO optimization of two ReaxFF models, each repeated ten times.

4 Conclusion

GloMPO is able to find better solutions to global optimization problems between 57% and 83% of the time when compared to traditional approaches. These figures are robust to changes in configuration and optimization task. GloMPO-generated solutions are on average lower, and have smaller variance, than their unmanaged counterparts. GloMPO also provides several qualitative advantages:

  1. GloMPO prevents the inefficient use of function evaluations by stopping optimizers from exploring minima which are worse than those already seen by another optimizer.

  2. GloMPO ensures that all computational resources are applied to the optimization problem.

  3. GloMPO is fully customizable in terms of its decision making, and can be applied to any optimization problem using any combination of child algorithms. This flexibility means it can be leveraged as a workflow manager.

  4. GloMPO provides a standardized interface to optimization.

  5. GloMPO provides standardized outputs regardless of the types of child optimizers used. The resulting log of trajectories can be analyzed to help better condition the optimization task.

Software Availability and Requirements

Project name: GloMPO

Project home page: www.github.com/mfgustavo/glompo

Operating system: Platform independent

Programming language: Python >3.6

Other requirements: AMS2020.1 or later for the ReaxFF interface

License: GPL-3.0