Surrogate modeling of phase equilibrium calculations using adaptive sampling

doi:10.1016/j.compchemeng.2019.04.006

Computers & Chemical Engineering

Volume 126, 12 July 2019, Pages 204-217

Highlights

• Expensive phase equilibrium calculations can be replaced efficiently by surrogate models.
• Combining a classifier with regression models is a suitable strategy.
• Mixed adaptive sampling is superior to a Latin hypercube sampling design.
• Optimal sampling parameters are determined by model training effort and model performance.
• Cross-validation errors can be used as a stop criterion.

Abstract

Equation of state models such as the Perturbed-Chain Statistical Associating Fluid Theory (PC-SAFT) model are accurate and reliable prediction models for phase equilibria. However, because they have to be solved iteratively, their long computation times make them difficult to apply in chemical process optimization. To overcome this issue, surrogate modeling – replacing a complex model by a black-box model – can be used. A novel surrogate modeling strategy for phase equilibria is presented that combines the training of a classifier model with regression models for the phase compositions, using a mixed adaptive sampling method. We discuss in detail the selection of the parameters of the sampling algorithm and a suitable stop criterion, using the ternary liquid-liquid equilibrium system of n-decane, dimethylformamide and 1-dodecene as an example. The sequential mixed adaptive sampling method is compared to the one-shot Latin hypercube sampling design.

Introduction

In computer-based process optimization, the reliability of the optimization result depends on the quality of the process model. In order to obtain an accurate representation of the process, models based on first principles are usually preferred.

In the modeling of chemical processes, phase equilibria play an important role. For example, the solubility of a feed material in the reaction solution significantly influences the speed of reaction, and the accurate computation of the composition of the vapor and liquid phases in equilibrium is fundamental to the modeling of distillation columns.

For phase equilibrium calculations, activity coefficient models or equation of state models can be employed. Activity coefficient models require less computational effort, but are not applicable at elevated pressures, close to critical temperatures, and for multi-component systems. For such systems, equation of state models should be preferred (Merchan and Wozny, 2016; Schäfer et al., 2014). For complex phase systems, advanced equation of state models such as the PC-SAFT model are suitable for accurate predictions over a broad range of operating conditions. The PC-SAFT model has been applied to a wide range of different systems (Kleiner et al., 2009; Kontogeorgis and Folas, 2010; Tumakaka et al., 2005). However, in order to solve phase equilibria using equation of state models, the density root problem has to be solved and the phase equilibrium conditions have to be fulfilled, which requires embedded iterative calculations that lead to a significant computational effort. This makes these advanced thermodynamic models difficult to use for process optimization.

In order to overcome this issue, the surrogate modeling methodology can be applied. Surrogate modeling is understood here as replacing a complex model by a simpler black-box model. One can distinguish between two different classes of surrogate modeling problems: classification problems with two or more discrete outputs and regression problems with one or more continuous outputs that are approximated.

As the quality of the surrogate model depends on the choice of the training points, sampling is an important aspect of the surrogate modeling process. Since sampling involves computationally intensive evaluations of the original function, the sampling objective is to sample as few points as possible while gaining as much information as possible about the modeled phenomenon.

There are two main approaches to sampling: sampling once (one-shot) or adaptive sampling. In many applications, one-shot space-filling designs, such as Latin hypercube sampling (LHS), Monte-Carlo sampling or Halton sequences, are used to fit surrogate models. However, the modeled quantities often exhibit complex responses, such as discontinuities or a strong curvature in specific regions of the input space. Space-filling designs often do not capture such complex structures well. For these cases, adaptive or sequential sampling designs can be used. In this context, the terms exploration – sampling in a space-filling manner – and exploitation – denser sampling in regions with complex behavior of the original function – are used. The trade-off between exploration and exploitation is a recent field of research. Cozad et al. (2014) and Kleijnen and Van Beers (2004) focus on exploitation. The mixed adaptive sampling approach proposed in Eason and Cremaschi (2014), which was further improved by Jin et al. (2016), uses two equally weighted sampling objectives, while the recent work of Garud et al. (2017) proposes an adaptive weighting of the two sampling objectives.
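The difference between a one-shot LHS design and a mixed exploration/exploitation criterion can be illustrated with a minimal Python sketch. It is not the algorithm of Eason and Cremaschi (2014) or of this paper: the function names, the normalization by the maximum score and the user-supplied `disagreement` callback (a stand-in for any measure of local model uncertainty) are assumptions made for illustration. Only the equal weighting of the two objectives is taken from the text.

```python
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """One-shot space-filling design on the unit hypercube."""
    columns = []
    for _ in range(n_dims):
        # One point per equal-width stratum; strata are shuffled per dimension.
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def nearest_distance(x, samples):
    """Exploration score: distance to the closest existing sample."""
    return min(math.dist(x, s) for s in samples)


def normalize(values):
    """Scale scores to [0, 1] by the largest value (assumed normalization)."""
    top = max(values)
    return [v / top if top > 0 else 0.0 for v in values]


def mixed_score(candidates, samples, disagreement):
    """Equally weighted sum of exploration and exploitation scores."""
    explore = normalize([nearest_distance(c, samples) for c in candidates])
    exploit = normalize([disagreement(c) for c in candidates])
    return [a + b for a, b in zip(explore, exploit)]


def next_sample(candidates, samples, disagreement):
    """Pick the candidate with the highest mixed score."""
    scores = mixed_score(candidates, samples, disagreement)
    return candidates[scores.index(max(scores))]
```

In a sequential loop, one might draw an initial design with `latin_hypercube`, fit the surrogate, and then repeatedly call `next_sample` on a fresh set of candidates, evaluating the expensive model only at the selected point.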

Prior to sampling, the input and output variables of the surrogate modeling problem have to be defined. For a chemical system, the composition, temperature and pressure determine the number of phases and their compositions. Modeling the phase composition of a mixture with a regression model in the range where the transition between a one-phase and a two-phase mixture occurs leads to a discontinuity in the output space. In general, discontinuities in the output space are hard to approximate accurately with surrogate models. Restricting the valid input range to a region that is always biphasic, as in other work (Nentwich and Engell, 2016), avoids discontinuities in the output space, but overconstrains the operating range, which is not desirable in process optimization.

To cope with this problem, a novel surrogate modeling strategy for phase equilibria is investigated in this contribution. A classifier is trained on the obtained data in order to identify the biphasic region. In a second step, regression models are used to model the compositions of the two phases if the point is classified as being biphasic. This approach makes it possible to use equation of state models within chemical process simulation and optimization, since the frequent, computationally expensive solution of the phase equilibrium is avoided. To obtain accurate classifier and regression models without calling the original thermodynamic model extensively for sampling, a mixed adaptive sampling scheme is applied.
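The two-step idea (classify first, regress only inside the biphasic region) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the models used in the paper: `toy_flash` is a hypothetical stand-in for the expensive equilibrium solver (it is not PC-SAFT), and simple k-nearest-neighbor voting and averaging are swapped in for the classifier and regression models.

```python
import math


def toy_flash(z, T):
    """Hypothetical stand-in for an expensive equilibrium solver.

    Returns None for a single phase, else the two phase compositions.
    """
    half_gap = 0.4 * (1.0 - T)  # toy miscibility gap that closes as T -> 1
    lo, hi = 0.5 - half_gap, 0.5 + half_gap
    return (lo, hi) if lo < z < hi else None


def k_nearest(x, points, k):
    """Indices of the k training points closest to x."""
    order = sorted(range(len(points)), key=lambda i: math.dist(x, points[i]))
    return order[:k]


class TwoStageSurrogate:
    """Classifier for the phase region plus regression inside it."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, inputs, outputs):
        self.inputs = list(inputs)
        outputs = list(outputs)
        # Stage 1 data: every sample carries a one-phase/two-phase label.
        self.labels = [out is not None for out in outputs]
        # Stage 2 data: only biphasic samples, so the regression target
        # contains no discontinuity.
        self.bi_inputs = [x for x, out in zip(self.inputs, outputs) if out is not None]
        self.bi_outputs = [out for out in outputs if out is not None]
        return self

    def is_biphasic(self, x):
        idx = k_nearest(x, self.inputs, self.k)
        return 2 * sum(self.labels[i] for i in idx) > len(idx)  # majority vote

    def predict(self, x):
        if not self.is_biphasic(x):
            return None  # single phase: no split to compute
        idx = k_nearest(x, self.bi_inputs, self.k)
        n_out = len(self.bi_outputs[0])
        return tuple(
            sum(self.bi_outputs[i][j] for i in idx) / len(idx) for j in range(n_out)
        )
```

Within an optimization loop only `predict` would be called; the expensive solver is needed solely to generate the training data.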

As a case study, the process of the hydroformylation of 1-dodecene performed in a thermomorphic solvent system is examined. The proposed surrogate modeling strategy is demonstrated for the modeling of the liquid-liquid equilibrium of the ternary system n-decane, dimethylformamide and 1-dodecene, which is calculated using the PC-SAFT equation of state model. For the considered case study, the mean computation time was 5.1 seconds per phase equilibrium calculation using PC-SAFT, which could be reduced to 0.003 seconds by applying the developed surrogate models on a standard computer (Windows 7, 3.6 GHz dual core Intel(R) i7, 18 GB RAM).
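As a quick check of what these timings imply, the speed-up per phase equilibrium evaluation follows directly from the two reported mean values (the symbols t are introduced here only for illustration):

```latex
\[
\frac{t_{\text{PC-SAFT}}}{t_{\text{surrogate}}}
  = \frac{5.1\ \text{s}}{0.003\ \text{s}}
  = 1700
\]
```

That is, the surrogate evaluation is roughly three orders of magnitude faster on the stated hardware.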

Section snippets

First principles modeling of phase equilibria

In this section, the phase equilibrium problem is defined and the solution procedure when applying equation of state models, e.g. PC-SAFT, is described. The results of this iterative procedure are used to provide data to train a surrogate model which computes the equilibrium by a simple function call.
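For orientation, the standard textbook conditions for a two-phase equilibrium computed with an equation of state can be written as follows. This is a generic sketch rather than the formulation used in the paper; the notation (phases I and II, feed composition z_i, phase fraction β, fugacity coefficients φ_i, molar density ρ, number of components N_c) is assumed here.

```latex
\begin{align}
x_i^{\mathrm{I}}\,\varphi_i^{\mathrm{I}}\!\left(T,p,\mathbf{x}^{\mathrm{I}}\right)
  &= x_i^{\mathrm{II}}\,\varphi_i^{\mathrm{II}}\!\left(T,p,\mathbf{x}^{\mathrm{II}}\right),
  & i &= 1,\dots,N_c \\
z_i &= (1-\beta)\,x_i^{\mathrm{I}} + \beta\,x_i^{\mathrm{II}},
  & i &= 1,\dots,N_c \\
\sum_{i=1}^{N_c} x_i^{\mathrm{I}} &= \sum_{i=1}^{N_c} x_i^{\mathrm{II}} = 1 \\
p &= p^{\mathrm{EoS}}\!\left(T,\rho^{\pi},\mathbf{x}^{\pi}\right),
  & \pi &\in \{\mathrm{I},\mathrm{II}\}
\end{align}
```

Each evaluation of a fugacity coefficient requires the density of the respective phase, which is obtained by solving the last equation for ρ at given T, p and composition. This density root search sits inside the iteration on the equilibrium conditions, which is why the calculations are embedded and costly.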

Sampling

Especially when model evaluations are expensive, the goal of a sampling procedure is to sample as few points as possible while still obtaining a model of good accuracy over the full range of inputs of interest. As the training locations have a strong effect on the accuracy of the surrogate model, finding the best sampling locations is an important aspect. A common approach is to equidistantly cover the complete input space, which is referred to as space-filling or exploratory

Surrogate models

Classifiers and regression models are commonly used to solve different problems in the field of chemical engineering. Examples of classification problems in this field include fault detection and diagnosis (Chiang et al., 2004; Onel et al., 2018) and drug design (Byvatov et al., 2003). The application of surrogate models for regression problems to reduce the computational effort is gaining increasing popularity (see Cremaschi, 2015). Henao and

Case study

As a case study, the process of the hydroformylation of 1-dodecene to the main product n-tridecanal has been chosen (Kiedorf et al., 2014). This process has been developed up to the technical realization in two miniplants in the Collaborative Research Center/Transregio 63 “Integrated Chemical Processes in Liquid Multiphase Systems” (InPROMPT). Two different strategies of tunable solvent systems have been pursued. The reaction has been performed in a microemulsion process by employing surfactants

Results

In this section, the mixed adaptive sampling algorithm is applied to the case study of the ternary LLE of 1-dodecene, n-decane and DMF. The choice of the sampling parameters, namely the number of subsets NSS (Section 6.1.1) and the selection factor SF (Section 6.1.2), is discussed first. The obtained models are compared in Section 6.2 with models that are based on a conventional LHS design of the same size to show the benefits of the sequential sampling approach. In order to analyze the

Conclusions

As equation of state models such as PC-SAFT are often not directly applicable in process optimization due to the computational expense of the iterative computations, this work aims at replacing the expensive thermodynamic model calls by explicit computations of surrogate models. In order to combine explorative and exploitative sampling objectives to find the best sample locations, the mixed adaptive sampling approach by Eason and Cremaschi (2014) has been extended to a novel surrogate modeling

Acknowledgment

This work is part of the Collaborative Research Center/Transregio 63 “Integrated Chemical Processes in Liquid Multiphase Systems” (subproject D1). Financial support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) is gratefully acknowledged (TRR 63).

The authors also thank Roderich Wallrath, Clemens Lindscheid, Maximilian Cegla, Radoslav Paulen, Marina Rantanen Modéer, Simon Wenzel, Shreya Bhatia and Anoj-Winston Gladius for their support.

References (62)

B. Beykal et al. Optimal design of energy systems using constrained grey-box multi-objective optimization. Comput. Chem. Eng. (2018)
A. Bhosekar et al. Advances in surrogate based modeling, feasibility analysis, and optimization: a review. Comput. Chem. Eng. (2018)
J. Boston et al. A radically different formulation and solution of the single-stage flash problem. Comput. Chem. Eng. (1978)
L.H. Chiang et al. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput. Chem. Eng. (2004)
S. Cremaschi. A perspective on process synthesis: challenges and prospects. Comput. Chem. Eng. (2015)
K. Crombecq et al. Efficient space-filling and non-collapsing sequential design strategies for simulation-based modeling. Eur. J. Oper. Res. (2011)
J. Eason et al. Adaptive sequential sampling for surrogate model generation with artificial neural networks. Comput. Chem. Eng. (2014)
S.S. Garud et al. Smart sampling algorithm for surrogate model development. Comput. Chem. Eng. (2017)
B. Hentschel et al. Simultaneous design of the optimal reaction and process concept for multiphase systems. Chem. Eng. Sci. (2014)
R. Hernández et al. Modelling and iterative real-time optimization of a homogeneously catalyzed hydroformylation process. Comput. Aided Chem. Eng. (2016)
N.M. Kaiser et al. Probabilistic reactor design in the framework of elementary process functions. Comput. Chem. Eng. (2016)
T. Keßler et al. Global optimization of distillation columns using explicit and implicit surrogate models. Chem. Eng. Sci. (2019)
T. Keßler et al. Efficient global optimization of a novel hydroformylation process. Comput. Aided Chem. Eng. (2017)
G. Kiedorf et al. Kinetics of 1-dodecene hydroformylation in a thermomorphic solvent system using a rhodium-biphephos catalyst. Chem. Eng. Sci. (2014)
M. Leesley et al. The dynamic approximation method of handling vapor-liquid equilibrium data in computer calculations for chemical processes. Comput. Chem. Eng. (1977)
K. McBride et al. Thermomorphic solvent selection for homogeneous catalyst recovery based on COSMO-RS. Chem. Eng. Process. (2016)
D. Müller et al. Dynamic real-time optimization under uncertainty of a hydroformylation mini-plant. Comput. Chem. Eng. (2017)
E. Schäfer et al. Calculation of complex phase equilibria of DMF/alkane systems using the PCP-SAFT equation of state. Chem. Eng. Sci. (2014)
F. Tumakaka et al. J. Supercrit. Fluids (2013)
C. Vogelpohl et al. High-pressure gas solubility in multicomponent solvent systems for hydroformylation. Part II: syngas solubility. J. Supercrit. Fluids (2014)
M. Zagajewski et al. Rhodium catalyzed hydroformylation of 1-dodecene using an advanced solvent system: towards highly efficient catalyst recycling. Chem. Eng. Process. (2016)
J.A. Barker et al. Perturbation theory and equation of state for fluids. II. A successful theory of liquids. J. Chem. Phys. (1967)
C. Bischof et al. Combining source transformation and operator overloading techniques to compute derivatives for MATLAB programs. Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation (2002)
C.M. Bishop. Pattern Recognition and Machine Learning (2006)
Y. Brunsch. Temperaturgesteuertes Katalysatorrecycling für die homogen katalysierte Hydroformylierung langkettiger Alkene (2013)
E. Byvatov et al. Comparison of support vector machine and artificial neural network systems for drug/nondrug classification. J. Chem. Inf. Comput. Sci. (2003)
J.A. Caballero et al. An algorithm for the use of surrogate models in modular flowsheet optimization. AIChE J. (2008)
E.H. Chimowitz et al. Local models for representing phase equilibria in multicomponent, nonideal vapor-liquid and liquid-liquid systems. 1. Thermodynamic approximation functions. Ind. Eng. Chem. Process Des. Dev. (1983)
E.H. Chimowitz et al. Local models for representing phase equilibria in multicomponent, non-ideal vapor-liquid and liquid-liquid systems. 2. Application to process design. Ind. Eng. Chem. Process Des. Dev. (1984)
A. Cozad et al. Learning surrogate models for simulation-based optimization. AIChE J. (2014)
