Surrogate modeling of phase equilibrium calculations using adaptive sampling
Introduction
In computer-based process optimization, the reliability of the optimization result depends on the quality of the process model. In order to obtain an accurate representation of the process, models based on first principles are usually preferred.
In the modeling of chemical processes, phase equilibria play an important role. For example, the solubility of a feed material in the reaction solution significantly influences the speed of reaction, and the accurate computation of the composition of the vapor and liquid phases in equilibrium is fundamental to the modeling of distillation columns.
For phase equilibrium calculations, activity coefficient models or equation of state models can be employed. Activity coefficient models require less computational effort, but are not applicable at elevated pressures, close to critical temperatures, and for multi-component systems. For such systems, equation of state models should be preferred (Merchan and Wozny, 2016; Schäfer, Sadowski and Enders, 2014). For complex phase systems, advanced equation of state models such as the PC-SAFT model are suitable for accurate predictions over a broad range of operating conditions. The PC-SAFT model has been applied to a wide range of different systems (Kleiner, Tumakaka and Sadowski, 2009; Kontogeorgis and Folas, 2010; Tumakaka, Gross and Sadowski, 2005). However, in order to solve phase equilibria using equation of state models, the density root problem must be solved and the phase equilibrium conditions must be fulfilled, which requires embedded iterative calculations that lead to a significant computational effort. This makes these advanced thermodynamic models difficult to use for process optimization.
In order to overcome this issue, the surrogate modeling methodology can be applied. Surrogate modeling is understood here as replacing a complex model by a simpler black-box model. One can distinguish between two different classes of surrogate modeling problems: classification problems with two or more discrete outputs and regression problems with one or more continuous outputs that are approximated.
As the quality of the surrogate model depends on the choice of the training points, sampling is an important aspect of the surrogate modeling process. Since sampling involves computationally intensive evaluations of the original function, the sampling objective is to sample as few points as possible while gaining as much information as possible on the modeled phenomenon.
There are two main approaches to sampling: sampling once (one-shot) or adaptive sampling. In many applications, one-shot space-filling designs, such as Latin hypercube sampling (LHS), Monte Carlo sampling or Halton sequences, are used to fit surrogate models. However, the modeled quantities often exhibit complex responses, such as discontinuities or strong curvature in specific regions of the input space, and space-filling designs often do not approximate such structures well. For these cases, adaptive or sequential sampling designs can be used. In this context, the terms exploration (sampling in a space-filling manner) and exploitation (denser sampling in regions where the original function behaves in a complex way) are used. The trade-off between exploration and exploitation is a recent field of research. Cozad et al. (2014) and Kleijnen and Van Beers (2004) focus on exploitation. The mixed adaptive sampling approach proposed by Eason and Cremaschi (2014), which was further improved by Jin et al. (2016), uses two equally weighted sampling objectives, while the recent work of Garud et al. (2017) proposes adaptive weighting of the two sampling objectives.
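The exploration/exploitation trade-off described above can be illustrated with a minimal Python sketch. It is not the algorithm of Eason and Cremaschi (2014) or of this work: the function names (`latin_hypercube`, `next_sample`), the exploitation proxy (spread of the response among the three nearest samples) and the toy step response are all assumptions made for illustration.

```python
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """One-shot space-filling design on the unit cube [0, 1)^n_dims."""
    columns = []
    for _ in range(n_dims):
        # One random value inside each of n_points equal-width strata ...
        strata = [(i + rng.random()) / n_points for i in range(n_points)]
        # ... shuffled so that strata are paired randomly across dimensions.
        rng.shuffle(strata)
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def normalise(values):
    """Scale a list of scores linearly to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def next_sample(candidates, samples, responses):
    """Pick the candidate with the highest equally weighted mixed score."""
    explore, exploit = [], []
    for c in candidates:
        order = sorted(range(len(samples)),
                       key=lambda j: math.dist(c, samples[j]))
        # Exploration: distance to the closest existing sample.
        explore.append(math.dist(c, samples[order[0]]))
        # Exploitation (assumed proxy): response spread among the
        # three nearest samples, large near steps or steep regions.
        local = [responses[j] for j in order[:3]]
        exploit.append(max(local) - min(local))
    scores = [0.5 * a + 0.5 * b
              for a, b in zip(normalise(explore), normalise(exploit))]
    return candidates[scores.index(max(scores))]


rng = random.Random(0)
samples = latin_hypercube(8, 2, rng)
# Toy response with a step at x0 = 0.5, standing in for a phase boundary.
responses = [1.0 if x[0] > 0.5 else 0.0 for x in samples]
candidates = latin_hypercube(50, 2, rng)
new_point = next_sample(candidates, samples, responses)
```

In an adaptive loop, the expensive model would be evaluated at `new_point`, the result appended to `samples` and `responses`, and the selection repeated. Replacing the fixed weights of 0.5 with values that change over the iterations would correspond to the adaptive weighting idea mentioned above.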
Prior to sampling, the input and output variables of the surrogate modeling problem have to be defined. For a chemical system, the overall composition, temperature and pressure determine the number of phases and their compositions. Using a regression model for the phase compositions of a mixture over a range in which the transition between a one-phase and a two-phase mixture occurs leads to a discontinuity in the output space. In general, discontinuities in the output space are hard to approximate accurately with surrogate models. Restricting the valid input range to a region that is always biphasic, as in other work (Nentwich and Engell, 2016), avoids discontinuities in the output space, but overconstrains the operating range, which is not desirable in process optimization.
To cope with this problem, a novel surrogate modeling strategy for phase equilibria is investigated in this contribution. In a first step, a classifier is trained on the obtained data in order to identify the biphasic region. In a second step, regression models are used to model the compositions of the two phases if the point is classified as being biphasic. This approach makes the use of equation of state models possible within chemical process simulation and optimization, since the frequent, computationally expensive solution of the phase equilibrium is avoided. To obtain accurate classifier and regression models without calling the original thermodynamic model extensively for sampling, a mixed adaptive sampling scheme is applied.
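The two-step structure (classify first, regress only inside the biphasic region) can be sketched as follows. This is a minimal illustration under stated assumptions: a 1-nearest-neighbour rule stands in for both the classifier and the regression model, and `toy_equilibrium` is a made-up function with an invented miscibility gap that replaces the PC-SAFT calculation. None of these names or model choices come from the source.

```python
import math


def toy_equilibrium(z, t):
    """Made-up stand-in for the rigorous model (NOT PC-SAFT).

    Returns the compositions of the two phases if the feed z lies inside
    a hypothetical miscibility gap that narrows with temperature t,
    otherwise None (single phase).
    """
    half_width = 0.4 - 0.3 * t
    low, high = 0.5 - half_width, 0.5 + half_width
    return (low, high) if low < z < high else None


def fit_two_stage(points, outputs):
    """Store training data for the classifier and the regressor."""
    labelled = [(p, y is not None) for p, y in zip(points, outputs)]
    biphasic = [(p, y) for p, y in zip(points, outputs) if y is not None]
    return {"labelled": labelled, "biphasic": biphasic}


def predict_two_stage(model, x):
    """Stage 1: classify. Stage 2: regress only if classified biphasic."""
    _, is_biphasic = min(model["labelled"],
                         key=lambda item: math.dist(x, item[0]))
    if not is_biphasic:
        return None  # one phase: there is no split to compute
    _, compositions = min(model["biphasic"],
                          key=lambda item: math.dist(x, item[0]))
    return compositions


# Training data on a coarse grid of feed composition z and temperature t.
points = [(i / 10, t) for i in range(11) for t in (0.0, 0.5, 1.0)]
outputs = [toy_equilibrium(z, t) for z, t in points]
model = fit_two_stage(points, outputs)

inside = predict_two_stage(model, (0.52, 0.48))   # near the gap centre
outside = predict_two_stage(model, (0.02, 0.10))  # near the pure component
```

Because the regressor is only trained on and queried at biphasic points, it never has to represent the jump at the phase boundary; the classifier carries that information instead.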
As a case study, the hydroformylation of 1-dodecene performed in a thermomorphic solvent system is examined. The proposed surrogate modeling strategy is demonstrated for the liquid-liquid equilibrium of the ternary system n-decane, dimethylformamide and 1-dodecene, which is calculated using the PC-SAFT equation of state model. For the considered case study, the mean computation time was 5.1 seconds per phase equilibrium calculation using PC-SAFT, which could be reduced to 0.003 seconds by applying the developed surrogate models on a standard computer (Windows 7, 3.6 GHz dual core Intel(R) i7, 18 GB RAM).
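As a quick check of what these timings imply, the speed-up factor follows directly from the two reported mean computation times:

```latex
\[
\frac{t_{\mathrm{PC\text{-}SAFT}}}{t_{\mathrm{surrogate}}}
  = \frac{5.1\,\mathrm{s}}{0.003\,\mathrm{s}} = 1700
\]
```

For a hypothetical optimization run requiring 10,000 equilibrium evaluations (an assumed count, not a figure from the source), this corresponds to roughly 51,000 s (about 14 h) with PC-SAFT versus 30 s with the surrogate models.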
Section snippets
First principles modeling of phase equilibria
In this section, the phase equilibrium problem is defined and the solution procedure for equation of state models, e.g., PC-SAFT, is described. The results of this iterative procedure provide the data used to train a surrogate model which computes the equilibrium by a simple function call.
Sampling
Especially when model evaluations are expensive, the goal of a sampling procedure is to sample as few points as possible while still obtaining a model of good accuracy over the full range of inputs of interest. As the training locations have a strong effect on the accuracy of the surrogate model, finding the best sampling locations is an important aspect. A common approach is to cover the complete input space equidistantly, which is referred to as space-filling or exploratory
Surrogate models
Classifiers and regression models are commonly used to solve different problems in the field of chemical engineering. Examples of classification problems in this field are fault detection and diagnosis (Chiang, Kotanchek and Kordon, 2004; Onel, Kieslich, Guzman, Floudas and Pistikopoulos, 2018) and drug design (Byvatov et al., 2003). The application of surrogate models for regression problems to reduce the computational effort is gaining increasing popularity (see Cremaschi, 2015). Henao and
Case study
As a case study, the hydroformylation of 1-dodecene to the main product n-tridecanal has been chosen (Kiedorf et al., 2014). This process has been developed up to technical realization in two miniplants within the Collaborative Research Center/Transregio 63 “Integrated Chemical Processes in Liquid Multiphase Systems” (InPROMPT). Two different strategies of tunable solvent systems have been pursued. The reaction has been performed in a microemulsion process by employing surfactants
Results
In this section, the mixed adaptive sampling algorithm is applied to the case study of the ternary LLE of 1-dodecene, n-decane and DMF. The choice of the sampling parameters, namely the number of subsets NSS (Section 6.1.1) and the selection factor SF (Section 6.1.2), is discussed first. In Section 6.2, the obtained models are compared with models based on a conventional LHS design of the same size to show the benefits of the sequential sampling approach. In order to analyze the
Conclusions
As equation of state models such as PC-SAFT are often not directly applicable in process optimization due to the computational expense of the iterative computations, this work aims at replacing the expensive thermodynamic model calls by explicit computations of surrogate models. In order to combine explorative and exploitative sampling objectives to find the best sample locations, the mixed adaptive sampling approach by Eason and Cremaschi (2014) has been extended to a novel surrogate modeling
Acknowledgment
This work is part of the Collaborative Research Center/Transregio 63 “Integrated Chemical Processes in Liquid Multiphase Systems” (subproject D1). Financial support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) is gratefully acknowledged (TRR 63).
The authors also thank Roderich Wallrath, Clemens Lindscheid, Maximilian Cegla, Radoslav Paulen, Marina Rantanen Modéer, Simon Wenzel, Shreya Bhatia and Anoj-Winston Gladius for their support.
References (62)
- et al.
Optimal design of energy systems using constrained grey-box multi-objective optimization
Comput. Chem. Eng.
(2018)
- et al.
Advances in surrogate based modeling, feasibility analysis, and optimization: a review
Comput. Chem. Eng.
(2018)
- et al.
A radically different formulation and solution of the single-stage flash problem
Comput. Chem. Eng.
(1978)
- et al.
Fault diagnosis based on Fisher discriminant analysis and support vector machines
Comput. Chem. Eng.
(2004)
A perspective on process synthesis: challenges and prospects
Comput. Chem. Eng.
(2015)
- et al.
Efficient space-filling and non-collapsing sequential design strategies for simulation-based modeling
Eur. J. Oper. Res.
(2011)
- et al.
Adaptive sequential sampling for surrogate model generation with artificial neural networks
Comput. Chem. Eng.
(2014)
- et al.
Smart sampling algorithm for surrogate model development
Comput. Chem. Eng.
(2017)
- et al.
Simultaneous design of the optimal reaction and process concept for multiphase systems
Chem. Eng. Sci.
(2014)
- et al.
Modelling and iterative real-time optimization of a homogeneously catalyzed hydroformylation process
Comput. Aided Chem. Eng.
(2016)
Probabilistic reactor design in the framework of elementary process functions
Comput. Chem. Eng.
Global optimization of distillation columns using explicit and implicit surrogate models
Chem. Eng. Sci.
Efficient global optimization of a novel hydroformylation process
Comput. Aided Chem. Eng.
Kinetics of 1-dodecene hydroformylation in a thermomorphic solvent system using a rhodium-biphephos catalyst
Chem. Eng. Sci.
The dynamic approximation method of handling vapor-liquid equilibrium data in computer calculations for chemical processes
Comput. Chem. Eng.
Thermomorphic solvent selection for homogeneous catalyst recovery based on COSMO-RS
Chem. Eng. Process.
Dynamic real-time optimization under uncertainty of a hydroformylation mini-plant
Comput. Chem. Eng.
Calculation of complex phase equilibria of DMF/alkane systems using the PCP-SAFT equation of state
Chem. Eng. Sci.
Thermodynamic modeling of complex systems using PC-SAFT
Fluid Phase Equilibria
High-pressure gas solubility in multicomponent solvent systems for hydroformylation. Part I: carbon monoxide solubility
J. Supercrit. Fluids
High-pressure gas solubility in multicomponent solvent systems for hydroformylation. Part II: syngas solubility
J. Supercrit. Fluids
Rhodium catalyzed hydroformylation of 1-dodecene using an advanced solvent system: towards highly efficient catalyst recycling
Chem. Eng. Process.
Perturbation theory and equation of state for fluids. II. A successful theory of liquids
J. Chem. Phys.
Combining source transformation and operator overloading techniques to compute derivatives for MATLAB programs
Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation
Pattern Recognition and Machine Learning
Temperaturgesteuertes Katalysatorrecycling für die homogen katalysierte Hydroformylierung langkettiger Alkene
Comparison of support vector machine and artificial neural network systems for drug/nondrug classification
J. Chem. Inf. Comput. Sci.
An algorithm for the use of surrogate models in modular flowsheet optimization
AIChE J.
Local models for representing phase equilibria in multicomponent, non-ideal vapor-liquid and liquid-liquid systems. 1. Thermodynamic approximation functions
Ind. Eng. Chem. Process Des. Dev.
Local models for representing phase equilibria in multicomponent, non-ideal vapor-liquid and liquid-liquid systems. 2. Application to process design
Ind. Eng. Chem. Process Des. Dev.
Learning surrogate models for simulation-based optimization
AIChE J.
Cited by (31)
Iterative real-time optimization of a reductive amination process in a thermomorphic multiphase system
2024, Chemical Engineering Science
Convex Envelope Method for determining liquid multi-phase equilibria in systems with arbitrary number of components
2023, Computers and Chemical Engineering
Evaluating the Impact of Model Uncertainties in Superstructure Optimization to Reduce the Experimental Effort
2022, Computer Aided Chemical Engineering
Citation excerpt: The gas solubilities as well as the phase separation are predicted using the equation of state PC-SAFT. As the iterative solution of the PC-SAFT equations is not feasible in the optimization, surrogate models were trained as proposed in (Nentwich & Engell, 2019). The membrane separation is modelled using a solution-diffusion model.
ANN-assisted optimization-based design of energy-integrated distillation columns
2022, Computer Aided Chemical Engineering