Elsevier

Computers & Chemical Engineering

Volume 126, 12 July 2019, Pages 204-217
Surrogate modeling of phase equilibrium calculations using adaptive sampling

https://doi.org/10.1016/j.compchemeng.2019.04.006

Highlights

  • Expensive phase equilibrium calculations can efficiently be replaced.

  • Combination of classifier and regression models is a suitable strategy.

  • Mixed adaptive sampling is superior to Latin hypercube sampling design.

  • Optimal sampling parameters are determined by model training effort and model performance.

  • Cross-validation errors can be used as a stop criterion.

Abstract

Equation of state models such as the Perturbed-Chain Statistical Associating Fluid Theory (PC-SAFT) model are accurate and reliable prediction models for phase equilibria. However, because they must be solved iteratively, their long computation times make them difficult to apply in chemical process optimization. To overcome this issue, surrogate modeling, i.e., replacing a complex model by a black-box model, can be used. A novel surrogate modeling strategy for phase equilibria is presented that combines the training of a classifier model with regression models for the phase composition using a mixed adaptive sampling method. We discuss in detail the selection of the parameters of the sampling algorithm and a suitable stop criterion for the example ternary liquid-liquid equilibrium system of n-decane, dimethylformamide and 1-dodecene. The sequential mixed adaptive sampling method is compared to the one-shot Latin hypercube sampling design.

Introduction

In computer-based process optimization, the reliability of the optimization result depends on the quality of the process model. In order to obtain an accurate representation of the process, models based on first principles are usually preferred.

In the modeling of chemical processes, phase equilibria play an important role. For example, the solubility of a feed material in the reaction solution significantly influences the speed of reaction, and the accurate computation of the composition of the vapor and liquid phases in equilibrium is fundamental to the modeling of distillation columns.

For phase equilibrium calculations, activity coefficient models or equation of state models can be employed. Activity coefficient models require less computational effort, but are not applicable at elevated pressures, close to critical temperatures, or for multi-component systems. For such systems, equation of state models should be preferred (Merchan and Wozny, 2016; Schäfer et al., 2014). For complex phase systems, advanced equation of state models such as the PC-SAFT model are suitable for accurate predictions over a broad range of operating conditions. The PC-SAFT model has been applied to a wide range of different systems (Kleiner et al., 2009; Kontogeorgis and Folas, 2010; Tumakaka et al., 2005). However, in order to solve phase equilibria using equation of state models, the density root problem must be solved and the phase equilibrium conditions must be fulfilled, which requires embedded calculations that lead to a significant computational effort. This makes these advanced thermodynamic models difficult to use for process optimization.
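For orientation, the two embedded problems can be written in a generic textbook form. The notation below (phase labels I and II, fugacity coefficients \varphi_i, N components) is an assumption made for illustration and is not taken from the article.

```latex
% Inner problem (density root): for given T, p and composition x,
% find the molar density \rho that reproduces the specified pressure.
p^{\mathrm{EoS}}(T, \rho, \mathbf{x}) = p

% Outer problem (liquid-liquid equilibrium): equal fugacities of every
% component i in both liquid phases I and II at the same T and p.
x_i^{\mathrm{I}} \, \varphi_i^{\mathrm{I}}(T, p, \mathbf{x}^{\mathrm{I}})
  = x_i^{\mathrm{II}} \, \varphi_i^{\mathrm{II}}(T, p, \mathbf{x}^{\mathrm{II}}),
  \qquad i = 1, \dots, N
```

Each evaluation of a fugacity coefficient in the outer iteration needs the density from the inner one, which is why the two iterations are embedded in each other.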

In order to overcome this issue, the surrogate modeling methodology can be applied. Surrogate modeling is understood here as replacing a complex model by a simpler black-box model. One can distinguish between two different classes of surrogate modeling problems: classification problems with two or more discrete outputs and regression problems with one or more continuous outputs that are approximated.

As the quality of the surrogate model depends on the choice of the training points, sampling is an important aspect of the surrogate modeling process. Since sampling involves computationally intensive evaluations of the original function, the sampling objective is to sample as few points as possible while gaining as much information as possible on the modeled phenomenon.
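The highlights note that cross-validation errors can serve as a stop criterion for sampling. The following is a minimal sketch of that idea, not the authors' implementation: the function names, the toy 1-nearest-neighbour regressor, the fold assignment and the tolerance are all assumptions chosen for illustration.

```python
import math


def k_fold_rmse(xs, ys, fit, k):
    """Root-mean-square k-fold cross-validation error of `fit` on (xs, ys)."""
    n = len(xs)
    squared_error = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th point forms one fold
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        squared_error += sum((model(xs[i]) - ys[i]) ** 2 for i in held_out)
    return math.sqrt(squared_error / n)


def fit_nearest(train_x, train_y):
    """Toy 1-nearest-neighbour regressor standing in for a real surrogate."""
    def predict(x):
        j = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[j]
    return predict


def should_stop(xs, ys, tolerance, k=5):
    """Hypothetical stop rule: stop sampling once the CV error is small enough."""
    return k_fold_rmse(xs, ys, fit_nearest, k) <= tolerance
```

In a sequential sampling loop, `should_stop` would be checked after each batch of new points, so that no further expensive model evaluations are requested once the cross-validation error falls below the chosen tolerance.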

There are two main approaches to sampling: sampling once (one-shot) or adaptive sampling. In many applications, one-shot space-filling designs, such as Latin hypercube sampling (LHS), Monte Carlo sampling or Halton sequences, are used to fit surrogate models. However, the modeled quantities often exhibit complex responses, such as discontinuities or strong curvature in specific regions of the input space, and such structures are often not captured well by space-filling designs. For these cases, adaptive or sequential sampling designs can be used. In this context, two terms are used: exploration, meaning sampling in a space-filling manner, and exploitation, meaning denser sampling in regions where the original function behaves in a complex way. The trade-off between exploration and exploitation is an active field of research. Cozad et al. (2014) and Kleijnen and Van Beers (2004) focus on exploitation. The mixed adaptive sampling approach proposed in Eason and Cremaschi (2014), which was further improved by Jin et al. (2016), uses two equally weighted sampling objectives, while the recent work of Garud et al. (2017) proposes adaptive weighting of the two sampling objectives.
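To make the distinction concrete, the sketch below contrasts a one-shot Latin hypercube design with a single step of a sequential scheme that weights exploration and exploitation equally. It is an illustration only and not the algorithm of Eason and Cremaschi (2014): the exploration score (distance to the nearest existing sample), the user-supplied `local_error` estimate, the min-max normalisation and all function names are assumptions.

```python
import math
import random


def latin_hypercube(n_points, n_dims, rng):
    """One-shot space-filling design on the unit hypercube [0, 1)^n_dims."""
    columns = []
    for _ in range(n_dims):
        # One random point inside each of n_points equal-width strata ...
        strata = [(i + rng.random()) / n_points for i in range(n_points)]
        # ... shuffled so the strata are paired randomly across dimensions.
        rng.shuffle(strata)
        columns.append(strata)
    return [tuple(col[i] for col in columns) for i in range(n_points)]


def normalise(scores):
    """Scale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def pick_next_sample(candidates, samples, local_error):
    """Return the candidate with the highest equally weighted mixed score.

    exploration:  distance to the nearest existing sample (space filling)
    exploitation: assumed estimate of the local surrogate error
    """
    exploration = normalise(
        [min(math.dist(c, s) for s in samples) for c in candidates]
    )
    exploitation = normalise([local_error(c) for c in candidates])
    mixed = [0.5 * a + 0.5 * b for a, b in zip(exploration, exploitation)]
    return candidates[mixed.index(max(mixed))]
```

A typical use would draw a small initial design with `latin_hypercube`, evaluate the expensive model there, and then call `pick_next_sample` repeatedly on a pool of candidate points, refitting the surrogate after each new evaluation.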

Prior to sampling, the input and output variables of the surrogate modeling problem have to be defined. For a chemical system, the composition, temperature and pressure define the number of phases and their compositions. Modeling the phase composition of a mixture with a regression model in the range where the transition between a one-phase and a two-phase mixture occurs leads to a discontinuity in the output space. In general, discontinuities in the output space are hard to approximate accurately with surrogate models. Restricting the valid input range to a region that is always biphasic, as in other work (Nentwich and Engell, 2016), avoids discontinuities in the output space but overconstrains the operating range, which is not desirable in process optimization.

To cope with this problem, a novel surrogate modeling strategy for phase equilibria is investigated in this contribution. In a first step, a classifier is trained on the obtained data in order to identify the biphasic region. In a second step, regression models are used to model the compositions of the two phases if the point is classified as being biphasic. This approach makes the use of equation of state models possible within chemical process simulation and optimization, since the frequent, computationally expensive solution of the phase equilibrium is avoided. To obtain accurate classifier and regression models without calling the original thermodynamic model extensively for sampling, a mixed adaptive sampling scheme is applied.
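The two-step structure described above can be sketched as follows. This is a minimal illustration under stated assumptions: a 1-nearest-neighbour lookup stands in for whatever classifier and regression models are actually trained, the function names are hypothetical, and any data passed in would have to come from the thermodynamic model.

```python
import math


def nearest_neighbour(table):
    """Return a 1-nearest-neighbour lookup; a stand-in for any trained model."""
    def predict(x):
        return table[min(table, key=lambda p: math.dist(p, x))]
    return predict


def make_surrogate(labels, phase_one, phase_two):
    """Combine a phase-count classifier with two composition regressors.

    labels:    input point -> True if the mixture is biphasic there
    phase_one: biphasic input point -> composition of the first phase
    phase_two: biphasic input point -> composition of the second phase
    """
    classify = nearest_neighbour(labels)
    regress_one = nearest_neighbour(phase_one)
    regress_two = nearest_neighbour(phase_two)

    def surrogate(x):
        # Step 1: the classifier decides whether the point is biphasic.
        if not classify(x):
            return None  # single phase: no split to predict
        # Step 2: the regressors are only queried inside the biphasic region.
        return regress_one(x), regress_two(x)

    return surrogate
```

Because the regressors are trained and evaluated only on biphasic points, they never have to represent the discontinuity at the phase boundary; that job is left to the classifier.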

As a case study, the process of the hydroformylation of 1-dodecene performed in a thermomorphic solvent system is examined. The proposed surrogate modeling strategy is demonstrated for the modeling of the liquid-liquid equilibrium of the ternary system n-decane, dimethylformamide and 1-dodecene, which is calculated using the PC-SAFT equation of state model. For the considered case study, the mean computation time per phase equilibrium calculation was 5.1 seconds using PC-SAFT, which was reduced to 0.003 seconds (a speed-up by a factor of about 1700) by applying the developed surrogate models on a standard computer (Windows 7, 3.6 GHz dual core Intel(R) i7, 18 GB RAM).

Section snippets

First principles modeling of phase equilibria

In this section, the phase equilibrium problem is defined and the solution procedure when applying equation of state models, e.g., PC-SAFT, is described. The results of this iterative procedure provide the data used to train a surrogate model, which computes the equilibrium by a simple function call.

Sampling

Especially for the case of expensive model evaluations, the goal of devising a sampling procedure is to sample as few points as possible but nonetheless get a model of good accuracy over the full range of inputs of interest. As the training locations have a strong effect on the accuracy of the surrogate model, finding the best sampling locations is an important aspect. A common approach is to equidistantly cover the complete input-space which is referred to as space-filling or exploratory

Surrogate models

Classifiers and regression models are commonly used to solve different problems in the field of chemical engineering. Examples of classification problems in this field include fault detection and diagnosis (Chiang et al., 2004; Onel et al., 2018) and drug design (Byvatov et al., 2003). The application of surrogate models for regression problems to reduce the computational effort is gaining increasing popularity (see Cremaschi, 2015). Henao and

Case study

As a case study, the process of the hydroformylation of 1-dodecene to the main product n-tridecanal has been chosen (Kiedorf et al., 2014). This process has been developed up to the technical realization in two miniplants in the Collaborative Research Center/Transregio 63 “Integrated Chemical Processes in Liquid Multiphase Systems” (InPROMPT). Two different strategies of tunable solvent systems have been pursued. The reaction has been performed in a microemulsion process by employing surfactants

Results

In this section, the mixed adaptive sampling algorithm is applied to the case study of the ternary LLE of 1-dodecene, n-decane and DMF. The choice of the sampling parameters, namely the number of subsets NSS (Section 6.1.1) and the selection factor SF (Section 6.1.2), is discussed first. In Section 6.2, the obtained models are compared with models based on a conventional LHS design of the same size in order to assess the benefits of the sequential sampling approach. In order to analyze the

Conclusions

As equation of state models such as PC-SAFT are often not directly applicable in process optimization due to the computational expense of the iterative computations, this work aims at replacing the expensive thermodynamic model calls by explicit computations of surrogate models. In order to combine explorative and exploitative sampling objectives to find the best sample locations, the mixed adaptive sampling approach by Eason and Cremaschi (2014) has been extended to a novel surrogate modeling

Acknowledgment

This work is part of the Collaborative Research Center/Transregio 63 “Integrated Chemical Processes in Liquid Multiphase Systems” (subproject D1). Financial support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) is gratefully acknowledged (TRR 63).

The authors also thank Roderich Wallrath, Clemens Lindscheid, Maximilian Cegla, Radoslav Paulen, Marina Rantanen Modéer, Simon Wenzel, Shreya Bhatia and Anoj-Winston Gladius for their support.

References (62)

  • N.M. Kaiser et al.

    Probabilistic reactor design in the framework of elementary process functions

    Comput. Chem. Eng.

    (2016)
  • T. Keßler et al.

    Global optimization of distillation columns using explicit and implicit surrogate models

    Chem. Eng. Sci.

    (2019)
  • T. Keßler et al.

    Efficient global optimization of a novel hydroformylation process

    Comput. Aided Chem. Eng.

    (2017)
  • G. Kiedorf et al.

    Kinetics of 1-dodecene hydroformylation in a thermomorphic solvent system using a rhodium-biphephos catalyst

    Chem. Eng. Sci.

    (2014)
  • M. Leesley et al.

    The dynamic approximation method of handling vapor-liquid equilibrium data in computer calculations for chemical processes

    Comput. Chem. Eng.

    (1977)
  • K. McBride et al.

    Thermomorphic solvent selection for homogeneous catalyst recovery based on COSMO-RS

    Chem. Eng. Process.

    (2016)
  • D. Müller et al.

    Dynamic real-time optimization under uncertainty of a hydroformylation mini-plant

    Comput. Chem. Eng.

    (2017)
  • E. Schäfer et al.

    Calculation of complex phase equilibria of DMF/alkane systems using the PCP-SAFT equation of state

    Chem. Eng. Sci.

    (2014)
  • F. Tumakaka et al.

    Thermodynamic modeling of complex systems using PC-SAFT

    Fluid Phase Equilibria

    (2005)
  • C. Vogelpohl et al.

    High-pressure gas solubility in multicomponent solvent systems for hydroformylation. Part I: carbon monoxide solubility

    J. Supercrit. Fluids

    (2013)
  • C. Vogelpohl et al.

    High-pressure gas solubility in multicomponent solvent systems for hydroformylation. Part II: syngas solubility

    J. Supercrit. Fluids

    (2014)
  • M. Zagajewski et al.

    Rhodium catalyzed hydroformylation of 1-dodecene using an advanced solvent system: towards highly efficient catalyst recycling

    Chem. Eng. Process.

    (2016)
  • J.A. Barker et al.

    Perturbation theory and equation of state for fluids. II. A successful theory of liquids

    J. Chem. Phys.

    (1967)
  • C. Bischof et al.

    Combining source transformation and operator overloading techniques to compute derivatives for MATLAB programs

    Proceedings. Second IEEE International Workshop on Source Code Analysis and Manipulation

    (2002)
  • C.M. Bishop

    Pattern Recognition and Machine Learning

    (2006)
  • Y. Brunsch

    Temperaturgesteuertes Katalysatorrecycling für die homogen katalysierte Hydroformylierung langkettiger Alkene

    (2013)
  • E. Byvatov et al.

    Comparison of support vector machine and artificial neural network systems for drug/nondrug classification

    J. Chem. Inf. Comput. Sci.

    (2003)
  • J.A. Caballero et al.

    An algorithm for the use of surrogate models in modular flowsheet optimization

    AIChE J.

    (2008)
  • E.H. Chimowitz et al.

    Local models for representing phase equilibria in multicomponent, nonideal vapor-liquid and liquid-liquid systems. 1. Thermodynamic approximation functions

    Ind. Eng. Chem. Process Des. Dev.

    (1983)
  • E.H. Chimowitz et al.

    Local models for representing phase equilibria in multicomponent, non-ideal vapor-liquid and liquid-liquid systems. 2. Application to process design

    Ind. Eng. Chem. Process Des. Dev.

    (1984)
  • A. Cozad et al.

    Learning surrogate models for simulation-based optimization

    AIChE J.

    (2014)