Modified genetic algorithm for nonlinear data reconciliation
Introduction
Process data measurements are generally corrupted with two types of errors, random and gross, causing violation of the process constraints defined by the mass and energy balances (Narasimhan & Jordache, 2000). These errors can be eliminated by techniques of data reconciliation and gross error detection to improve data accuracy. Data reconciliation (DR), first proposed by Kuehn and Davidson (1961), is an estimation technique for improving the accuracy of measurements so that they satisfy the process constraints. It exploits the redundancy of the measurements to reduce the effect of random errors in the data. The principal difference between data reconciliation and other filtering techniques is that data reconciliation explicitly uses the process model constraints, so that the estimates satisfy those constraints, while other techniques do not.
Various optimization techniques have been developed to solve DR problems. Lagrange multipliers combined with a projection matrix were used for linear and nonlinear steady state DR problems (Crowe, 1986; Crowe et al., 1983). Stephenson and Shewchuk (1986) used a Newton–Raphson iterative method based on a quasi-Newton linearization of a nonlinear model. MacDonald and Howat (1988) extended DR techniques to estimate process parameters using coupled and decoupled procedures. Pai and Fisher (1988) incorporated Broyden's method into the Gauss–Newton iterative algorithm (Knepper & Gorman, 1980). Tjoa and Biegler (1991) proposed an efficient hybrid successive quadratic programming algorithm to solve nonlinear DR problems. Narasimhan and Harikumar (1993) used an efficient quadratic programming algorithm for a linear steady state model that incorporated bounds into the problem. Sanchez and Romagnoli (1996) applied Q–R factorization to analyze, decompose and solve linear and bilinear reconciliation problems. The disadvantage of all these techniques is that the required derivatives of the model equations involve complex calculations, making the solution process complicated. Zhao and Jiang (1996) therefore proposed a stochastic search method for solving the linear steady state DR problem; its significant advantage is that it does not depend on any particular model structure and needs only simple algebraic calculations. A popular stochastic search method, the genetic algorithm (GA), is adopted in this work. GA, first invented by John Holland in the 1960s, is an optimization technique that parallels the concepts of natural selection and mutation. It mimics the process of biological evolution to find better solutions. GA is popular for optimization because it is less prone to becoming trapped in local optima and gives a greater chance of success for nonlinear optimization than conventional methods.
Also, GA can handle complicated problems with discontinuous and non-convex objective functions, since global or near-global optima are located without using derivatives.
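To make the mechanism concrete, the following sketch is a minimal real-coded GA with tournament selection, blend crossover, Gaussian mutation and elitism. It is an illustration of the general idea only, not the paper's algorithm; the test function, bounds and all parameter values are assumptions.

```python
import random

def genetic_minimize(f, lo, hi, pop_size=40, generations=80,
                     crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Minimal real-coded GA sketch: tournament selection, blend
    crossover, Gaussian mutation, and elitism (illustrative only)."""
    rng = random.Random(seed)
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    best = min(pop, key=f)
    for _ in range(generations):
        new_pop = [best]                       # elitism: keep best so far
        while len(new_pop) < pop_size:
            # tournament selection of two parents
            p1 = min(rng.sample(pop, 3), key=f)
            p2 = min(rng.sample(pop, 3), key=f)
            child = p1
            if rng.random() < crossover_rate:  # blend crossover
                a = rng.random()
                child = a * p1 + (1 - a) * p2
            if rng.random() < mutation_rate:   # Gaussian mutation
                child += rng.gauss(0, 0.1 * (hi - lo))
            new_pop.append(min(max(child, lo), hi))
        pop = new_pop
        best = min(pop + [best], key=f)
    return best

# example: minimize a simple nonlinear function on [-10, 10]
x_star = genetic_minimize(lambda x: (x - 3.0) ** 2, -10.0, 10.0)
```

Note that the objective `f` is only ever evaluated, never differentiated, which is why a GA can accommodate the discontinuous, non-convex objectives discussed above.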
The performance of GA depends on its evolution strategies, which have been discussed by many researchers. Ghoshray and Yen (1995) proposed a modified genetic algorithm (MGA), a hybrid of the simple genetic algorithm (SGA) and simulated annealing (SA), and found that it provides an efficient heuristic search for various optimization problems. Shimodaira (1997) proposed a diversity control oriented genetic algorithm (DCGA), which was remarkably superior to the simple GA in attaining the global optimum but required more calculation time in the selection stage. These merits of GA have encouraged its wide application to optimization problems. Moros et al. (1996) used GA to generate the initial parameters for a kinetic model of a catalytic process. Murata (1996) applied GA to flowshop scheduling problems. Wasanapradit (2000) proposed a variation of GA for solving mixed integer nonlinear programming (MINLP) problems. Later, Prakotpol and Srinophakun (2003) extended this application to the minimization of contaminants in wastewater.
Data reconciliation usually uses a weighted least squares objective function, based on the assumption that random errors are normally distributed with zero mean and known variance. However, this objective function can lead to incorrect estimates and severely biased reconciliation when the measured data contain gross errors. Many objective functions have therefore been developed that provide unbiased reconciled data even when the measurements contain gross errors. Yamamura, Nakajima, and Matsuyama (1988) proposed an objective function based on the Akaike information criterion to identify faulty instruments. Tjoa and Biegler (1991) developed a bivariate objective function that took into account contributions from both random and gross errors. Albuquerque and Biegler (1996) used a robust estimator, the fair function, which is resistant to deviations from the ideal error structure; the major advantage of such robust estimators is that they eliminate the combinatorial procedure for gross error detection. Arora and Biegler (2001) proposed a new form of the fair function, termed the redescending estimator, which nullifies the effect of large gross errors (outliers). They compared the performance of the least squares estimator, the fair function and the redescending estimator, and found the latter superior in outlier detection and robustness.
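The contrast between these estimators can be sketched numerically. The least squares and fair function costs below are standard; the redescending cost is a Hampel-type three-part form used here as an illustrative stand-in (the exact form and tuning constants of Arora and Biegler's estimator are not reproduced, and all constants are assumptions):

```python
import math

def rho_least_squares(e):
    """Least squares cost: grows without bound, so a single
    gross error can dominate the whole objective."""
    return e * e

def rho_fair(e, c=1.4):
    """Fair function, a robust M-estimator; the tuning constant c
    is an assumed value, not taken from the paper."""
    u = abs(e) / c
    return c * c * (u - math.log(1.0 + u))

def rho_redescending(e, a=1.0, b=2.0, c=4.0):
    """Hampel-type three-part redescending cost (illustrative):
    its influence drops to zero beyond c, so large outliers add
    only a constant penalty to the objective."""
    t = abs(e)
    if t <= a:
        return 0.5 * t * t
    if t <= b:
        return a * t - 0.5 * a * a
    if t <= c:
        return (a * b - 0.5 * a * a
                + a * (c * t - 0.5 * t * t - c * b + 0.5 * b * b) / (c - b))
    return a * b - 0.5 * a * a + 0.5 * a * (c - b)

# a 10-sigma gross error: the quadratic cost explodes,
# while the redescending cost saturates at a constant
print(rho_least_squares(10.0))           # 100.0
print(round(rho_redescending(10.0), 2))  # 2.5
```

The flat tail of the redescending cost is exactly what makes it non-convex and non-differentiable at the breakpoints, motivating a derivative-free solver such as GA.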
In this paper, GA is proposed as the method for handling the discontinuous and non-convex properties of the redescending estimator objective function in steady state nonlinear DR problems. In the next section, the properties of the weighted least squares and redescending estimators are discussed. The GA method is described in Section 3 (Modification of the genetic algorithm) and Section 4 (Implementation of GA to the DR problem). In Section 5, the effectiveness of the algorithm is demonstrated on an example problem. Finally, conclusions are presented in Section 6.
Formulation of the data reconciliation problem
The data reconciliation problem is an optimization problem, which can be explained using statistical theory. The general form of the DR problem is:

  min  F(x, x_m)
  x, u

subject to

  f(x, u) = 0
  g(x, u) ≤ 0
  x^L ≤ x ≤ x^U,  u^L ≤ u ≤ u^U

where F is the objective function, x_m the measurement data for the corresponding estimate variable x, u the set of unmeasured variables, f the set of equality constraints, g the set of inequality constraints, and the superscripts L and U the lower and upper bounds of the variables, respectively.
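For the special case of a linear balance with all variables measured and a weighted least squares objective, the DR problem has a classical closed-form solution, x̂ = x_m − ΣAᵀ(AΣAᵀ)⁻¹Ax_m, which the sketch below implements for a single constraint row. The flows and variances are hypothetical values chosen for illustration:

```python
# Closed-form weighted least squares reconciliation for one linear
# balance a1*x1 + a2*x2 + a3*x3 = 0 (variances are assumed values).
def reconcile(xm, a, var):
    """x_hat = x_m - Sigma A^T (A Sigma A^T)^-1 (A x_m) for a single
    constraint row A = a and diagonal covariance Sigma = diag(var)."""
    residual = sum(ai * xi for ai, xi in zip(a, xm))
    s = sum(ai * ai * vi for ai, vi in zip(a, var))
    return [xi - vi * ai * residual / s for xi, ai, vi in zip(xm, a, var)]

# splitter: x1 = x2 + x3; the measured flows violate the balance by 0.5
xm  = [10.5, 6.0, 4.0]      # measurements (hypothetical)
a   = [1.0, -1.0, -1.0]     # balance row: x1 - x2 - x3 = 0
var = [0.25, 0.25, 0.25]    # equal measurement variances (assumed)
xh = reconcile(xm, a, var)
print(xh)  # reconciled values satisfy the balance exactly
```

With equal variances the balance residual is simply spread across the three flows; nonlinear constraints or robust objectives remove this closed form, which is where iterative or stochastic solvers enter.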
Modification of the genetic algorithm
Originally, the GA was designed as a computer-based model to exhibit and describe the adaptive processes associated with natural genetics. Many researchers have shown, both theoretically and empirically, that GA provides a robust mathematical search mechanism. Subsequently, GA became a mathematical technique rather than a biological model (Mitchell, 1996). Various GA variants continue to be developed to improve performance on specific problems (Ghoshray & Yen, 1995; Murata, 1996).
Implementation of GA to the DR problem
Data reconciliation problems generally involve two kinds of variables, measured and unmeasured. However, it is the measured variables that matter for solving the problem, so the unmeasured variables should be eliminated. Since the unmeasured variables appear only in the constraint equations, the simplest strategy is to eliminate them from the constraints. This does not affect the objective function, which does not involve unmeasured variables.
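The elimination step can be sketched as follows. Suppose a balance f(x1, x2, u) = x1 − x2 − u = 0 contains one unmeasured variable u; solving for u removes it, and the objective, which depends only on the measured variables, is untouched. The variable names and numbers here are illustrative, not from the paper:

```python
# Unmeasured-variable elimination (illustrative names and values).
# Balance: x1 - x2 - u = 0 with u unmeasured  =>  u = x1 - x2, so any
# remaining constraint g(x1, x2, u) becomes g(x1, x2, x1 - x2), a
# function of the measured variables alone.
def u_from_measured(x1, x2):
    return x1 - x2          # u eliminated analytically

def objective(x, xm, var):
    # weighted least squares in the measured variables only; it never
    # references u, so the elimination leaves it unchanged
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, xm, var))

x = [10.0, 6.0]
print(u_from_measured(*x))  # → 4.0
print(objective(x, [10.0, 6.0], [1.0, 1.0]))  # → 0.0 at the measurements
```

A GA then searches only over the measured variables, recovering u afterwards from the constraint.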
Case study
In this section, the proposed GA, developed in MATLAB™, is validated using the problem presented by Pai and Fisher (1988). Arora and Biegler (2001) also used this problem with the redescending estimator and a conventional optimization method in the GAMS package. They applied an interior point smoothing function to handle the non-differentiable terms of the redescending estimator, which involves an intricate mathematical function and complex calculations. Moreover, their constraint equations
Conclusion
This paper explored the use of GA to solve the DR problem and to handle the non-differentiable terms of the redescending estimator. A systematic approach for applying GA to the data reconciliation problem was proposed and implemented. The method starts with a search for appropriate GA parameters and then uses those parameters to solve the DR problem. The developed program was tested on a steady state nonlinear DR problem. The results showed that the GA method eliminated complex derivative calculations.
Acknowledgements
We are grateful to Dr. Nikhil Arora for very helpful suggestions on the redescending estimator and the AIC calculation, and to Dr. Peter Hawken and Assoc. Prof. Dr. Tony Paterson for proofreading the paper. This research was also funded by The Thailand Research Fund.
References (26)

Cavanaugh, J. E. (1997). Unifying the derivations for the Akaike and corrected Akaike information criteria. Statistics and Probability Letters.
Moros, R., et al. (1996). A genetic algorithm for generating initial parameter estimations for kinetic models of catalytic processes. Computers and Chemical Engineering.
Murata, T., et al. (1996). Genetic algorithms for flowshop scheduling problems. Computers and Industrial Engineering.
Narasimhan, S., & Harikumar, P. (1993). A method to incorporate bounds in data reconciliation and gross error detection. 1. The bounded data reconciliation problem. Computers and Chemical Engineering.
Sanchez, M., & Romagnoli, J. (1996). Use of orthogonal transformation in data classification–reconciliation. Computers and Chemical Engineering.
Tjoa, I. B., & Biegler, L. T. (1991). Simultaneous strategies for data reconciliation and gross error detection of nonlinear systems. Computers and Chemical Engineering.
Albuquerque, J. S., & Biegler, L. T. (1996). Data reconciliation and gross error detection for dynamic systems. American Institute of Chemical Engineering Journal.
Arora, N., & Biegler, L. T. (2001). Redescending estimators for data reconciliation and parameter estimation. Computers and Chemical Engineering.
Crowe, C. M. (1986). Reconciliation of process flow rates by matrix projection. Part II: The nonlinear case. American Institute of Chemical Engineering Journal.
Crowe, C. M., et al. (1983). Reconciliation of process flow rates by matrix projection. American Institute of Chemical Engineering Journal.