Modified genetic algorithm for nonlinear data reconciliation

https://doi.org/10.1016/j.compchemeng.2004.11.005

Abstract

Nonlinear data reconciliation problems are inherently difficult to solve with conventional optimization methods because the objective is a multimodal function with multiple local solutions. In this paper, the genetic algorithm (GA) of Wasanapradit [Wasanapradit, T. (2000). Solving nonlinear mixed integer programming using genetic algorithm. Master's thesis, King Mongkut's University of Technology Thonburi, Bangkok, Thailand. Available: [email protected]], based on modified cross-generational probabilistic survival selection (CPSS), is explored for solving the steady state nonlinear data reconciliation (DR) problem. The DR problem is defined with a redescending estimator as the objective function, which is both non-convex and discontinuous. In the GA method, the appropriate GA parameters are found first, and the algorithm is then validated on the problem. The results show that the GA solves the redescending function without the complex calculations required by conventional optimization methods, although the calculation time is longer.

Introduction

Process data measurements are generally corrupted with two types of errors, random and gross, which cause violation of the process constraints defined by the mass and energy balances (Narasimhan & Jordache, 2000). These errors can be eliminated by techniques of data reconciliation and gross error detection to improve the data accuracy. Data reconciliation (DR), first proposed by Kuehn and Davidson (1961), is an estimation technique for improving the accuracy of measurements so that the process constraints are satisfied. It exploits the redundancy property of measurements to reduce the effect of random error in the data. The principal difference between data reconciliation and other filtering techniques is that data reconciliation explicitly uses the process model constraints so that the estimates satisfy the constraints, while other techniques do not.

Various optimization techniques have been developed and utilized to solve DR problems. Lagrange multipliers combined with a projection matrix were used for linear and nonlinear steady state DR problems (Crowe, 1986, Crowe et al., 1983). Stephenson and Shewchuk (1986) used a Newton–Raphson iterative method based on a quasi-Newton linearization of a nonlinear model. MacDonald and Howat (1988) extended the DR techniques to estimate process parameters using coupled and decoupled procedures. Pai and Fisher (1988) replaced Broyden's method in the Gauss–Newton iterative algorithm (Knepper & German, 1980). Tjoa and Biegler (1991) proposed an efficient hybrid successive quadratic programming algorithm to solve nonlinear DR problems. Narasimhan and Harikumar (1993) used an efficient quadratic programming algorithm for a linear steady state model that incorporated bounds into the problem. Sanchez and Romagnoli (1996) applied Q–R factorization to analyze, decompose and solve linear and bilinear reconciliation problems. The disadvantage of all these techniques is that the derivatives of the equations require complex calculations, resulting in a complicated process. Therefore, Zhao and Jiang (1996) proposed a stochastic search method for solving the linear steady state DR problem. The significant advantage of this method is that it does not depend on any particular model structure and requires only simple algebraic calculation. In this work, a popular stochastic search method, the genetic algorithm (GA), is proposed. GA, first invented by John Holland in the 1960s, is an optimization technique that parallels the concepts of natural selection and mutation; it attempts to mimic the biological evolution process to find better solutions. GA is a popular method for optimization because it avoids becoming trapped in local optima and gives a greater chance of success for nonlinear optimization than conventional methods. GA also handles the complications of discontinuous and non-convex objective functions, since global or near-global optima are determined without using derivatives.

The performance of GA depends on evolution strategies, which have been discussed by many researchers. Ghoshray and Yen (1995) proposed a modified genetic algorithm (MGA) that was a hybrid of the simple genetic algorithm (SGA) and simulated annealing (SA). They found that MGA provides an efficient heuristic search for solving various optimization problems. Shimodaira (1997) proposed a new genetic algorithm called the diversity control oriented genetic algorithm (DCGA), which was remarkably superior to the simple GA in attaining the global optimum solution, but required more calculation time in the selection stage. These merits of GA have encouraged its wide application to various optimization problems. Moros et al. (1996) used GA for generating the initial parameters of a kinetic model of a catalytic process. Murata (1996) applied GA to flowshop scheduling problems. Wasanapradit (2000) proposed a variation of GA for solving mixed integer nonlinear programming (MINLP) problems. Later, Prakotpol and Srinophakun (2003) extended this application to wastewater networks for the minimization of contaminants.

Data reconciliation processes usually use a weighted least squares objective function, based on the assumption that random errors are normally distributed with zero mean and known variance. However, this objective function can lead to incorrect estimation and severely biased reconciliation when the measured data contain gross errors. Recently, many objective functions for the DR problem have been developed that provide unbiased reconciled data from measurements containing gross errors. Yamamura, Nakajima, and Matsuyama (1988) proposed an objective function based on the Akaike information criterion to identify faulty instruments. Tjoa and Biegler (1991) developed a bi-variable objective function that took into account contributions from both random and gross errors. The major advantage of a robust estimator is the elimination of the combinatorial procedure for gross error detection. Albuquerque and Biegler (1996) used a robust estimator, the fair function, which is resistant to deviations from ideal error structures. Arora and Biegler (2001) proposed a new form of the fair function, termed the redescending estimator, which nullifies the effect of large gross errors (outliers). They compared the performance of the least squares estimator, the fair function and the redescending estimator, and found the redescending estimator superior in outlier detection and robustness.

In this paper, GA is proposed as the method for handling the discontinuous and non-convex properties of the redescending estimator objective function in steady state nonlinear DR problems. In the next section, the different properties of the weighted least squares and redescending estimator objective functions are discussed. The GA method is presented in Sections 3 (Modification of the genetic algorithm) and 4 (Implementation of GA to the DR problem). In Section 5, the effectiveness of this algorithm is demonstrated on an example problem. Finally, the conclusion is presented in Section 6.

Section snippets

Formulation of the data reconciliation problem

The data reconciliation problem is an optimization problem, which can be explained using statistical theory. The general form of the DR problem is:

  minimize (over x̂)  F(x_m, x̂)
  subject to          f(x̂, u) = 0
                      g(x̂, u) ≤ 0
                      x̂^L ≤ x̂ ≤ x̂^U
                      u^L ≤ u ≤ u^U

where F is the objective function, x_m the measurement data for the corresponding estimate x̂, u the set of unmeasured variables, f the set of equality constraints, g the set of inequality constraints, and superscripts L and U the lower and upper bounds of the variables, respectively.
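To make the objective F concrete, the classical weighted least-squares contribution of a standardized residual can be contrasted with a Hampel-type three-part redescending estimator. This is a sketch only: the tuning constants a, b, c and this exact piecewise form are assumptions and may differ from the estimator used in the paper.

```python
def weighted_least_squares(e):
    """Classic WLS contribution for a standardized residual e = (x_m - x_hat)/sigma.
    Grows quadratically, so a single outlier can dominate the fit."""
    return 0.5 * e ** 2

def redescending_rho(e, a=1.0, b=2.0, c=4.0):
    """Hampel-type three-part redescending estimator (illustrative constants a < b < c).
    Quadratic near zero, linear in the middle, then flattening out so that
    residuals beyond c add only a fixed penalty."""
    e = abs(e)
    if e <= a:                       # quadratic core: behaves like least squares
        return 0.5 * e ** 2
    if e <= b:                       # linear section: bounded influence
        return a * e - 0.5 * a ** 2
    if e <= c:                       # redescending section: influence tapers to zero
        return a * b - 0.5 * a ** 2 + 0.5 * a * (c - b) * (1 - ((c - e) / (c - b)) ** 2)
    return a * b - 0.5 * a ** 2 + 0.5 * a * (c - b)   # flat tail: outlier nullified
```

Beyond |e| > c the contribution is constant, so a large outlier adds a fixed penalty and stops pulling the estimate; this flat tail is precisely what makes the objective non-convex and hard for gradient-based solvers, motivating a derivative-free search such as GA.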

Modification of the genetic algorithm

Originally, the GA was designed as a computer based model to exhibit and describe the adaptive processes associated with natural genetics. Many researchers have proven both theoretically and empirically that GA provides a robust mathematical search mechanism. Subsequently, GA became a mathematical technique, rather than a biological model (Mitchell, 1996). Various algorithms for GA continue to be developed for improving the performance in solving specific problems (Ghoshray & Yen, 1995; Murata, 1996).
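The evolutionary loop underlying such algorithms can be sketched minimally as follows. This is a generic real-coded GA with tournament selection, not Wasanapradit's CPSS variant; the population size, rates, and multimodal test function are illustrative assumptions.

```python
import math
import random

def genetic_algorithm(objective, bounds, pop_size=50, generations=100,
                      crossover_rate=0.8, mutation_rate=0.1):
    """Minimal real-coded GA: tournament selection, blend crossover,
    uniform mutation. All parameters are illustrative defaults."""
    lo, hi = bounds
    pop = [[random.uniform(l, h) for l, h in zip(lo, hi)] for _ in range(pop_size)]
    best = min(pop, key=objective)
    for _ in range(generations):
        new_pop = []
        while len(new_pop) < pop_size:
            # binary tournament selection: fitter of two random individuals
            p1 = min(random.sample(pop, 2), key=objective)
            p2 = min(random.sample(pop, 2), key=objective)
            child = list(p1)
            if random.random() < crossover_rate:
                # blend crossover: random convex combination of the parents
                a = random.random()
                child = [a * x + (1 - a) * y for x, y in zip(p1, p2)]
            for i, (l, h) in enumerate(zip(lo, hi)):
                if random.random() < mutation_rate:
                    # uniform mutation: resample the gene within its bounds
                    child[i] = random.uniform(l, h)
            new_pop.append(child)
        pop = new_pop
        gen_best = min(pop, key=objective)
        if objective(gen_best) < objective(best):
            best = gen_best
    return best

# Example: a multimodal 2-D function (derivative-free search, no gradients used)
random.seed(0)
f = lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2 + math.sin(5 * x[0])
sol = genetic_algorithm(f, ([-5, -5], [5, 5]))
```

Note that the objective is only ever evaluated, never differentiated, which is why the same loop applies unchanged to a discontinuous or non-convex objective.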

Implementation of GA to the DR problem

Data reconciliation problems generally have two kinds of variables, measured and unmeasured. However, it is the measured variables that are important for solving the problem, and thus the unmeasured variables must be eliminated. Since the unmeasured variables appear only in the constraint equations, the simplest strategy is to eliminate them from the constraints. This does not affect the objective function, since it does not involve unmeasured variables.
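For the linear special case, this elimination can be sketched with a Crowe-style projection matrix: given balances A x̂ + B u = 0, any matrix P with P B = 0 yields reduced constraints P A x̂ = 0 in the measured variables only. The two-balance example below is made up for illustration, not taken from the paper.

```python
import numpy as np
from scipy.linalg import null_space

# Linear balance constraints: A @ x_hat + B @ u = 0, with u unmeasured.
# Illustrative example: three measured streams, one unmeasured stream
# that appears in both balances with opposite signs.
A = np.array([[1.0, -1.0,  0.0],
              [0.0,  1.0, -1.0]])
B = np.array([[ 1.0],
              [-1.0]])

# Rows of P span the left null space of B, so P @ B = 0 exactly.
P = null_space(B.T).T

# Reduced constraints involve measured variables only: reduced @ x_hat = 0.
reduced = P @ A
```

Here `reduced` is proportional to [1, 0, -1], i.e. the unmeasured stream drops out and only the overall balance between the first and last measured streams remains. The paper's problem is nonlinear, where the unmeasured variables are instead eliminated by solving the constraint equations for them directly.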

Case study

In this section, the proposed GA, developed in MATLAB™, is validated using the problem presented by Pai and Fisher (1988). Arora and Biegler (2001) also used this problem with the redescending estimator and a conventional optimization method in the GAMS package. They applied an interior point smoothing function to handle the non-differentiable terms of the redescending estimator, which involves an intricate mathematical function and a complex calculation. Moreover, their constraint equations

Conclusion

This paper explored the use of GA to solve the DR problem and handle the non-differentiable term of the redescending estimator. A systematic approach of GA to the data reconciliation problem was proposed and implemented. The method started with a search for appropriate GA parameters and then used these parameters to solve the DR problem. The steady state nonlinear DR problem was tested with the developed program. The results showed that the GA method eliminated complex calculations, although the calculation time was longer.

Acknowledgements

We are grateful to Dr. Nikhil Arora for very helpful suggestions on the redescending estimator and the AIC calculation, and to Dr. Peter Hawken and Assoc. Prof. Dr. Tony Paterson for proofreading the paper. This research was funded by The Thailand Research Fund.

References (26)

  • S. Ghoshray et al.

    More efficient genetic algorithm for solving optimization problems

  • D.C. Hoaglin et al.

    Understanding robust and exploratory data analysis

    (1983)
  • J.C. Knepper et al.

    Statistical analysis of constrained data sets

    American Institute of Chemical Engineers Journal

    (1980)