Elsevier

Computer-Aided Design

Volume 35, Issue 8, July 2003, Pages 751-760
Data fitting with a spline using a real-coded genetic algorithm

https://doi.org/10.1016/S0010-4485(03)00006-X

Abstract

To obtain a good approximation when fitting data with a spline, we frequently have to treat the knots as variables. The problem to be solved then becomes a continuous, nonlinear, multivariate optimization problem with many local optima, so the global optimum is difficult to obtain. In this paper, we propose a method for solving this problem by using a real-coded genetic algorithm. Our method can treat not only data with a smooth underlying function, but also data with an underlying function having discontinuous points and/or cusps. We search for the best model among candidate models by using the Bayes Information Criterion (BIC). With this, we can appropriately determine the number and locations of knots automatically and simultaneously. Five examples of data fitting are given to show the performance of our method.

Introduction

Data fitting is an important tool for geometric modeling [1], [2] and image analysis [3]. When the shape of the underlying function of measurement data is complicated, it is difficult to approximate the shape through a single polynomial. In this case, a spline is one of the most appropriate approximating functions [4], [5].

It is well known that the key point in using a spline successfully is the determination of good knots [5], [6]. To obtain a good approximation using a spline, we have to place the knots as precisely as possible. This naturally demands dealing with the knots as variables, and we have to solve a multivariate and multimodal nonlinear optimization problem [5]. Therefore, it is difficult to obtain the global optimum.

For this reason, many methods have been proposed [5], [6], [7], [8], [9], [10], [11], [12]. Generally speaking, they fall into two categories: knot insertion and knot deletion. However, each has shortcomings. For example, Powell's method [7] generally produces redundant knots, and Jupp's method [10] requires a good initial estimate of the knot locations, which is not easy to determine. Dierckx's [5] and Lyche and Mørken's [11] methods need an error tolerance or a smoothing factor that in many cases has to be determined subjectively. Consequently, these methods are not necessarily sufficient for automatically obtaining a good model of the underlying data function. Here, a good model means that the number of parameters in the model and the difference between the model and the underlying data function are both as small as possible.

Therefore, we have developed a third way in which we can insert or delete knots adaptively using genetic algorithms (GAs). GAs have been applied to curve fitting [15], curve design [16] and surface extraction [17]. In our previous papers [13], [14], we proposed a method that converts the original problem into a discrete combinatorial optimization problem and solves the converted problem with a GA. Here, we refer to this as our old method. The old method constructed individuals by considering candidates for the knot locations as genes. Since the genes were composed of bits, we had to convert the continuous problem into a discrete problem. As a result, it was inherently subject to discretization errors and could not treat multiple knots. Accordingly, the model function constructed was restricted to smooth functions with no discontinuous points or cusps.

In this paper, we propose a method for solving the data-fitting problem using a GA with real number genes; that is, a real-coded genetic algorithm. We use the knots themselves as genes, and do not convert the original continuous problem into a discrete combinatorial problem. As a result, any influence of the errors caused by the discretization of knots is avoided, and quasi-multiple knots (see Section 3.5) can be constructed. Thus, data with an underlying function that has discontinuous points and/or cusps, as well as data with a smooth underlying function, can be treated. We search for the best model among candidate models by using the Bayes Information Criterion (BIC) [18]. Our new method can automatically and simultaneously determine the appropriate number and locations of knots. Three examples of data fitting with artificial data and one example with data from a clay model of a car are used to examine the performance of our new method. The result of applying our method to the titanium heat data [5], [6] is also used to compare it with an existing method.
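The idea of picking the best model by BIC can be illustrated with a minimal Python sketch. It assumes the textbook Gaussian-error form, BIC = N ln(Q/N) + k ln N, where Q is the residual sum of squares and k the number of free parameters; this is not necessarily the exact expression used in the paper. Counting k = 2n + m (n interior knots plus n + m B-spline coefficients) is likewise an assumption, and the candidate residuals below are hypothetical numbers chosen only for illustration.

```python
import math


def bic(rss, n_data, n_params):
    """Textbook BIC for a least-squares fit with Gaussian errors (assumed form)."""
    return n_data * math.log(rss / n_data) + n_params * math.log(n_data)


def select_model(candidates, n_data, order):
    """Return the (n_knots, rss) pair with the smallest BIC.

    The parameter count 2 * n_knots + order is an assumption:
    n_knots free knot locations plus n_knots + order coefficients.
    """
    def score(candidate):
        n_knots, rss = candidate
        return bic(rss, n_data, 2 * n_knots + order)

    return min(candidates, key=score)


# Hypothetical (number of interior knots, residual sum of squares) pairs.
candidates = [(2, 5.0), (4, 1.2), (8, 1.1)]
best = select_model(candidates, n_data=100, order=4)
print(best)
```

With these numbers, the eight-knot model has the smallest residual, but its extra parameters are penalized, so the four-knot model is selected.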

Section snippets

Data fitting using a spline

Let us assume that the data to be fitted are given on the interval [a,b] of the x-axis and are written as

$$F_j = f(x_j) + \epsilon_j \quad (j = 1, 2, \ldots, N). \tag{1}$$

In this equation, f(x) is the underlying function of the data (unknown), and $\epsilon_j$ is the measurement error. Let $\xi_i$ ($i = 1-m, 2-m, \ldots, n+m$) be knots of a spline for data fitting, where n is the number of knots $\xi_i$ ($i = 1, 2, \ldots, n$) located in the interval (a,b) and m is the order (degree + 1) of a B-spline $N_{m,i}(x)$. At the ends of the interval [a,b], we set

$$a = \xi_{1-m} = \cdots = \xi_0, \qquad b = \xi_{n+1} = \cdots = \xi_{n+m}.$$

In this
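As an illustration of this knot arrangement, the following Python sketch builds the full knot sequence (the end points a and b each repeated m times, with the n interior knots between them) and evaluates the B-splines of order m by the standard Cox-de Boor recursion. The function names are hypothetical, and the code uses 0-based indices rather than the paper's range i = 1−m, …, n+m.

```python
def knot_vector(a, b, interior, m):
    """Full knot sequence: a repeated m times, sorted interior knots, b repeated m times."""
    return [a] * m + sorted(interior) + [b] * m


def bspline_basis(knots, m, i, x):
    """Value at x of the i-th B-spline of order m (0-based i), Cox-de Boor recursion."""
    if m == 1:
        left, right = knots[i], knots[i + 1]
        if left <= x < right:
            return 1.0
        # Close the last non-empty interval on the right so that x == b is covered.
        if x == knots[-1] and left < right == x:
            return 1.0
        return 0.0
    value = 0.0
    d1 = knots[i + m - 1] - knots[i]
    if d1 > 0:  # skip terms with a zero-width support (repeated knots)
        value += (x - knots[i]) / d1 * bspline_basis(knots, m - 1, i, x)
    d2 = knots[i + m] - knots[i + 1]
    if d2 > 0:
        value += (knots[i + m] - x) / d2 * bspline_basis(knots, m - 1, i + 1, x)
    return value


# Cubic spline (m = 4) on [0, 1] with n = 2 interior knots: n + m = 6 basis functions.
t = knot_vector(0.0, 1.0, [0.7, 0.3], 4)
n_basis = len(t) - 4
total = sum(bspline_basis(t, 4, i, 0.5) for i in range(n_basis))
```

The n + m basis functions sum to one at every point of [a,b], which gives a quick sanity check on the knot sequence.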

Application of a genetic algorithm

The main reason we use a GA for the data-fitting problem described in Section 2 is that it is a multimodal optimization problem. (The other reasons are explained in our previous papers [13], [14].)

GAs are stochastic algorithms whose search methods model some natural phenomena such as genetic inheritance and the Darwinian struggle for survival [19], [20]. A population of candidate solutions called individuals is bred in a parallel fashion. The population evolves so that it

Data fitting algorithm

By using the GA described in Section 3, we obtain the following method for data fitting using a spline.

  • Step 1. Input the data to be fitted, which are given by Eq. (1). Input the order m of the spline (degree + 1).

  • Step 2. Input the control parameters: the number of individuals K and the knot ratio λ.

  • Step 3. Create an initial population by using random numbers.

  • Step 4. For each individual, compute data fitting by using its real number genes as knots, and obtain the fitness value.

  • Step 5. Test for convergence. If
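Steps 3 to 5 can be sketched as a small real-coded GA in Python. The operators used in the paper are not shown in this excerpt, so everything specific here is an assumption: a fixed number of knots per individual, tournament selection, blend crossover, Gaussian mutation, elitism, and a fixed generation count in place of a convergence test. The knot ratio λ is not modeled. The fitness function is a toy stand-in that measures the distance to hypothetical target knots; in the method described above, it would be derived from the spline fit computed in Step 4.

```python
import random


def evolve_knots(fitness, a, b, n_knots, pop_size=30, generations=60, seed=0):
    """Minimise fitness(knots) over sorted lists of real-valued knots in [a, b]."""
    rng = random.Random(seed)

    def clip(x):
        return min(max(x, a), b)

    def random_individual():
        return sorted(rng.uniform(a, b) for _ in range(n_knots))

    def tournament(scored):
        # Pick two individuals at random and keep the fitter one.
        x, y = rng.sample(scored, 2)
        return x[1] if x[0] <= y[0] else y[1]

    def blend(p, q, alpha=0.3):
        # Blend crossover: sample each gene around the interval spanned by the parents.
        child = []
        for u, v in zip(p, q):
            lo, hi = min(u, v), max(u, v)
            span = hi - lo
            child.append(clip(rng.uniform(lo - alpha * span, hi + alpha * span)))
        return child

    def mutate(ind, rate=0.1, sigma=0.05):
        # Perturb each gene with probability `rate` by Gaussian noise.
        return [clip(g + rng.gauss(0.0, sigma * (b - a))) if rng.random() < rate else g
                for g in ind]

    population = [random_individual() for _ in range(pop_size)]  # Step 3
    for _ in range(generations):
        scored = sorted((fitness(ind), ind) for ind in population)  # Step 4
        next_pop = [scored[0][1]]  # elitism: keep the current best unchanged
        while len(next_pop) < pop_size:
            child = mutate(blend(tournament(scored), tournament(scored)))
            next_pop.append(sorted(child))
        population = next_pop
    return min(population, key=fitness)


target = [0.2, 0.5, 0.8]  # hypothetical knot locations, for the toy fitness only


def toy_fitness(knots):
    return sum((k - t) ** 2 for k, t in zip(knots, target))


best = evolve_knots(toy_fitness, 0.0, 1.0, n_knots=3)
```

Because the genes are real numbers, the knots can move continuously within the interval rather than being restricted to a predefined grid of candidate locations.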

Experimental results

To study the effectiveness of the algorithm described in Section 4, we performed many experiments. From among these, we discuss five examples in this section. The weight wj (Section 2) was set to 1. This meant that the precision of measurement data point Fj was equal at every point xj for j=1,2,…,N. As a model function, we used a third-degree spline (m=4). However, note that our method does not depend on the degree. In the following examples, small triangles in the figures show the locations of

Conclusions

In this paper, we have proposed a method for determining good knots for data fitting with a spline. We used a genetic algorithm with real-number genes together with the BIC. The major advantages of our method are as follows:

  • (i) Knot locations are not influenced by discretization errors. Therefore, the locations are determined more precisely than with the method described in our previous papers [13], [14]. Moreover, both single knots and quasi-multiple knots can be treated.

  • (ii) Data with an underlying function having

Acknowledgements

This work was partly supported by a Grant-in-Aid for Scientific Research from the Japan Society for the Promotion of Science under contracts 10558052 and 11680390.

Fujiichi Yoshimoto is a professor in the Department of Computer and Communication Sciences at Wakayama University, Wakayama, Japan. His research interests are in geometric modeling, genetic algorithms, non-photorealistic computer graphics and mobile computing. He received a PhD in computer science from Kyoto University in 1977.

References (22)

  • D.L.B. Jupp, Approximation to data by splines with free knots, SIAM J Numer Anal (1978)

    Toshinobu Harada is an associate professor in the Department of Design and Information Science, Wakayama University. Formerly, he worked at the NISSAN Motor Corporate Design Center (1990–1996). He received a PhD in design method from Chiba University in 1996 and a BA in industrial design from Kyusyu Institute of Design in 1987.

    Yoshihide Yoshimoto is a research associate at the Institute for Solid State Physics, University of Tokyo, Japan. He received a BSc, an MSc and a PhD in Physics from the University of Tokyo in 1995, 1997 and 2000, respectively. Although he works in the area of computational materials physics, his current research interests include numerical analysis and parallel computing.
