Applied Soft Computing

Volume 3, Issue 2, September 2003, Pages 159-175

Extrapolation detection and novelty-based node insertion for sequential growing multi-experts network

https://doi.org/10.1016/S1568-4946(03)00011-5

Abstract

Artificial neural networks (ANNs) have been used to construct empirical nonlinear models of process data. Because such networks are not based on physical theory and contain nonlinearities, their predictions are suspect when extrapolating beyond the range of the original training data. Standard networks give no indication of possible errors due to extrapolation. This paper describes a sequential supervised learning scheme for the recently formalized growing multi-experts network (GMN). It is shown that the GMN can generate a certainty factor that serves as an extrapolation detector. An on-line GMN identification algorithm is presented and its performance is evaluated. The capability of the GMN to extrapolate is also indicated. Four benchmark experiments demonstrate the effectiveness and utility of the GMN as a universal function approximator.

Introduction

The ability to interact with the environment and to learn from the effects of these interactions is one of the hallmarks of intelligence. A system with the ability to learn from observation thus compensates for an initial lack of a priori knowledge about its given task. Such flexibility is becoming increasingly important in automatic systems such as robots, washing machines, vehicles and process plants. Obtaining an accurate empirical model of a complex, nonlinear, dynamic process is the first step towards the creation of high-performance diagnosis systems, supervisory systems and controllers. In recent years, much research in empirical modeling has been carried out within the artificial neural network (ANN) paradigm, using simplified formal models of physiological systems. Static feedforward neural networks such as the multilayer perceptron (MLP) and the radial basis function (RBF) network have been widely used in different application domains and are known to be universal approximators [1], [2]. That is, a network of sufficient complexity can approximate any bounded measurable function of practical interest. In this respect, they can be viewed as multi-dimensional interpolators. The accuracy of the response of such a network to a generalization request (query) depends on the query falling within the domain of validity of the model. Motivated by these neural network constraints and inspired by the constructive approach, the growing multi-experts network (GMN) was developed [3], [4], [5]. However, previous GMN formulations are restricted to off-line application, requiring batch learning on the full training data set. In general, neural network-based system identification and control can be divided into off-line and on-line learning schemes. Off-line methods operate when all the training data are available at one time; these data are collected before the neural network learning is performed. Because of the nature of such data, control decisions in new regions may not be learnt properly, inasmuch as the insertion of new experts is based on an accumulated error criterion [3], [4], [5]. On-line methods, on the other hand, learn from the data as it becomes available to the learning system. This paper describes the development of a sequential, or on-line learning, growing multi-experts network suitable for on-line application.

The nature of incremental or sequential learning is such that the learning system updates its knowledge base as each new input sample arrives, without having to revisit old samples. According to Fu [6], this learning strategy is both biologically and psychologically plausible. From a machine learning point of view, an incremental learning system should be able to differentiate between spurious and rare but important information. Hence, generalization and selective learning are the two main issues. On arrival of a new sample, the system has to decide either to absorb (assimilate) the sample by generalizing its knowledge base, or to encode (accommodate) the sample by extending the existing knowledge base. In fact, the issue of generalization and selective learning on novel input patterns is related to the interpolation and extrapolation capability of artificial neural networks. The interpolation domain of an artificial neural network may be treated as the domain of validity of the network. The accuracy of the response of such a network to a generalization request, or query, depends on the query falling within the domain of validity of the model. If an input pattern falls outside the domain of validity, the neural network cannot absorb the sample by generalizing from its existing learned knowledge; it must instead extend its domain of validity to ‘accommodate’ the new sample.

The growing multi-experts network may be viewed as a multi-dimensional interpolator. However, as with many other forms of interpolator, the validity of the model formed by the GMN is closely tied to the boundaries of the training data. Outside these bounds it can usually be assumed that the performance of the model degrades rapidly with distance from the boundary. In real-world industrial applications, it is rare to have access to training data that cover the entire input space. It therefore becomes necessary to determine the interpolation domain for a given training set. When the training set is fully representative of the inputs presented to the neural network during application, the network approximation can appropriately be described as interpolation between training examples. If, on the other hand, the training set represents only certain features of the application at hand, and inputs are expected from time to time that differ distinctly from those in the training set, the network must extrapolate into regions where no samples have occurred. In the former case, smoothness and robustness are typically the most important requirements for sensible interpolation. The latter case must additionally satisfy the requirement that points away from the training set be treated appropriately, which in practice means that they should be identified as such. It is therefore important for the GMN model to identify those regions of the feature space that have not been sampled. This has motivated the implementation of a certainty factor as an extrapolation indicator in the GMN model. To ensure sensible interpolation, an additional local expert model is inserted in a suspected extrapolation region. Accordingly, the GMN model presented here implements a novelty-based node insertion approach, in contrast with an error-based insertion strategy. Furthermore, the newly proposed GMN model adopts a fully recursive learning scheme, coupled with on-line growing and pruning, to obtain a ‘minimal’ GMN model.
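
To make the absorb-or-accommodate decision concrete, the following minimal sketch (not part of the original formulation; the function name and the axis-aligned bounding-box test are illustrative assumptions) checks whether a query lies inside the region spanned by the training inputs. The certainty factor of Section 4 can be seen as a smoother, expert-wise refinement of this crude test.

```python
import numpy as np

def in_validity_domain(x, X_train):
    """Naive interpolation test: does query x lie inside the axis-aligned
    bounding box of the training inputs? A crude stand-in for the
    certainty-factor test described in Section 4."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return bool(np.all(x >= lo) and np.all(x <= hi))

# Absorb-or-accommodate decision on arrival of a new sample (x, y):
# inside the domain  -> generalize (update the existing knowledge base);
# outside the domain -> accommodate (extend the knowledge base).
```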

The rest of this paper is organized as follows: related work is briefly described in Section 2. The GMN architecture is explained in Section 3. Section 4 is devoted to the description of the certainty factor as an extrapolation indicator. Section 5 illustrates the extrapolation problem of the GMN model and the efficacy of the certainty factor in counteracting this problem. Section 6 presents the novelty-based node insertion strategy and the on-line GMN learning algorithm. Section 7 demonstrates the GMN's equivalence to a fuzzy model. The performance of the proposed algorithm is illustrated with benchmark problems in Section 8 and further assessed using a continuous-stirred tank reactor (CSTR) model in Section 9. Finally, Section 10 concludes the paper.

Section snippets

Related Work

The growing multi-experts network shares the ‘divide and conquer’ strategy, in which the problem space is decomposed into overlapping regions and local experts approximate the data in each region. When the local experts are linear functions, the result can be seen as a kind of piecewise linear regression with a softened transition between the pieces. When the local models reduce to a real value, the related networks are the RBF network [7], [8], [9], the GRNN [10] and the zero-order Takagi–Sugeno fuzzy model…

Growing multi-experts network (GMN)

A dynamic-structure network such as the GMN, which allows the hypothesis space to expand and shrink according to the constraints of the problem space, can reliably achieve the trade-off between the bias and the variance of a model. A modeling problem involves the approximation of an unknown nonlinear function. In nonlinear systems the function complexity varies throughout the input space. However, the input space can be partitioned into subspaces that are easier to handle. This can be achieved effectively using the ‘divide and conquer’ strategy…

Certainty factor

The certainty factor was introduced by Buchanan and Shortliffe [22] for a medical diagnostic system. Donald K. Wedding II and Krzysztof J. Cios conducted a comparative study of this method and other techniques for measuring RBF reliability [19]. A certainty factor can be generated by the GMN by using the μ(d(x;c,σ)) values from the expertise domain. Referring to Eq. (2), this function returns values between 0.0 and 1.0, the same range as allowed by certainty factors.
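
The sketch below illustrates this idea. Since the exact form of Eq. (2) is not reproduced in this snippet, a Gaussian of the σ-scaled distance to each expert centre is assumed; the certainty factor is then taken as the highest membership over all local experts.

```python
import numpy as np

def membership(x, c, sigma):
    """Assumed Gaussian form of the expertise-domain membership
    mu(d(x; c, sigma)); the exact shape of Eq. (2) may differ."""
    d2 = np.sum(((np.asarray(x, dtype=float) - c) / sigma) ** 2)
    return float(np.exp(-0.5 * d2))  # value in (0, 1]

def certainty_factor(x, centres, sigmas):
    """Certainty factor of a query: the highest membership over all local
    experts. Near 1.0 the query lies inside some expert's domain
    (interpolation); near 0.0 no expert covers it (extrapolation)."""
    return max(membership(x, c, s) for c, s in zip(centres, sigmas))
```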

Extrapolation problem in GMN model

GMN networks give good results for input data similar to their training data, but poor results for unfamiliar input data. Such unfamiliar data can yield highly unfavorable results that decrease the effectiveness of the network. To illustrate this problem, we construct a one-dimensional example using the Hermite polynomial function f(x) = 1.1(1 − x + 2x²) exp(−x²/2), where x ∈ ℝ. A random sampling of the interval [−2, +2] is used in obtaining…
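
As a minimal sketch of this experimental setup (the sample size and random seed are assumptions, not taken from the paper), the test function and its sampled training set can be generated as follows:

```python
import numpy as np

def hermite(x):
    """Test function f(x) = 1.1 * (1 - x + 2x**2) * exp(-x**2 / 2)."""
    return 1.1 * (1.0 - x + 2.0 * x**2) * np.exp(-0.5 * x**2)

rng = np.random.default_rng(0)
x_train = rng.uniform(-2.0, 2.0, size=200)  # random sampling of [-2, +2]
y_train = hermite(x_train)

# Any model fitted to (x_train, y_train) is only trustworthy on [-2, +2];
# a query such as x = 4 falls outside every sampled region and should be
# flagged by a low certainty factor rather than answered silently.
```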

On-line GMN identification algorithm

The on-line GMN uses a recursive learning algorithm for both structural and parametric learning. The learning process requires the allocation of new local experts depending on input novelty, as well as on-line adaptation of the network. Since the novelty of the input data is tested using the certainty factor, the on-line GMN is suited to on-line function approximation problems. The objective is to gradually develop the approximation complexity of the GMN network that is sufficient to…
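
The snippet does not give the full algorithm, but its grow-or-adapt structure can be sketched as below, reusing certainty_factor from the earlier sketch. The threshold CF_MIN, the initial width, and the update details are hypothetical placeholders, not values from the paper.

```python
import numpy as np

CF_MIN = 0.3               # hypothetical novelty threshold (not from the paper)
centres, sigmas = [], []   # expertise-domain parameters of the local experts

def online_step(x):
    """One structural step of the assumed on-line scheme: a novel input
    (low certainty factor) spawns a new local expert, while a familiar
    input is absorbed by recursive adaptation of the existing experts."""
    x = np.atleast_1d(x).astype(float)
    cf = certainty_factor(x, centres, sigmas) if centres else 0.0
    if cf < CF_MIN:
        centres.append(x.copy())        # accommodate: insert a new expert at x
        sigmas.append(np.ones_like(x))  # initial width is a placeholder
    else:
        # Generalize: gradient-based updates of centres, widths and the
        # local linear models (update rules omitted in this sketch).
        pass
```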

Equivalence to Generalized Takagi–Sugeno Fuzzy system

The local experts in the GMN are defined as linear functions. Their outputs are mediated by the expertise domain and the expertise-level gating function. Furthermore, the GMN employs a gradient-based error-correction method for parametric learning. It is therefore equivalent to a first-order generalized Takagi–Sugeno fuzzy model, in which a typical single-antecedent fuzzy rule Rk has the following form: Rk: (αk) If x is Ak then ŷk = θk(x), k = 1, 2, …, c, where x is an input vector and Ak the fuzzy set of Rk with membership…
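
The following sketch shows the standard defuzzified output of such a first-order Takagi–Sugeno system: a normalized, membership-weighted sum of the local model outputs ŷk = θk(x). The Gaussian memberships and the affine parameterization of θk are assumptions consistent with the earlier sketches, not details confirmed by this snippet.

```python
import numpy as np

def ts_output(x, centres, sigmas, thetas):
    """First-order Takagi-Sugeno inference: the output is the normalized,
    membership-weighted sum of the local affine models
    y_k = theta_k . [1, x] (the affine parameterization is assumed)."""
    x = np.atleast_1d(x).astype(float)
    mu = np.array([np.exp(-0.5 * np.sum(((x - c) / s) ** 2))
                   for c, s in zip(centres, sigmas)])      # rule activations
    y_k = np.array([th @ np.concatenate(([1.0], x)) for th in thetas])
    return float(mu @ y_k / mu.sum())                      # defuzzified output
```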

Benchmark study

The on-line GMN is tested with three benchmark modeling problems and a real plant model. For the momentum rates it is assumed that ηM = ηcM = ησM = ηαM.

System identification of a continuous-stirred tank reactor using GMN model

As an example of the application of the GMN to system identification, a plant representing a continuous-stirred tank reactor is chosen. The model of this plant is presented in [25] and is described by the following differential equations:

Ċa(t) = (q/v)(Ca0 − Ca(t)) − k0 Ca(t) e^(−E/RT(t)),
Ṫ(t) = (q/v)(T0 − T(t)) − k1 Ca(t) e^(−E/RT(t)) + k2 qc(t)(1 − e^(−k3/qc(t)))(Tc0 − T(t)).

The process describes the reaction of two products, which are mixed and react to generate a compound A with concentration Ca(t), and with…
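
For generating identification data from this plant, the two equations can be integrated with a simple forward-Euler step, as sketched below with the coolant flow qc(t) as the manipulated input. The plant constants must be taken from [25]; the dictionary keys here are placeholders and no numerical values are supplied.

```python
import numpy as np

def cstr_step(Ca, T, qc, dt, p):
    """One forward-Euler step of the two CSTR equations above. p holds the
    plant constants (q/v, Ca0, T0, Tc0, k0..k3, E/R); their numerical
    values must come from [25] and are NOT supplied here."""
    arrhenius = np.exp(-p["E_R"] / T)  # e^(-E / (R T(t)))
    dCa = p["qv"] * (p["Ca0"] - Ca) - p["k0"] * Ca * arrhenius
    dT = (p["qv"] * (p["T0"] - T) - p["k1"] * Ca * arrhenius
          + p["k2"] * qc * (1.0 - np.exp(-p["k3"] / qc)) * (p["Tc0"] - T))
    return Ca + dt * dCa, T + dt * dT
```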

Conclusions

In this paper, a GMN architecture performing as a generalized Takagi–Sugeno fuzzy system is presented. Based on the GMN, an on-line self-structuring learning algorithm is outlined, whereby the structure and the parameter estimates are obtained simultaneously. Simulation studies and comprehensive comparisons show that the GMN model offers substantial advantages in approximating both static functions and nonlinear dynamic systems. A parsimonious structure with high performance can also be…

References (25)

  • J. Moody et al., Fast learning in networks of locally tuned processing units, Neural Comput. (1989)
  • B. Fritzke, Fast learning with incremental RBF networks, Neural Process. Lett. (1994)