Applied Soft Computing

Volume 3, Issue 2, September 2003, Pages 159-175

Extrapolation detection and novelty-based node insertion for sequential growing multi-experts network

https://doi.org/10.1016/S1568-4946(03)00011-5

Abstract

Artificial neural networks (ANNs) have been used to construct empirical nonlinear models of process data. Because such networks are not based on physical theory and contain nonlinearities, their predictions are suspect when extrapolating beyond the range of the original training data. Standard networks give no indication of possible errors due to extrapolation. This paper describes a sequential supervised learning scheme for the recently formalized growing multi-experts network (GMN). It is shown that the GMN can generate a certainty factor that serves as an extrapolation detector. An on-line GMN identification algorithm is presented and its performance is evaluated. The capability of the GMN to extrapolate is also indicated. Four benchmark experiments demonstrate the effectiveness and utility of the GMN as a universal function approximator.

Introduction

The ability to interact with the environment and to learn from the effects of these interactions is one of the hallmarks of intelligence. A system with the ability to learn from observation thus compensates for an initial lack of a priori knowledge about its given task. Such flexibility is becoming increasingly important in automatic systems such as robots, washing machines, vehicles and process plants. Obtaining an accurate empirical model of a complex, nonlinear, dynamic process is the first step towards the creation of high-performance diagnosis systems, supervisory systems and controllers. In recent years, much research in empirical modeling has been carried out within the artificial neural network (ANN) paradigm, using simplified formal models of physiological systems. Static feedforward neural networks such as the multilayer perceptron (MLP) and the radial basis function (RBF) network have been widely used in different application domains and are known to be universal approximators [1], [2]. That is, a network of sufficient complexity can approximate any bounded measurable function of practical interest. In this respect, they can be viewed as multi-dimensional interpolators. The accuracy of the response of such a network to a generalization request (query) depends on the query falling within the domain of validity of the model. Motivated by these neural network constraints and inspired by the constructive approach, the growing multi-experts network (GMN) was developed [3], [4], [5]. However, previous GMN formulations are restricted to off-line application, requiring batch learning on the full training data set. In general, neural network-based system identification and control can be divided into off-line and on-line learning schemes. Off-line methods operate when all the training data are available at one time; these data are collected before the neural network learning is performed. Because of the nature of such data, control decisions in new regions may not be learnt properly, inasmuch as the insertion of new experts is based on an accumulated error criterion [3], [4], [5]. On-line methods, on the other hand, learn from the data as it becomes available to the learning system. This paper describes the development of a sequential, or on-line learning, growing multi-experts network suitable for on-line application.

The nature of incremental or sequential learning is such that the learning system updates its knowledge base as each new input sample arrives, without having to revisit old samples. According to Fu [6], this learning strategy is both biologically and psychologically plausible. From a machine learning point of view, an incremental learning system should be able to differentiate between spurious and rare but important information. Hence, generalization and selective learning are the two main issues. On arrival of a new sample, the system has to decide either to absorb (assimilate) the sample by generalizing its knowledge base, or to encode (accommodate) the sample by extending the existing knowledge base. In fact, the issue of generalization and selective learning on novel input patterns is related to the interpolation and extrapolation capability of artificial neural networks. The interpolation domain of an artificial neural network may be treated as the domain of validity of the network. The accuracy of the response of such a network to a generalization request, or query, depends on the query falling within the domain of validity of the model. If an input pattern falls outside the domain of validity, the neural network cannot absorb the sample by generalizing from its existing learned knowledge; it must instead extend its domain of validity to ‘accommodate’ the new sample.

The growing multi-experts network may be viewed as a multi-dimensional interpolator. However, as with many other forms of interpolator, the validity of the model formed by the GMN is closely tied to the boundaries of the training data. Outside these bounds it can usually be assumed that the performance of the model degrades rapidly with distance from the boundary. In real-world industrial applications, it is rare to have access to training data that cover the entire input space. It therefore becomes necessary to determine the interpolation domain for a given training set. When the training set is fully representative of the inputs presented to the neural network during application, the network approximation can appropriately be described as interpolation between training examples. If, on the other hand, the training set represents only certain features of the application at hand, and inputs are expected from time to time that differ distinctly from those in the training set, the network must extrapolate into regions where no samples have occurred. In the former case, smoothness and robustness are typically the most important requirements for sensible interpolation. The latter case must additionally satisfy the requirement that points away from the training set be treated appropriately, which in practice means that they should be identified as such. It is therefore important for the GMN model to identify those regions of the feature space that have not been sampled. This has motivated the implementation of a certainty factor as an extrapolation indicator in the GMN model. To ensure sensible interpolation, an additional local expert model is inserted in a suspected extrapolation region. Accordingly, the GMN model presented here implements a novelty-based node insertion approach, in contrast with an error-based insertion strategy. Furthermore, the newly proposed GMN model adopts a fully recursive learning scheme, coupled with on-line growing and pruning, to obtain a ‘minimal’ GMN model.
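
To make the absorb-or-accommodate decision concrete, the following minimal sketch (not part of the original formulation; the function name and the axis-aligned bounding-box test are illustrative assumptions) checks whether a query lies inside the region spanned by the training inputs. The certainty factor of Section 4 can be seen as a smoother, expert-wise refinement of this crude test.

```python
import numpy as np

def in_validity_domain(x, X_train):
    """Naive interpolation test: does query x lie inside the axis-aligned
    bounding box of the training inputs? A crude stand-in for the
    certainty-factor test described in Section 4."""
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    return bool(np.all(x >= lo) and np.all(x <= hi))

# Absorb-or-accommodate decision on arrival of a new sample (x, y):
# inside the domain  -> generalize (update the existing knowledge base);
# outside the domain -> accommodate (extend the knowledge base).
```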

The rest of this paper is organized as follows: related work is briefly described in Section 2. The GMN architecture is explained in Section 3. Section 4 is devoted to the description of the certainty factor as an extrapolation indicator. Section 5 illustrates the extrapolation problem of the GMN model and the efficacy of the certainty factor in counteracting this problem. Section 6 presents the novelty-based node insertion strategy and the on-line GMN learning algorithm. Section 7 demonstrates the GMN's equivalence to a fuzzy model. The performance of the proposed algorithm is illustrated with benchmark problems in Section 8 and further assessed using a continuous-stirred tank reactor (CSTR) model in Section 9. Finally, Section 10 concludes the paper.

Section snippets

Related Work

The growing multi-experts network shares the ‘divide and conquer’ strategy, in which the problem space is decomposed into overlapping regions and local experts approximate the data in each region. When the local experts are linear functions, the result can be seen as a kind of piecewise linear regression with a softened transition between the pieces. When the local models reduce to a real value, the related networks are the RBF network [7], [8], [9], the GRNN [10] and the zero-order Takagi–Sugeno fuzzy model…

Growing multi-experts network (GMN)

A dynamic-structure network such as the GMN, which allows the hypothesis space to expand and shrink according to the constraints of the problem space, can reliably achieve the trade-off between the bias and the variance of a model. A modeling problem involves the approximation of an unknown nonlinear function. In nonlinear systems the function complexity varies throughout the input space. However, the input space can be partitioned into subspaces that are easier to handle. This can be achieved effectively using the ‘divide and conquer’ strategy…

Certainty factor

The certainty factor was introduced by Buchanan and Shortliffe [22] for a medical diagnostic system. Donald K. Wedding II and Krzysztof J. Cios conducted a comparative study of this method and other techniques for measuring RBF reliability [19]. A certainty factor can be generated by the GMN by using the μ(d(x;c,σ)) values from the expertise domain. Referring to Eq. (2), this function returns values between 0.0 and 1.0, the same range as allowed by certainty factors.
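
The sketch below illustrates this idea. Since the exact form of Eq. (2) is not reproduced in this snippet, a Gaussian of the σ-scaled distance to each expert centre is assumed; the certainty factor is then taken as the highest membership over all local experts.

```python
import numpy as np

def membership(x, c, sigma):
    """Assumed Gaussian form of the expertise-domain membership
    mu(d(x; c, sigma)); the exact shape of Eq. (2) may differ."""
    d2 = np.sum(((np.asarray(x, dtype=float) - c) / sigma) ** 2)
    return float(np.exp(-0.5 * d2))  # value in (0, 1]

def certainty_factor(x, centres, sigmas):
    """Certainty factor of a query: the highest membership over all local
    experts. Near 1.0 the query lies inside some expert's domain
    (interpolation); near 0.0 no expert covers it (extrapolation)."""
    return max(membership(x, c, s) for c, s in zip(centres, sigmas))
```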

Extrapolation problem in GMN model

GMN networks give good results for input data similar to their training data, but poor results for unfamiliar input data. Such unfamiliar data can yield highly unfavorable results that decrease the effectiveness of the network. To illustrate this problem, we construct a one-dimensional example using the Hermite polynomial function f(x) = 1.1(1 − x + 2x²) exp(−x²/2), where x ∈ ℝ. A random sampling of the interval [−2, +2] is used in obtaining…
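
As a minimal sketch of this experimental setup (the sample size and random seed are assumptions, not taken from the paper), the test function and its sampled training set can be generated as follows:

```python
import numpy as np

def hermite(x):
    """Test function f(x) = 1.1 * (1 - x + 2x**2) * exp(-x**2 / 2)."""
    return 1.1 * (1.0 - x + 2.0 * x**2) * np.exp(-0.5 * x**2)

rng = np.random.default_rng(0)
x_train = rng.uniform(-2.0, 2.0, size=200)  # random sampling of [-2, +2]
y_train = hermite(x_train)

# Any model fitted to (x_train, y_train) is only trustworthy on [-2, +2];
# a query such as x = 4 falls outside every sampled region and should be
# flagged by a low certainty factor rather than answered silently.
```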

On-line GMN identification algorithm

The on-line GMN uses a recursive learning algorithm for both structural and parametric learning. The learning process requires the allocation of new local experts depending on input novelty, as well as on-line adaptation of the network. Since the novelty of the input data is tested using the certainty factor, the on-line GMN is suited to on-line function approximation problems. The objective is to gradually develop the approximation complexity of the GMN network that is sufficient to…
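
The snippet does not give the full algorithm, but its grow-or-adapt structure can be sketched as below, reusing certainty_factor from the earlier sketch. The threshold CF_MIN, the initial width, and the update details are hypothetical placeholders, not values from the paper.

```python
import numpy as np

CF_MIN = 0.3               # hypothetical novelty threshold (not from the paper)
centres, sigmas = [], []   # expertise-domain parameters of the local experts

def online_step(x):
    """One structural step of the assumed on-line scheme: a novel input
    (low certainty factor) spawns a new local expert, while a familiar
    input is absorbed by recursive adaptation of the existing experts."""
    x = np.atleast_1d(x).astype(float)
    cf = certainty_factor(x, centres, sigmas) if centres else 0.0
    if cf < CF_MIN:
        centres.append(x.copy())        # accommodate: insert a new expert at x
        sigmas.append(np.ones_like(x))  # initial width is a placeholder
    else:
        # Generalize: gradient-based updates of centres, widths and the
        # local linear models (update rules omitted in this sketch).
        pass
```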

Equivalence to Generalized Takagi–Sugeno Fuzzy system

The local experts in the GMN are defined as linear functions. Their outputs are mediated by the expertise domain and the expertise-level gating function. Furthermore, the GMN employs a gradient-based error-correction method for parametric learning. It is therefore equivalent to a first-order generalized Takagi–Sugeno fuzzy model, in which a typical single-antecedent fuzzy rule Rk has the following form: Rk: (αk) If x is Ak then ŷk = θk(x), k = 1, 2, …, c, where x is an input vector and Ak the fuzzy set of Rk with membership…
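
The following sketch shows the standard defuzzified output of such a first-order Takagi–Sugeno system: a normalized, membership-weighted sum of the local model outputs ŷk = θk(x). The Gaussian memberships and the affine parameterization of θk are assumptions consistent with the earlier sketches, not details confirmed by this snippet.

```python
import numpy as np

def ts_output(x, centres, sigmas, thetas):
    """First-order Takagi-Sugeno inference: the output is the normalized,
    membership-weighted sum of the local affine models
    y_k = theta_k . [1, x] (the affine parameterization is assumed)."""
    x = np.atleast_1d(x).astype(float)
    mu = np.array([np.exp(-0.5 * np.sum(((x - c) / s) ** 2))
                   for c, s in zip(centres, sigmas)])      # rule activations
    y_k = np.array([th @ np.concatenate(([1.0], x)) for th in thetas])
    return float(mu @ y_k / mu.sum())                      # defuzzified output
```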

Benchmark study

The on-line GMN is tested with three benchmark modeling problems and a real plant model. For the momentum rates it is assumed that ηM = ηcM = ησM = ηαM.

System identification of a continuous-stirred tank reactor using GMN model

As an example of the application of the GMN to system identification, a plant representing a continuous-stirred tank reactor is chosen. The model of this plant is presented in [25] and is described by the following differential equations:

Ċa(t) = (q/v)(Ca0 − Ca(t)) − k0 Ca(t) e^(−E/RT(t)),
Ṫ(t) = (q/v)(T0 − T(t)) − k1 Ca(t) e^(−E/RT(t)) + k2 qc(t)(1 − e^(−k3/qc(t)))(Tc0 − T(t)).

The process describes the reaction of two products, which are mixed and react to generate a compound A with concentration Ca(t), and with…
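
For generating identification data from this plant, the two equations can be integrated with a simple forward-Euler step, as sketched below with the coolant flow qc(t) as the manipulated input. The plant constants must be taken from [25]; the dictionary keys here are placeholders and no numerical values are supplied.

```python
import numpy as np

def cstr_step(Ca, T, qc, dt, p):
    """One forward-Euler step of the two CSTR equations above. p holds the
    plant constants (q/v, Ca0, T0, Tc0, k0..k3, E/R); their numerical
    values must come from [25] and are NOT supplied here."""
    arrhenius = np.exp(-p["E_R"] / T)  # e^(-E / (R T(t)))
    dCa = p["qv"] * (p["Ca0"] - Ca) - p["k0"] * Ca * arrhenius
    dT = (p["qv"] * (p["T0"] - T) - p["k1"] * Ca * arrhenius
          + p["k2"] * qc * (1.0 - np.exp(-p["k3"] / qc)) * (p["Tc0"] - T))
    return Ca + dt * dCa, T + dt * dT
```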

Conclusions

In this paper, a GMN architecture performing as a generalized Takagi–Sugeno fuzzy system is presented. Based on the GMN, an on-line self-structuring learning algorithm is outlined, whereby the structure and the parameter estimates are obtained simultaneously. Simulation studies and comprehensive comparisons show that the GMN model offers substantial advantages in approximating both static functions and nonlinear dynamic systems. A parsimonious structure with high performance can also be…

References (25)

  • J. Moody et al., Fast learning in networks of locally tuned processing units, Neural Comput. (1989)
  • B. Fritzke, Fast learning with incremental RBF networks, Neural Process. Lett. (1994)