
Neurocomputing

Volume 71, Issues 7–9, March 2008, Pages 1203-1209

A multi-objective approach to RBF network learning

https://doi.org/10.1016/j.neucom.2007.11.021

Abstract

The problem of inductive supervised learning is discussed in this paper within the context of multi-objective (MOBJ) optimization. A smoothness-based apparent (effective) complexity measure for RBF networks is considered. For the specific case of RBF networks, bounds on the complexity measure are formally described. As experiments on synthetic and real-world data show, the proposed MOBJ learning method is capable of efficient generalization control along with network size reduction.

Introduction

The supervised learning paradigm has been widely discussed in recent years. From statistical learning theory, it is known that the conditions for good generalization are reached at the minimum of the true risk. However, this is difficult to achieve in practice, since only the empirical risk can be minimized directly. Real-world data are usually finite and disturbed by noise, so direct empirical risk minimization often leads to poor generalization and over-fitting. The key to good generalization lies in controlling the capacity (complexity) of the learning machine, as formulated by the well-known structural risk minimization (SRM) principle [21]. The search for a proper complexity can also be viewed as a problem of balancing bias and variance [8]. Pruning methods [15], regularization networks [9] and support vector machines [20] (which are a certain form of regularization) are commonly used to achieve that goal. While pruning algorithms control the complexity by manipulating the network structure (e.g. the number of nodes), regularization methods aim at controlling the network output response in spaces of smoothness [19], thereby manipulating the so-called apparent complexity. Being a more general representation of complexity, the apparent complexity, as a measure of smoothness, may restrict the learning capabilities of the structure. Another example of apparent complexity is the known relationship between the learning capacity of multi-layer perceptrons (MLPs) and the size of their weights [3].

The approaches mentioned above jointly minimize network complexity and empirical error in the form of a single loss function that usually consists of an error cost function and a regularizer. Although this may result in models with good generalization, they are highly dependent on user-defined training parameters. In addition, it is well known that error and complexity are conflicting objectives and, similarly to bias and variance, they demand a trade-off rather than joint minimization. This viewpoint led to the development of multi-objective machine learning (MOML) methods [10], which treat the empirical risk and the learning capacity of a hypothesis class as two separate objectives. Evolutionary algorithms [11] are commonly used for solving MOML problems; however, [4], [6], [16] show the efficiency of multi-objective (MOBJ) algorithms based on nonlinear programming.

Consider a neural network of a certain structure whose input–output response is determined by the parameter vector $\omega \in \Omega$. Introducing the error and complexity objective functions $\varphi_e(\omega)$ and $\varphi_c(\omega)$, respectively, with the first representing the empirical risk and the second representing the learning capacity, we may formulate the vector optimization problem for MOML as
$$\min_{\omega \in \Omega}\ [\varphi_e(\omega),\ \varphi_c(\omega)]. \tag{1}$$
Since the objectives are conflicting in the region of interest, the solution of (1) results in the set of Pareto-optimal solutions
$$\Omega^* = \{\omega^* \in \Omega : \nexists\,\omega \in \Omega \ \text{such that}\ \varphi_e(\omega) \le \varphi_e(\omega^*) \ \text{and}\ \varphi_c(\omega) \le \varphi_c(\omega^*), \ \text{with at least one inequality strict}\},$$
which forms the so-called Pareto front of the feasible region. In other words, the Pareto front $\Omega^*$ contains the solutions which represent the best compromise between the two objectives: for each solution $\omega \notin \Omega^*$ there always exists at least one solution $\omega^* \in \Omega^*$ having lower complexity and error. As is common, the squared error criterion and the norm of the network weights $\|w\|$ are taken as the error and complexity measures for MLP networks [16]. To the authors' knowledge, a general definition for assessing the apparent complexity of other network types is not known, which is an obstacle for MOBJ learning. In this paper we present a smoothness-based apparent complexity measure and determine its bounds for radial basis function (RBF) neural networks. In particular, we further show that the apparent complexity can be bounded by the ratio of the 1-norm of the weight vector to the width of the RBFs. It is also demonstrated that this form of complexity control leads to sparse weights, eventually making the proposed MOBJ learning method capable of reducing the network size.
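
As an illustration of the non-dominance relation behind this definition, here is a minimal sketch (Python with NumPy; the helper name pareto_front and the pre-computed objective values are illustrative assumptions, not part of the paper) that filters a finite set of candidate solutions down to its Pareto-optimal subset:

```python
import numpy as np

def pareto_front(phi_e, phi_c):
    """Return indices of non-dominated candidates.

    phi_e, phi_c: 1-D arrays of pre-computed empirical-error and complexity
    values, one entry per candidate solution. A candidate i is dominated if
    some other candidate is no worse in both objectives and strictly better
    in at least one of them.
    """
    phi_e, phi_c = np.asarray(phi_e), np.asarray(phi_c)
    keep = []
    for i in range(len(phi_e)):
        dominated = np.any(
            (phi_e <= phi_e[i]) & (phi_c <= phi_c[i])
            & ((phi_e < phi_e[i]) | (phi_c < phi_c[i]))
        )
        if not dominated:
            keep.append(i)
    return keep

# Candidate 2 is dominated by candidate 1, so only 0 and 1 survive.
print(pareto_front([0.10, 0.30, 0.35], [5.0, 1.0, 1.2]))  # -> [0, 1]
```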

Section snippets

Apparent complexity measure for RBF networks

Consider a smooth input–output mapping function $f : X \to \mathbb{R}$ with compact support in $X \subset \mathbb{R}^n$. In general, one can express the smoothness of $f$ in the Sobolev space [1] $W^{k,p}(X)$, endowed with the norm
$$\|f\|_{k,p} = \Bigl(\sum_{0 \le |r| \le k} \|D^r f\|_p^p\Bigr)^{1/p}$$
or its equivalent norm
$$\|f\|_{k,p} = \|f\|_p + \sum_{|r| = k} \|D^r f\|_p.$$
Here
$$D^r f(x) = \frac{\partial^{|r|}}{\partial x_1^{r_1} \partial x_2^{r_2} \cdots \partial x_n^{r_n}} f(x), \qquad D^0 f = f(x),$$
is the $r$th-order generalized partial derivative operator, where $r = (r_1, r_2, \ldots, r_n)$ and $|r| = \sum_{i=1}^{n} r_i$, and the $L_p$-norm is defined as $\|f\|_p = \bigl(\int_X |f(x)|^p\,\mathrm{d}x\bigr)^{1/p}$, $p \ge 1$.

In the particular case $p = 2$, the spaces $W^{k,2}$ are
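
As a rough numerical illustration of these quantities (a sketch under my own assumptions: a one-dimensional Gaussian RBF expansion, crude grid quadrature for the integrals and $p = 2$; the names rbf_model, centers, sigma and w are illustrative), one can estimate $\|f\|_2$ and $\|D^1 f\|_2$ for a given network and compare them with the $\|w\|_1/\sigma$ ratio that the Introduction relates to the apparent complexity:

```python
import numpy as np

def rbf_model(x, centers, sigma, w):
    """Gaussian RBF expansion f(x) = sum_i w_i * exp(-(x - c_i)^2 / (2 sigma^2))."""
    phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2.0 * sigma ** 2))
    return phi @ w

# Illustrative 1-D network (all numbers are arbitrary examples).
centers, sigma, w = np.array([-1.0, 0.0, 1.5]), 0.5, np.array([0.8, -0.3, 1.1])

# Dense grid for crude quadrature of the L_2 norms.
x = np.linspace(-4.0, 4.0, 4001)
dx = x[1] - x[0]
f = rbf_model(x, centers, sigma, w)
df = np.gradient(f, dx)                           # finite-difference first derivative

l2_f = np.sqrt(np.sum(f ** 2) * dx)               # ||f||_2
l2_df = np.sqrt(np.sum(df ** 2) * dx)             # ||D^1 f||_2, the k = 1 Sobolev term
complexity_proxy = np.linalg.norm(w, 1) / sigma   # ||w||_1 / sigma

print(l2_f + l2_df, complexity_proxy)
```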

MOBJ learning

Usually, the vector optimization problem (1) is difficult to solve analytically; however, in practice it is enough to approximate the Pareto front $\Omega^*$ with a finite number of solutions. According to the MOBJ learning concept, once such an approximation is obtained, the resulting "best" solution must be selected from $\Omega^*$ in a decision-making step, with respect to an a posteriori model selection criterion (e.g. minimum validation error or maximum entropy).
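
The paper's own algorithm for generating this approximation is only summarized in this excerpt, so the sketch below substitutes a simple stand-in: sweeping a ridge-style scalarization of the two objectives over the output weights of an RBF network with fixed centers and width (all of these choices are assumptions, not the paper's method), and then performing the a posteriori decision step by minimum validation error:

```python
import numpy as np

def design_matrix(x, centers, sigma):
    """Gaussian RBF design matrix Phi[i, j] = exp(-(x_i - c_j)^2 / (2 sigma^2))."""
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2.0 * sigma ** 2))

def sweep_pareto(x_tr, y_tr, x_val, y_val, centers, sigma, lambdas):
    """Approximate the error/complexity trade-off by sweeping a ridge penalty
    (a scalarization stand-in for the paper's MOBJ formulation), then pick the
    model with minimum validation error as the a posteriori decision step."""
    Phi_tr = design_matrix(x_tr, centers, sigma)
    Phi_val = design_matrix(x_val, centers, sigma)
    best, front = None, []
    for lam in lambdas:
        # Regularized least squares for the output weights.
        w = np.linalg.solve(Phi_tr.T @ Phi_tr + lam * np.eye(len(centers)),
                            Phi_tr.T @ y_tr)
        phi_e = np.mean((Phi_tr @ w - y_tr) ** 2)     # empirical error
        phi_c = np.linalg.norm(w, 1) / sigma          # complexity proxy
        val = np.mean((Phi_val @ w - y_val) ** 2)     # decision criterion
        front.append((phi_e, phi_c))
        if best is None or val < best[0]:
            best = (val, w)
    return front, best[1]

# Toy usage on noisy sine data (all settings are illustrative).
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-3, 3, 80))
y = np.sin(x) + 0.1 * rng.standard_normal(80)
x_tr, y_tr, x_val, y_val = x[::2], y[::2], x[1::2], y[1::2]
front, w = sweep_pareto(x_tr, y_tr, x_val, y_val,
                        centers=np.linspace(-3, 3, 15), sigma=0.7,
                        lambdas=np.logspace(-6, 2, 30))
```

The sweep yields a finite list of (error, complexity) pairs approximating the front, while only the model chosen in the decision step is returned for use.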

A variety of methods for obtaining

Experiments and discussion

In this section, we present several experiments on both synthetic and real-world data. All experiments follow the same MOBJ learning scheme described in Section 3. The classical ellipsoid method was applied for solving the optimization problem (10).
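
Since the optimization problem (10) is not reproduced in this excerpt, the following is only a generic textbook sketch of the classical central-cut ellipsoid method for minimizing an arbitrary convex function; the oracle interface and the toy objective are assumptions, not the paper's formulation:

```python
import numpy as np

def ellipsoid_minimize(f, grad, x0, radius=10.0, iters=200):
    """Central-cut ellipsoid method for convex minimization.

    f, grad : objective and a (sub)gradient oracle.
    x0, radius : center and radius of an initial ball assumed to contain a minimizer.
    Generic textbook sketch (dimension >= 2), not the paper's problem (10).
    """
    n = len(x0)
    x = np.asarray(x0, dtype=float)
    P = (radius ** 2) * np.eye(n)        # ellipsoid {z : (z - x)^T P^{-1} (z - x) <= 1}
    best_x, best_f = x.copy(), f(x)
    for _ in range(iters):
        g = grad(x)
        g_norm = np.sqrt(g @ P @ g)
        if g_norm < 1e-12:
            break
        g_tilde = g / g_norm
        x = x - (1.0 / (n + 1)) * (P @ g_tilde)
        P = (n ** 2 / (n ** 2 - 1.0)) * (
            P - (2.0 / (n + 1)) * np.outer(P @ g_tilde, P @ g_tilde)
        )
        if f(x) < best_f:
            best_x, best_f = x.copy(), f(x)
    return best_x

# Toy usage: minimize a simple convex quadratic, z^T z + z_0.
x_star = ellipsoid_minimize(lambda z: z @ z + z[0],
                            lambda z: 2 * z + np.array([1.0, 0.0]),
                            x0=np.array([3.0, -2.0]))
```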

Conclusions

In this work, we derive bounds on the smoothness-based apparent complexity of RBF networks by exploring a formulation of smoothness in Sobolev spaces. It is confirmed experimentally that the proposed bounds, when used within the MOBJ scheme, provide efficient solutions to learning problems.

The proposed MOBJ method controls generalization in loss-smoothness spaces similarly to regularization learning but, in addition, it provides a complete search for solutions, regardless of the convexity of

Acknowledgements

The authors are grateful to CNPq, Brazil for financial support.


References (21)

  • M.A. Costa et al., Training neural networks with a multi-objective sliding mode control algorithm, Neurocomputing (2003).
  • R.A. Adams et al., Sobolev Spaces (2003).
  • A. Asuncion, D. Newman, UCI Machine Learning Repository, URL 〈http://www.ics.uci.edu/∼mlearn/MLRepository.html〉, ...
  • P.L. Bartlett, For valid generalization the size of the weights is more important than the size of the network.
  • A.P. Braga et al., Multi-objective algorithms for neural-networks learning.
  • M.A. Costa, A.P. Braga, Optimization of neural networks with multi-objective lasso algorithm, in: International Joint ...
  • B. Efron, T. Hastie, I. Johnstone, R. Tibshirani, Least angle regression, Technical Report, Department of Statistics, ...
  • S. Geman et al., Neural networks and the bias/variance dilemma, Neural Comput. (1992).
  • F. Girosi et al., Regularization theory and neural networks architectures, Neural Comput. (1995).
There are more references available in the full text version of this article.


Illya Kokshenev received the M.S. degree in Computer Science from the Kharkov National University of Radioelectronics (Ukraine) in 2004. He is currently a Ph.D. student at the Department of Electronic Engineering, Federal University of Minas Gerais (Brazil). His research interests cover machine learning, neural networks, neuro-fuzzy systems and, presently, multi-objective machine learning.

Antonio Padua Braga was born in Brazil in 1963. He obtained his first degree in Electrical Engineering and his Master's in Computer Science, both at the Federal University of Minas Gerais, Brazil. The main subject of his Ph.D., received from the University of London, England, was the storage capacity of Boolean neural systems. After finishing his Ph.D. he returned to his position as a professor at the Department of Electronics Engineering at the Federal University of Minas Gerais. He has published several papers in international journals and conferences and has written two books on neural networks. He is the head of the Computational Intelligence Laboratory at the Department of Electronics and was co-editor-in-chief of the International Journal of Computational Intelligence and Applications (IJCIA), published by Imperial College Press. His current research interests are the development of learning algorithms for neural networks, hybrid neural systems and applications of neural networks.
