An evolutionary constructive and pruning algorithm for artificial neural networks and its prediction applications
Introduction
Many numerical algorithms have been proposed to accurately predict future trends in time series, including the autocorrelation method [1], the covariance method [2], and grey theory [3]. Recently, to further improve the accuracy of time series prediction, investigators have focused on intelligent algorithms based on artificial neural networks (ANNs) due to their learning abilities and powerful prediction capabilities [4], [5].
ANNs were first developed to imitate biological neural systems and are organized into several interconnected simple processing units called neurons or nodes. ANNs are data-driven approaches that learn from examples, even when the input–output relationships are unknown [6]. Thus, ANNs can accurately solve problems without prior knowledge when sufficient observed data are supplied. This property is useful for evaluating numerous forecasting problems because acquiring data is easier than making good theoretical guesses about certain systems.
An important component of every ANN is architecture selection, which involves determining an appropriate architecture to accurately fit the underlying function described by the training data [7]. An architecture that is too large may precisely fit the training data but may provide poor generalization due to overfitting of the training data. Conversely, an architecture that is too small saves computational costs but may not possess sufficient processing ability to accurately approximate the underlying function. Therefore, architecture selection should consider both network complexity and goodness of fit.
For prediction purposes, it has been shown that a feedforward ANN with a single hidden layer is sufficient to achieve any desired accuracy [8]. In most applications, ANNs are fully connected, i.e., all inputs are fully connected to all hidden neurons. Numerous studies have shown that partially connected ANNs have better storage capability per connection than fully connected ANNs [9], [10]. Furthermore, partially connected ANNs can yield improved generalization capabilities with reduced cost in terms of hardware and processing time [11]. However, how to determine the optimal numbers of hidden neurons and connections remains an open question.
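To make the notion of partial connectivity concrete, the following sketch represents input-to-hidden connectivity with a binary mask, so that only the retained connections contribute to each hidden neuron. The function and variable names, the tanh activation, and the toy weights are illustrative assumptions, not taken from the paper.

```python
import math

def forward(x, w_ih, mask, w_ho, b_h, b_o):
    """Forward pass of a three-layer ANN whose input-to-hidden
    connectivity is restricted by a binary mask (1 = connection kept)."""
    hidden = []
    for j in range(len(w_ih)):
        s = b_h[j] + sum(w_ih[j][i] * x[i] for i in range(len(x)) if mask[j][i])
        hidden.append(math.tanh(s))
    return b_o + sum(w_ho[j] * hidden[j] for j in range(len(hidden)))

# Two inputs, two hidden neurons; neuron 0 sees only input 0 and
# neuron 1 sees only input 1 -- a partially connected network.
w_ih = [[0.5, 0.0], [0.0, -0.3]]
mask = [[1, 0], [0, 1]]
y = forward([1.0, 2.0], w_ih, mask, [1.0, 1.0], [0.0, 0.0], 0.0)
```

Zeroed mask entries cost neither storage for meaningful weights nor computation, which is the practical appeal of partial connectivity noted above.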
Among several algorithms for designing three-layered ANNs, the most frequently used are constructive, pruning, and constructive-pruning algorithms. A constructive algorithm [12] starts with a minimal ANN architecture: a three-layered ANN with one hidden neuron. The algorithm adds hidden neurons to the minimal ANN, one by one, during the training phase. The advantage of the constructive algorithm is that the initial phase simply sets both the number of hidden layers and the number of hidden neurons to one. However, deciding when to add hidden neurons or connections and when to stop the addition process is difficult.
A pruning algorithm [13] starts with an oversized architecture and then deletes unnecessary hidden neurons or connections, either during training or upon convergence to a local minimum. Each iteration of the pruning algorithm determines which unit, i.e., which hidden neuron or connection, to prune via its relevance or significance. Several pruning criteria have been proposed, for example, sensitivity analysis [14] and magnitude-based pruning [15]. Sensitivity analysis is based on Taylor expansion and reflects the ways in which the derivatives of a performance function can be applied to quantify a system's response to unit perturbations. Magnitude-based pruning assumes that small weights are irrelevant. However, no criterion can be used to determine the initially oversized architecture for a given problem [12].
In the constructive algorithm, the architecture of the ANN may become oversized if the addition procedure is not appropriately stopped. A number of algorithms have attempted to combine constructive and pruning algorithms to solve the aforementioned problem [16], [17]. These constructive-pruning algorithms first estimate the number of hidden neurons and/or connections via a constructive method. A pruning method is then used to delete the inappropriate hidden neurons and/or connections to find a near-optimal architecture for a given problem. However, determining when to stop the pruning procedure is difficult [18].
Several researchers have developed methods for designing ANNs using evolutionary algorithms (EAs). EAs emerged as a biologically plausible approach for adapting various ANN parameters, such as weight values and architectures [19]. Recently, several studies have employed various EAs to prune neural networks. Mantzaris et al. [20] pruned a probabilistic neural network with a genetic algorithm to minimize the number of diagnostic factors, thereby minimizing the number of input nodes and hidden layers. Curry and Morgan [21] proposed a modified feedforward neural network that is pruned and optimized by means of differential evolution for seasonal data. Huang and Du [22] used particle swarm optimization to prune radial basis probabilistic neural networks. Masutti and Castro [23] combined characteristics of self-organizing networks and artificial immune systems to solve the traveling salesman problem, pruning neurons that are not related to a city. Furthermore, numerous works have applied EAs and pruning methods separately or simultaneously. Kaylani et al. [24] incorporated a pruning operator into a genetic algorithm as a mutation operator to design ARTMAP architectures for classification problems. Goh et al. [25] developed a hybrid multiobjective evolutionary approach for adapting ANN structures, together with a geometrical approach for identifying hidden neurons to prune, for classification problems. Hervás-Martínez et al. [26] applied an evolutionary algorithm to design the structure and weights of a product-unit neural network and then used a backward stepwise procedure to prune variables sequentially until no further pruning could improve the fit. However, most encoding schemes must predefine the chromosome length, which is problem-dependent. This user-defined length can affect the flexibility of problem representation and EA efficiency [27], [28].
Herein, we propose a new approach for designing ANNs, the evolutionary constructive and pruning algorithm (ECPA). This algorithm directs the evolution of the ANN topology using constructive and pruning methods in an evolutionary manner. In ECPA, a variable-length chromosome representation is adopted to describe ANNs with different architectures. Thus, the chromosome length need not be predefined, which makes memory usage more efficient. Furthermore, ECPA introduces the constructive concept into the crossover and mutation operations, so that the initial structure of the ANN can simply be set as a minimal network containing one hidden neuron with a single connection to one input. The crossover and mutation operations then enlarge the architecture by adding hidden neurons and connections. ECPA then prunes the resulting ANNs via a newly developed scheme consisting of cluster-based pruning (CBP) and age-based survival selection (ABSS).
The rest of this paper is organized as follows. Section 2 describes the proposed ECPA in detail. Section 3 demonstrates the proposed algorithm's ability to evolve partially connected ANNs for a variety of problems of interest. Finally, in Section 4, we present our conclusions.
ECPA
Based on the characteristics of ANNs and EAs, we propose ECPA, which develops ANNs in an evolutionary constructive and pruning manner. As discussed in [29], theoretical work has shown that a single hidden layer is sufficient for forecasting purposes. Therefore, in this work, we designed a three-layer feedforward ANN with an input layer, a hidden layer, and an output layer. The major steps of ECPA are summarized in Fig. 1 and explained below.
Initialization phase
- (Step 1) Generate an initial population
Experimental results
In this section, we demonstrate the performance of the proposed algorithm using three time series prediction problems: Mackey-Glass, sunspots, and vehicle count. The first time series is generated from the Mackey-Glass differential equation, the second series consists of recorded sunspot data, and the third series is obtained from the hourly vehicle count for the Monash Freeway outside Melbourne in Victoria, Australia, beginning in August 1995.
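The Mackey-Glass series is commonly generated by discretizing the delay differential equation dx/dt = a·x(t−τ)/(1 + x(t−τ)^10) − b·x(t). The sketch below uses the typical benchmark settings a = 0.2, b = 0.1, τ = 17 with unit-step Euler integration and a constant initial history; the paper's exact generation settings are not given in this excerpt.

```python
def mackey_glass(n_samples, a=0.2, b=0.1, tau=17, dt=1.0, x0=1.2):
    """Generate the Mackey-Glass time series by Euler integration of
    dx/dt = a*x(t-tau)/(1 + x(t-tau)**10) - b*x(t)."""
    history = [x0] * (tau + 1)       # constant history before t = 0
    series = []
    x = x0
    for _ in range(n_samples):
        x_tau = history[-(tau + 1)]  # delayed value x(t - tau)
        x = x + dt * (a * x_tau / (1.0 + x_tau ** 10) - b * x)
        history.append(x)
        series.append(x)
    return series

s = mackey_glass(500)
```

With these parameters the series is chaotic, which is why it is a standard benchmark for one-step-ahead prediction.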
Conclusions
A novel structural learning algorithm, called ECPA, is proposed for the design of ANNs based on an evolutionary constructive and pruning algorithm. ECPA evolves the ANNs starting with a minimal structure: one hidden neuron connected to an input node. The crossover and mutation operations make the ANN structures more complex, whereas CBP and ABSS make the ANN structures more compact. The results of the numerical simulations show that the use of CBP and ABSS operations indeed generates compact ANN architectures.
Acknowledgments
This work was supported in part by the National Science Council, Taiwan, R.O.C., under Contract No. NSC 99-2221-E-009-107 and in part by a grant provided by the Industrial Technology Research Institute under Contract No. A353C40000B1-4.
References (46)
- et al., Multilayer feedforward networks are universal approximators, Neural Networks, 1989.
- et al., Creating artificial neural networks that generalize, Neural Networks, 1991.
- et al., Back-propagation algorithm which varies the number of hidden units, Neural Networks, 1991.
- et al., Genetic algorithm pruning of probabilistic neural networks in medical disease estimation, Neural Networks, 2011.
- et al., Neuro-immune approach to solve routing problems, Neurocomputing, 2009.
- et al., AG-ART: an adaptive approach to evolving ART architectures, Neurocomputing, 2009.
- et al., Multilogistic regression by means of evolutionary product-unit neural networks, Neural Networks, 2008.
- et al., Forecasting with artificial neural networks: the state of the art, Int. J. Forecast., 1998.
- et al., Perturbation method for deleting redundant inputs of perceptron networks, Neurocomputing, 1997.
- et al., Time series prediction using evolving radial basis function networks with new encoding scheme, Neurocomputing, 2008.
- The effect of different basis functions on a radial basis function network for time series prediction: a comparative study, Neurocomputing.
- Time series analysis using normalized PG-RBF network with regression weights, Neurocomputing.
- Time-series prediction using a local linear wavelet neural network, Neurocomputing.
- Radial basis function based adaptive fuzzy systems and their applications to system identification and prediction, Fuzzy Sets Syst.
- Adaptive neural network model for time-series forecasting, Eur. J. Oper. Res.
- Time series forecasting using a hybrid ARIMA and neural network model, Neurocomputing.
- Adaptive metrics in the nearest neighbours method, Phys. D.
- Linear Prediction of Speech.
- Linear prediction: a tutorial review, Proc. IEEE.
- Introduction to grey system theory, J. Grey Syst.
- How effective are neural networks at forecasting and prediction? A review and evaluation, J. Forecast.
- Intelligent forecasting system using grey model combined with neural network, Int. J. Fuzzy Syst.
- Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence.
Shih-Hung Yang received his B.S. degree in Mechanical Engineering and M.S. degree in Electrical and Control Engineering from National Chiao Tung University, Taiwan, in 2002 and 2004, respectively. He is currently working toward the Ph.D. degree in Electrical and Control Engineering at National Chiao Tung University, Taiwan. His research interests include machine vision, neural networks, and evolutionary computation. He was a recipient of the Outstanding Teaching Assistant Award from the ECE Department at National Chiao Tung University in 2008 and 2009, and Student Scholarships from the IEEE Industrial Electronics Society and IEEE Computational Intelligence Society in 2010 and 2011, respectively.
Yon-Ping Chen received his B.S. degree in Electrical Engineering from National Taiwan University, Taiwan, in 1981, and M.S. and Ph.D. degrees in Electrical Engineering from the University of Texas at Arlington, USA, in 1986 and 1989, respectively. He is a Distinguished Professor in the Department of Electrical Engineering, National Chiao Tung University, Taiwan. His research interests include control, image signal processing, and intelligent system design.