Artificial neural networks with evolutionary instance selection for financial forecasting
Introduction
In general, artificial neural networks (ANNs) can produce robust performance when a large amount of data is available. However, ANNs often exhibit inconsistent and unpredictable performance on noisy data. In addition, when a data set is too large, it may be impossible to train an ANN, or the training task cannot be carried out effectively, without data reduction. Data reduction can be achieved in many ways, such as feature selection or feature discretization (Blum and Langley, 1997, Kim and Han, 2000, Liu and Motoda, 1998).
For this reason, one facet of data mining concerns the selection of relevant instances. Instances are the training examples in supervised learning, and instance selection chooses a subset of the data that is representative of, and relevant to, the characteristics of the whole data set. Instance selection is one of the popular methods for dimensionality reduction and is directly related to data reduction. It is the most complex form of data reduction, because computationally expensive prediction methods must be invoked repeatedly to assess the effectiveness of each selection; nevertheless, it can usually remove irrelevant instances as well as noisy and redundant data (Liu and Motoda, 2001, Weiss and Indurkhya, 1998).
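As a concrete illustration of removing noisy instances (a minimal sketch of classical Wilson-style editing, not the method proposed in this paper; the data and parameters are illustrative assumptions), an instance can be kept only when the majority of its nearest neighbors share its label:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two well-separated Gaussian clusters, with three labels
# deliberately flipped to simulate noisy instances.
X = np.vstack([rng.normal(-2, 1, (30, 2)), rng.normal(2, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)
y[:3] = 1  # flipped (noisy) labels

def edited_nn(X, y, k=3):
    """Wilson editing: keep an instance only if the majority of its
    k nearest neighbors (excluding itself) share its label."""
    keep = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]          # skip index 0: the point itself
        if (y[nn] == y[i]).sum() > k / 2:
            keep.append(i)
    return np.array(keep)

kept = edited_nn(X, y)   # indices of retained instances
```

The noisy points sit inside the opposite cluster, so their neighbors disagree with their label and they are dropped, while almost all clean instances survive.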
Many researchers have suggested instance selection methods such as squashed data, critical points, and prototype construction, in addition to many forms of sampling (Liu & Motoda, 2001). Efforts to select relevant instances from an initial data set have stemmed from the need to reduce immense storage requirements and computational loads (Kuncheva, 1995). Another perspective on this subject, as pointed out in Dasarathy (1990), is to achieve enhanced performance of the learning algorithm through instance selection. In addition, training time may be shortened by using a proper instance selection algorithm.
This paper proposes a new hybrid model of ANN and genetic algorithms (GAs) for instance selection. An evolutionary instance selection algorithm reduces the dimensionality of the data and may eliminate noisy and irrelevant instances. In addition, this study simultaneously searches for the connection weights between layers in the ANN through an evolutionary search. The genetically evolved connection weights mitigate the well-known limitations of the gradient descent algorithm.
The rest of this paper is organized as follows: Section 2 presents the research background. Section 3 proposes the evolutionary instance selection algorithm and describes the benefits of the proposed algorithm. Section 4 describes the application of the proposed algorithm. Conclusions and the limitations of this study are presented in Section 5.
Research background
For some applications, the quality of data mining is improved with additional instances. However, a larger number of instances tends to increase the complexity of the induced solution. Increased complexity is not desirable, but it may be the price to pay for better performance. In addition, increased complexity decreases the interpretability of the result (Weiss & Indurkhya, 1998). In this sense, many researchers have suggested instance selection methods. The following sections present some instance selection methods.
A GA approach to instance selection for ANN
As mentioned earlier, there are many studies on instance selection for instance-based learning algorithms, but few studies, and hence few established theories, on instance selection for ANN. This paper proposes a GA approach to instance selection for ANN (GAIS). The overall framework of GAIS is shown in Fig. 1. In this study, the GA supports the simultaneous optimization of connection weights and selection of relevant instances.
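The paper's full GAIS design is detailed in the text; as a hedged sketch of the general idea only (all names, network sizes, and GA parameters below are my own illustrative assumptions, not the author's specification), a chromosome can concatenate N instance-selection bits with the network's connection weights, with fitness measured as the evolved network's accuracy on the selected instances:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-feature data standing in for noisy market indicators;
# a few labels are flipped to mimic noise in the training set.
X = rng.normal(size=(80, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
y[:8] = 1 - y[:8]

N, D, H = len(X), X.shape[1], 4      # instances, input units, hidden units
W_LEN = D * H + H                    # hidden + output weights (biases omitted)

def predict(w, Xs):
    """Forward pass of a tiny one-hidden-layer net using evolved weights."""
    W1 = w[:D * H].reshape(D, H)
    w2 = w[D * H:]
    return (1 / (1 + np.exp(-(np.tanh(Xs @ W1) @ w2))) > 0.5).astype(float)

def fitness(chrom):
    mask = chrom[:N] > 0.5           # first N genes select instances
    w = chrom[N:]                    # remaining genes are connection weights
    if mask.sum() < 10:              # penalize degenerate selections
        return 0.0
    return (predict(w, X[mask]) == y[mask]).mean()

def evolve(pop_size=40, gens=30):
    pop = np.hstack([rng.random((pop_size, N)),
                     rng.normal(scale=0.5, size=(pop_size, W_LEN))])
    for _ in range(gens):
        fit = np.array([fitness(c) for c in pop])
        # tournament selection of size 3
        winners = [max(rng.integers(0, pop_size, 3), key=lambda i: fit[i])
                   for _ in range(pop_size)]
        parents = pop[winners]
        # uniform crossover with a shuffled mate, then Gaussian mutation
        mates = parents[rng.permutation(pop_size)]
        pop = np.where(rng.random(parents.shape) < 0.5, parents, mates)
        pop += (rng.random(pop.shape) < 0.05) * rng.normal(scale=0.3,
                                                           size=pop.shape)
    fit = np.array([fitness(c) for c in pop])
    return pop[fit.argmax()], fit.max()

best, best_fit = evolve()
```

Encoding both decisions in one chromosome is what lets a single evolutionary search optimize the connection weights and the instance subset simultaneously, rather than in separate stages.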
Application: analysis of the stock market data
This section applies GAIS to stock market prediction. The efficiency and effectiveness of GAIS may be properly tested here because stock market data is very noisy and complex. Many studies on stock market prediction using artificial intelligence techniques were performed in the past decade. Some of them, however, did not produce outstanding prediction accuracy, partly because of the tremendous noise and non-stationary characteristics in stock market data. If these factors are not appropriately handled, it is difficult to produce accurate predictions.
Concluding remarks
Prior studies tried to optimize the controlling parameters of ANN using global search algorithms. Some of them focused only on the optimization of the connection weights of the ANN; others placed little emphasis on the optimization of the learning algorithm itself; but most studies paid little attention to instance selection for ANN. In this paper, I use the GA for the ANN in two ways. First, I use the GA to determine the connection weights between layers. This may mitigate the well-known limitations of the gradient descent algorithm.
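A minimal sketch of that first use, evolving the connection weights directly with an elitist evolutionary strategy rather than backpropagation (the XOR task, network size, and mutation scale are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(2)

# XOR: a classic target on which gradient descent can stall in poor
# regions, while a population-based search needs no gradients at all.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

def forward(w, X):
    """2-2-1 network: 9 genes = 4 hidden weights + 2 hidden biases
    + 2 output weights + 1 output bias."""
    W1, b1 = w[:4].reshape(2, 2), w[4:6]
    w2, b2 = w[6:8], w[8]
    return 1 / (1 + np.exp(-(np.tanh(X @ W1 + b1) @ w2 + b2)))

def fitness(w):
    return -np.mean((forward(w, X) - y) ** 2)   # negative MSE

pop = rng.normal(size=(60, 9))
for _ in range(200):
    fit = np.array([fitness(w) for w in pop])
    elite = pop[np.argsort(fit)[-20:]]           # keep the 20 best intact
    children = (elite[rng.integers(0, 20, 40)]
                + rng.normal(scale=0.2, size=(40, 9)))   # mutated copies
    pop = np.vstack([elite, children])

best = max(pop, key=fitness)
```

Because the elite survive unchanged, the best fitness never regresses, and the search quickly beats the trivial constant-output network without computing a single gradient.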
References

- Blum and Langley, Selection of relevant features and examples in machine learning, Artificial Intelligence (1997)
- Comparing backpropagation with a genetic algorithm for neural network training, Omega (1999)
- A framework for the description of evolutionary algorithms, European Journal of Operational Research (2000)
- Kim and Han, Genetic algorithms approach to feature discretization in artificial neural networks for the prediction of stock price index, Expert Systems with Applications (2000)
- ‘Change-glasses’ approach in pattern recognition, Pattern Recognition Letters (1993)
- Kuncheva, Editing for the k-nearest neighbors rule by a genetic algorithm, Pattern Recognition Letters (1995)
- Expert systems for predicting stock market timing using a candlestick chart, Expert Systems with Applications (1999)
- Automating case selection in the construction of a case library, Knowledge Based Systems (2000)
- Using change-point detection to support artificial neural networks for interest rates forecasting, Expert Systems with Applications (2000)
- Toward global optimization of neural networks: A comparison of the genetic algorithm and backpropagation, Decision Support Systems (1998)