Treatment of skewed multi-dimensional training data to facilitate the task of engineering neural models

https://doi.org/10.1016/j.eswa.2006.07.010

Abstract

Successful application of neural network models relies heavily on problem-dependent internal parameters. As the theory does not facilitate the choice of the optimal parameters of a neural model, these can only be obtained through a tedious trial-and-error process. The process requires performing multiple training simulations with various network parameters until the satisfactory performance criteria of a neural model are met. The literature shows that neural models do not predict consistently well under highly skewed data. Consequently, the cost of engineering neural models rises in such circumstances, as appropriate internal parameters must be sought. The aim of this paper is to show that a recently proposed treatment of highly skewed data eases the task of practitioners in engineering neural network models to meet satisfactory performance criteria. As the applications of neural models grow dramatically in diverse engineering domains, an understanding of this treatment has indispensable practical value.

Introduction

A remarkable increase in the employment of Artificial Neural Networks (ANNs) has been witnessed in science and engineering disciplines over the recent decade. ANNs are seen as a highly promising modeling technology because of their ability to learn the behavior of the system under inspection from samples, and they have become a standard method for providing solutions to optimization, regression and estimation problems.

Multi-Layer Perceptron (MLP) neural networks have been widely used to model the complex interdependencies and phenomena that appear in engineering applications. At the practical level, the standard approach to using neural models requires a finite set of training examples and the setting of a certain number of network parameters that have to be fixed a priori. For a successful application of MLP neural networks, one should determine internal parameters, such as the initial weights and the network structure, so as to meet the required performance criteria (Cigizoglu and Alp, 2006, Ghedira and Bernier, 2004). This is one of the main problems in engineering neural network models, as an inadequate network would be unable to learn. The problem of finding a suitable architecture and the corresponding weights of the network, however, is a very complex task (García-Pedrajas et al., 2005, Saxén and Pettersson, 2006), and the main difficulty arises from the fact that the theory does not provide instruments for choosing optimal values for these parameters. This, in turn, results in the need to train an ANN multiple times with different initializations and architectures, a process sketched below. Consequently, the practitioner is involved in a process that requires more development time as well as experience and intuition (Krimpenis & Vosniakos, 2005). Furthermore, this approach does not guarantee the optimality of the obtained parameters for a given problem (Esposito, Aversano, & Quek, 2001).
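As a concrete illustration of this trial-and-error process, the following minimal sketch trains the same regression task repeatedly under a handful of candidate architectures and weight initializations and keeps the configuration with the lowest validation error. It is only an illustrative sketch, not the procedure used in this paper: the helper name `trial_and_error_search`, the placeholder arrays `X_train`, `y_train`, `X_val`, `y_val`, the candidate architectures and the use of scikit-learn's `MLPRegressor` are all assumptions introduced here.

```python
# Illustrative trial-and-error search over network structures and
# weight initializations (random seeds). X_train, y_train, X_val, y_val
# are placeholders for the practitioner's own data.
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error


def trial_and_error_search(X_train, y_train, X_val, y_val,
                           architectures=((5, 10), (10, 10), (15, 10)),
                           seeds=(0, 1, 2)):
    best = None
    for hidden in architectures:      # candidate hidden-layer structures
        for seed in seeds:            # candidate weight initializations
            net = MLPRegressor(hidden_layer_sizes=hidden,
                               random_state=seed, max_iter=2000)
            net.fit(X_train, y_train)
            mse = mean_squared_error(y_val, net.predict(X_val))
            if best is None or mse < best[0]:
                best = (mse, hidden, seed, net)
    return best  # lowest validation error and the parameters that produced it
```

Even this small grid already requires nine full training runs; this development cost is precisely what the data treatment discussed below aims to reduce.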

Methods for finding optimal parameters have recently been proposed in the literature (Ma and Khorasani, 2003, García-Pedrajas et al., 2005, Saxén and Pettersson, 2006). However, the processes used to obtain optimal network parameters are usually extremely computationally intensive, and for an individual problem it is often quicker to use traditional model selection (Bullinaria, 2005).

In regression problems, neural networks are expected to discern highly non-linear and multi-dimensional functions. In such circumstances, inspecting the training data set to reveal the distribution of the experimental data is beneficial, since a highly non-uniform and skewed distribution is likely to be encountered. SubbaNarasimha, Arinze, and Anandarajan (2000) conducted a study to see whether neural network models are better than multiple regression at prediction on samples that are highly skewed; they concluded that ANNs are not consistently good at prediction under this circumstance. A recent study by Usha (2005), which compares neural networks and classical regression methods, shows that skewness in the input training data set should be reduced with a transformation such as a power transformation before carrying out neural network analysis, in order to improve neural learning (a minimal illustration is sketched below). Altun, Bilgil, and Fidan (2006) show that skewness in both the input and the target data sets should be manipulated in an appropriate way. The present article contributes to this line of research. It will be shown that the data treatment proposed by Altun et al. (2006) results in a remarkable improvement in the performance of a neural network model, regardless of the initialization of the weights and the network structural parameters. This is interpreted as a strong indicator that the practitioner is unlikely to be required to perform a tedious trial-and-error procedure to seek optimal parameters in order to meet given performance criteria. Consequently, the computational cost of developing neural network models will be effectively reduced. Sediment estimation, an important problem in river engineering, is taken as the case study. A set of experiments with three weight initializations and three structures of MLP neural network is carried out. In each turn, a neural network with identical parameters is trained on nine training sets, resulting in 81 training sessions.
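To make the idea of such a skewness treatment concrete, the sketch below measures the sample skewness of each column of a training array and applies a simple power-type transform (a log transform, assuming non-negative measurements) to the columns with a long right tail. This is not the specific treatment of Altun et al. (2006), whose details are given in that reference; the function name `reduce_skew`, the skewness threshold and the choice of `log1p` are illustrative assumptions only.

```python
# Illustrative skewness reduction: transform only those columns whose
# sample skewness exceeds a chosen threshold.
import numpy as np
from scipy.stats import skew


def reduce_skew(data, threshold=1.0):
    """Return a copy of `data` (samples x features) with a log1p transform
    applied to every column whose sample skewness exceeds `threshold`."""
    treated = np.asarray(data, dtype=float).copy()
    for j in range(treated.shape[1]):
        if skew(treated[:, j]) > threshold:
            treated[:, j] = np.log1p(treated[:, j])  # compress the long right tail
    return treated
```

The same kind of check can be applied to the target column(s) as well, in line with the observation that skewness in either the input or the target data may hinder learning.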

The organization of the paper is as follows. Section 2 introduces the sediment prediction problem as a case study and reviews recent achievements in sediment prediction and the use of neural networks in river engineering. The experiments are detailed in Section 3: first, the data used in the regression problem is described, and then the treatment of the input and target data sets is explained. Finally, the training process of the ANNs is detailed in Section 4. The paper is concluded by discussing the results.

Section snippets

Sediment prediction and neural networks in river engineering

The estimation of sediment transport is very important in hydraulic engineering applications such as the control and management of watersheds, the design of water structures and reservoirs, and the prediction of the economic life of investments. It is a highly non-linear problem and is difficult to solve. Numerous classical approaches have been proposed in the literature to date. The outcomes of these are empirical formulas which try to map easily measurable parameters onto sediment discharge, which…

Manipulation of the input and target data sets

The data used to train the neural networks is obtained from the Kızılırmak river, Turkey, as explained in Altun et al. (2006). The Kızılırmak is an important river with a length of 11,551 km and a watershed area of 79,380 km². The analyzed section of the Kızılırmak river covers a length of almost 929 km. The training set consists of data from four measurement stations: stations 1501, 1528, 1535 and 1539.

In this study, the relationship between suspended sediment and the measured parameters…

Efficacy of learning under various network parameters

The input and target data sets are paired in order to construct the training sets. As there are three input data sets (IN0, IN1 and IN2) and three target data sets (TAR0, TAR1 and TAR2), nine training sets are obtained. In order to fairly explore the effect of the proposed treatment under various network parameters, three different weight initializations (labeled the 1st, 2nd and 3rd weight initialization) and three different structures (ANNs with 5-5-10-1, 5-10-10-1 and 5-15-10-1 layouts) are…
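Although the snippet above is truncated, the experimental grid it describes is fully determined: nine training sets crossed with three weight initializations and three network structures gives 81 training sessions. The sketch below enumerates that grid; the dictionaries `inputs` and `targets`, the mapping of the 5-x-10-1 layouts onto two hidden layers, the particular seeds and the use of scikit-learn's `MLPRegressor` are assumptions made for illustration, not the authors' actual implementation.

```python
# Illustrative enumeration of the experimental grid:
# 3 input sets x 3 target sets = 9 training sets, each trained with
# 3 weight initializations (seeds) and 3 structures -> 81 sessions.
from itertools import product
from sklearn.neural_network import MLPRegressor

# 5-5-10-1, 5-10-10-1, 5-15-10-1: five inputs, two hidden layers, one output
structures = {"5-5-10-1": (5, 10), "5-10-10-1": (10, 10), "5-15-10-1": (15, 10)}
seeds = (1, 2, 3)  # stand-ins for the three weight initializations


def run_grid(inputs, targets):
    """inputs/targets: dicts such as {'IN0': X0, ...} and {'TAR0': y0, ...}."""
    results = {}
    for (in_name, X), (tar_name, y), (arch, hidden), seed in product(
            inputs.items(), targets.items(), structures.items(), seeds):
        net = MLPRegressor(hidden_layer_sizes=hidden,
                           random_state=seed, max_iter=2000).fit(X, y)
        results[(in_name, tar_name, arch, seed)] = net.loss_  # final training loss
    return results  # 81 entries, one per training session
```

Comparing the entries keyed by IN0/TAR0 (untreated data) with those keyed by the treated sets is what allows the effect of the treatment to be judged independently of the initialization and structure.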

Conclusion

In this paper, we have investigated the effect of a recently proposed multi-dimensional data treatment under varying internal parameters of a neural network model. The aim has been to show that the treatment eases the task of practitioners in engineering neural network models to meet satisfactory performance criteria in regression problems where the data is highly skewed. The literature shows that neural network models are not consistently good at prediction under highly skewed training data…

References (23)

  • García-Pedrajas, N., Ortiz-Boyer, D., & Hervás-Martínez, C. (2005). An alternative approach for neural network...