A comparative predictive analysis of neural networks (NNs), nonlinear regression and classification and regression tree (CART) models
Introduction
Classical statistical methods have been applied in industry for years. Recently, neural network (NNs) methods have become tools of choice for a wide variety of applications across many disciplines. It has been recognized in the literature that regression and neural network methods have become competing model-building approaches (Smith & Mason, 1997). For a large class of pattern-recognition processes, NNs are the preferred technique (Setyawati, Sahirman, & Creese, 2002). NNs methods have also been used in the areas of prediction and classification (Warner & Misra, 1996).
Since NNs were developed as generalizations of mathematical models of human cognition through biological neurons, they are regarded as information processing systems that share certain performance characteristics with human neural biology. These characteristics include the ability to store knowledge and make it available whenever necessary, the propensity to identify patterns even in the presence of noise, and the aptitude to take past experience into consideration and make inferences and judgments about new situations.
Statistical methods such as regression analysis, multivariate analysis, Bayesian theory, pattern recognition and least-squares approximation models have been applied to a wide range of decisions in many disciplines (Buntine & Weigend, 1991). These models are attractive to decision makers because of their established methodology, long history of application, availability of software and deep-rooted acceptance among practitioners and academicians alike. NNs are data dependent and, therefore, their performance improves with sample size. Statistical methods such as regression perform better for extremely small sample sizes, and also when theory or experience indicates an underlying relationship between the dependent and predictor variables (Warner & Misra, 1996). Classification and Regression Tree (CART) models use tree-building algorithms, which produce a set of if-then (split) conditions that permit prediction or classification of cases. A CART model that predicts the value of a continuous variable from a set of continuous and/or categorical predictor variables is referred to as a regression-type model; a classification-type CART model predicts the value of a categorical variable from such predictors. One noticeable advantage of decision-tree-based models such as CART is that they are scalable to large problems and can handle smaller data sets than NNs models (Markham, Mathieu, & Wray, 2000).
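To make the "set of if-then (split) conditions" concrete, here is a minimal sketch of what a fitted regression-type CART model amounts to: nested splits on categorical predictors, each leaf returning the mean of the training cases that fall into it. The variable names, split rules and leaf values are invented for illustration; they are not the tree fitted in this study.

```python
# Hypothetical illustration of a fitted regression-type CART model:
# a cascade of if-then split conditions on categorical predictors,
# with each leaf returning a continuous prediction (a training-cell mean).
def predict_days_in_bed(smoker, age_group):
    """Toy tree: categorical inputs, continuous output (invented values)."""
    if smoker == "yes":                # first split condition
        if age_group == "senior":      # second split condition
            return 6.2                 # leaf: mean of cases in this cell
        return 3.1
    return 1.4                         # leaf for non-smokers

print(predict_days_in_bed("yes", "senior"))  # 6.2
```

A tree-building algorithm such as CART chooses these splits automatically so as to reduce prediction error at each step; the fitted tree itself is nothing more than rules of this form.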
Despite the apparent substantive and applied advantages of statistical models, NNs methods have also gained popularity in recent years (Ripley, 1994). These methods are particularly valuable when the functional relationship between the independent and dependent variables is unknown and ample training and test data are available for the process. NNs models also have a high tolerance for noise and complexity in the data. Moreover, software packages that deploy neural network algorithms, such as SPSS Clementine, SAS Enterprise Miner and BrainMaker, have become extremely sophisticated and user-friendly in recent years.
Our research objective was to compare the predictive ability of multiple regression, NNs and CART models using a set of data on smokers that includes mostly categorical variables. Comparisons of the predictive abilities of statistical and NNs models are plentiful in the literature. It is also widely recognized that the effectiveness of any model is largely dependent on the characteristics of the data used to fit it. Goss and Vozikis (2002) compared NNs methods with binary logit regression (BLR) and concluded that the NNs model's prediction accuracy was better than that of the BLR model; Shang, Lin, and Goetz (2000) reached a similar conclusion. Feng and Wang (2002) compared nonlinear regression with NNs methods in a reverse engineering application using all non-categorical variables. Both models provided comparably satisfactory prediction; however, the regression model performed slightly better in model construction and model verification. Brown, Corruble, and Pittard (1993) showed that NNs do better than CART models on multimodal classification problems where data sets are large with few attributes; they also concluded that the CART model did better than the NNs model with smaller data sets and with large numbers of irrelevant attributes. For non-linear data sets, NNs and CART models outperform linear discriminant analysis (Curram & Mingers, 1994). In our research, a three-way comparison involving nonlinear regression, NNs and CART models is performed. The prediction errors of these three models are compared where the dependent variable is continuous and the predictor variables are all categorical.
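A three-way comparison of this kind reduces, operationally, to computing the same error measures for each model's predictions on a common hold-out set. The sketch below uses invented actual and predicted values and two common metrics, mean absolute deviation (MAD) and mean squared error (MSE); it illustrates the mechanics only, not the paper's actual results.

```python
# Sketch: comparing prediction errors of several models on the same hold-out
# data. All numbers below are invented for illustration.
def error_metrics(actual, predicted):
    """Mean absolute deviation and mean squared error for one model."""
    n = len(actual)
    mad = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    return mad, mse

actual = [2.0, 0.0, 5.0, 1.0]          # hypothetical observed values
model_preds = {"regression": [2.5, 0.5, 4.0, 1.5],
               "NNs":        [2.2, 0.1, 4.8, 1.1],
               "CART":       [2.0, 0.0, 4.5, 1.0]}
for name, preds in model_preds.items():
    print(name, error_metrics(actual, preds))
```

Whichever model yields the smaller errors on the hold-out data is judged the better predictor for that data set.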
The rest of the paper is organized as follows: Section 2 provides a literature review on comparative analyses of NNs and statistical models. Section 3 gives a brief description of the data, its organization and the research model. Section 4 briefly discusses the NNs, regression and CART models and presents the test hypotheses. In Section 5, we examine the results of the three models and provide analysis. Based on that analysis, conclusions are drawn and presented in Section 6.
Classical statistical tools
Some of the widely used traditional statistical tools applied for prediction and diagnosis in many disciplines are discriminant analysis (Press and Wilson, 1978, Flury and Riedwyl, 1990), logistic regression (Press and Wilson, 1978, Hosmer and Lemeshow, 1989, Studenmund, 1992), the Bayesian approach (Duda and Hart, 1973, Buntine and Weigend, 1991), and multiple regression (Snedecor and Cochran, 1980, Neter et al., 1985, Myers, 1990, Menard, 1993). These models have been proven to be very effective,
Organization of data
In this study we used a set of data on the smoking habits of people. The data set contained 35 variables and 3652 records. Among the 35 available variables, we initially chose the 10 considered most intuitively related to illness and ran a correlation analysis. Based on its results, the following variables (presented in Table 1), considered to be significant contributors towards the prediction of the dependent variable (days in bed due to illness), are
Neural network model
We chose the NNs method because it handles the nonlinearity associated with the data well. NNs methods imitate the structure of biological neural networks. Processing elements (PEs) are the neurons in a neural network. Each neuron receives one or more inputs, processes those inputs, and generates a single output. The main components of information processing in neural networks are: inputs, weights, the summation function (the weighted average of all input data going into a processing element (PE),
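The components listed above can be sketched as a single processing element: a weighted sum of the inputs plus a bias, passed through a transfer function. The sigmoid transfer function and all numeric values below are illustrative assumptions, not the network configuration used in the study.

```python
import math

# Minimal sketch of one processing element (PE), assuming a sigmoid
# transfer function; weights, bias and inputs are invented.
def processing_element(inputs, weights, bias=0.0):
    """One PE: weighted sum of inputs passed through a sigmoid activation."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias  # summation function
    return 1.0 / (1.0 + math.exp(-s))                       # transfer function

out = processing_element([1.0, 0.5], [0.4, -0.2], bias=0.1)
```

A full network chains many such PEs in layers, and training adjusts the weights so that the network's outputs match the observed values of the dependent variable.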
Regression
A stepwise regression procedure was conducted using SPSS. In the process, some of the variables and nonlinear interaction terms were eliminated by the procedure for lack of significant contribution towards predicting the value of the dependent variable, Y. Multicollinearity among the independent variables was also a factor in the final selection of the model. The final nonlinear regression model is as follows:
The following table
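The fitted model and table are not reproduced in this snippet. As a generic sketch of the approach, a regression with dummy-coded categorical predictors and a nonlinear interaction term can be fitted by ordinary least squares; the data and coefficients below are invented for illustration and are unrelated to the study's actual model.

```python
import numpy as np

# Hypothetical sketch: fit Y = b0 + b1*x1 + b2*x2 + b3*(x1*x2) by least
# squares, where x1 and x2 are 0/1 dummy codes for categorical predictors
# and x1*x2 is the nonlinear interaction term. All data are invented.
x1 = np.array([0, 0, 1, 1, 0, 1, 1, 0])
x2 = np.array([0, 1, 0, 1, 1, 0, 1, 0])
y = np.array([1.0, 1.2, 3.0, 6.0, 1.1, 3.2, 6.3, 0.9])   # e.g. days in bed

X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])  # design matrix
beta, *_ = np.linalg.lstsq(X, y, rcond=None)              # OLS coefficients
pred = X @ beta
mse = float(np.mean((y - pred) ** 2))
```

A stepwise procedure would add or drop columns of the design matrix according to the significance of their contribution, which is how variables and interactions come to be eliminated from the final model.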
Conclusion
In this research we performed a three-way comparison of prediction accuracy involving nonlinear regression, NNs and CART models. The prediction errors of the three models were compared where the dependent variable is continuous and the predictor variables are all categorical. As mentioned before, many comparative studies have been done in the past; however, very few have included a CART model.
In our study, the NNs and CART models produced better prediction accuracy than nonlinear regression
References

- A comparison of decision tree classifiers with backpropagation neural networks for multimodal classification problems, Pattern Recognition (1993)
- Neural network models for intelligent support of managerial decision making, Decision Support Systems (1994)
- Determining market response functions by neural network modeling: A comparison to econometric techniques, European Journal of Operational Research (1993)
- Optimum design based on mathematical model and neural network to predict weld parameters for fillet joints, Journal of Manufacturing Systems (1997)
- A neural network process model for abrasive flow machining operations, Journal of Manufacturing Systems (1998)
- Predicting the secondary structure of globular proteins using neural network models, Journal of Molecular Biology (1988)
- A hybrid neural approach to combinatorial optimization, Computers and Operations Research (1996)
- Links between artificial neural networks (ANNs) and statistical pattern recognition
- Improving predictive accuracy with a combination of human intuition and mechanical decision aids, Organizational Behavior and Human Decision Processes (1998)
- Neural network applications in business: A review and analysis of the literature (1988–95), Decision Support Systems (1997)
- Improving error compensation via a fuzzy-neural hybrid model, Journal of Manufacturing Systems
- Data-mining and choice classic models/neural networks, Decisions Marketing
- Use of an artificial neural network for data analysis in clinical decision-making: The diagnosis of acute coronary occlusion, Neural Computation
- Stacked regressions, Machine Learning
- Classification and Regression Trees, Wadsworth, Belmont, CA
- Bayesian back-propagation, Complex Systems
- Bayesian CART model search, Journal of the American Statistical Association
- Static neural network process models: Considerations and case studies, International Journal of Production Research
- Neural networks, decision tree induction and discriminant analysis: An empirical comparison, Journal of the Operational Research Society
- Pattern classification and scene analysis
- Fundamentals of neural networks: Architecture, algorithms and applications
- An experimental study of the effect of digitizing parameters on digitizing uncertainty with a CMM, International Journal of Production Research
- Digitizing uncertainty modeling for reverse engineering applications: Regression versus neural networks, Journal of Intelligent Manufacturing
- Multivariate statistics: A practical approach
- Simulating neural networks with Mathematica
- Application of artificial neural network to computer-aided diagnosis of coronary artery disease in myocardial SPECT bull's-eye images, Journal of Nuclear Medicine
- Improving health care organizational management through neural network learning, Health Care Management Science
- Data mining: Building competitive advantage
- Introduction to the theory of neural computation, Santa Fe Institute Studies in the Sciences of Complexity (vol. 1)
- Applied logistic regression
- An example of simulation optimization using a neural network metamodel: Finding the optimal number of Kanbans in a manufacturing system, Journal of the Operational Research Society
- Performance of selected part-machine grouping techniques for data sets of wide ranging sizes and imperfections, Decision Sciences
- Stock market predictions with modular neural networks
- An empirical comparison of neural networks and logistic regression models, Marketing Letters
- Introduction to neural networks: Design, theory, and applications
- Combining estimates in regression and classification, Journal of the American Statistical Association
- Forecasting creditworthiness: Logistic vs. artificial neural net, The Journal of Business Forecasting Methods and Systems
- Integrating neural networks and semi-Markov processes for automated knowledge acquisition: An application to real-time scheduling, Decision Sciences