
Transforming input variables for RBFN based on PCA-ASH multivariate correlation analysis and its application

  • Original Article
  • Published in Neural Computing and Applications

Abstract

Mutual information (MI) estimated with an averaged shifted histogram (ASH) probability density estimator is a good indicator of the relevance between input variables and the output variable. However, it cannot handle the problem of redundant input variables. Therefore, a method that integrates principal component analysis (PCA) with MI is proposed to improve the prediction performance of the radial basis function network (RBFN). First, PCA is employed to extract principal components (PCs), which are mutually uncorrelated, from the original variables. Second, ASH-based MI is applied to select the PCs most strongly correlated with the output variable as the new input variables. Finally, PCA-ASH-RBFN is employed to develop a housing price model on the Boston housing data set. The results show that PCA-ASH-RBFN achieves better prediction accuracy and robustness than PCA-RBFN and than RBFN combined with robust feature selection of the input variables.
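The pipeline the abstract describes can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the synthetic data merely stands in for the Boston housing set, the ASH-style MI estimator here simply averages plug-in MI over several 2-D histograms with shifted bin origins, and a least-squares fit replaces MATLAB's `newrbe`. All variable names, bin counts, and the number of centers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for the Boston housing set (hypothetical):
# three informative inputs plus two near-duplicate (redundant) columns.
n = 300
X = rng.normal(size=(n, 3))
X = np.hstack([X, X[:, :2] + 0.01 * rng.normal(size=(n, 2))])
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)

# Step 1: PCA -- project onto mutually uncorrelated principal components.
Xc = (X - X.mean(0)) / X.std(0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pcs = Xc @ Vt.T  # score matrix; columns are uncorrelated

# Step 2: MI between each PC and y, averaged over histograms whose bin
# origins are offset (a simple stand-in for Scott's ASH estimator).
def mi_ash(u, v, bins=8, shifts=4):
    hu = (u.max() - u.min()) / bins
    hv = (v.max() - v.min()) / bins
    total = 0.0
    for s in range(shifts):
        du, dv = s * hu / shifts, s * hv / shifts
        eu = np.arange(u.min() - hu + du, u.max() + hu + du, hu)
        ev = np.arange(v.min() - hv + dv, v.max() + hv + dv, hv)
        p, _, _ = np.histogram2d(u, v, bins=[eu, ev])
        p /= p.sum()
        pu = p.sum(axis=1, keepdims=True)  # marginal of u
        pv = p.sum(axis=0, keepdims=True)  # marginal of v
        nz = p > 0
        total += np.sum(p[nz] * np.log(p[nz] / (pu @ pv)[nz]))
    return total / shifts

mi = np.array([mi_ash(pcs[:, j], y) for j in range(pcs.shape[1])])

# Step 3: keep the k PCs with the highest MI as the new input variables.
k = 3
sel = np.argsort(mi)[::-1][:k]
Z = pcs[:, sel]

# Step 4: Gaussian RBFN; centers are a random subset of training points,
# the width comes from the median inter-center distance, and the output
# weights are fit by least squares.
centers = Z[rng.choice(n, size=30, replace=False)]
d = np.sqrt(((centers[:, None] - centers[None]) ** 2).sum(-1))
sigma = np.median(d[d > 0])
G = np.exp(-((Z[:, None] - centers[None]) ** 2).sum(-1) / (2 * sigma**2))
w = np.linalg.lstsq(G, y, rcond=None)[0]
rmse = np.sqrt(np.mean((G @ w - y) ** 2))
print(f"selected PCs: {sel}, training RMSE: {rmse:.3f}")
```

Because the PC scores are uncorrelated by construction, ranking them by ASH-based MI avoids re-selecting redundant copies of the same underlying variable, which is the failure mode of applying MI to the raw inputs directly.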





Acknowledgments

The authors gratefully acknowledge support from the following foundations: the National Natural Science Foundation of China (20776042), the Doctoral Fund of the Ministry of Education of China (20090074110005), the Program for New Century Excellent Talents in University (NCET-09-0346), the "Shu Guang" project (09SG29), and the Fundamental Research Funds for the Central Universities.

Author information

Correspondence to Xuefeng Yan.


About this article

Cite this article

Chen, C., Yan, X. Transforming input variables for RBFN based on PCA-ASH multivariate correlation analysis and its application. Neural Comput & Applic 22 (Suppl 1), 101–111 (2013). https://doi.org/10.1007/s00521-012-0968-4

