Fast statistical regression in presence of a dominant independent variable

  • ICONIP 2011
Neural Computing and Applications

Abstract

Bivariate statistical regression is a tool for performing regression on a multivariate data set under the hypothesis that one of the independent variables is dominant. Statistical regression is effective when the available data are sufficient to capture the relevant statistical features of the underlying phenomenon. The present paper proposes a fast statistical regression method based on a neural system that matches its input–output statistics to the marginal statistics of the available data sets. A key point of the proposed implementation is that it relies on purely numerical-algebraic operations, which make the neural system computationally inexpensive to implement. Numerical experiments on real-world data sets give insight into the behavior of the devised regression method and its limitations.
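
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of statistic matching with a look-up table, not the paper's actual method. It assumes a single independent variable, a monotonically non-decreasing relationship, and a table built from empirical quantiles. The function names `fit_statistic_matching_lut` and `predict`, and the parameter `n_entries`, are hypothetical and introduced here for illustration.

```python
import numpy as np


def fit_statistic_matching_lut(x, y, n_entries=101):
    """Build a look-up table for a monotone map f such that f(x) has
    approximately the same empirical distribution as y.

    Assumption (not from the paper): the table nodes are empirical
    quantiles, so only the marginal distributions of x and y are used.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    q = np.linspace(0.0, 1.0, n_entries)   # common probability levels
    x_nodes = np.quantile(x, q)            # empirical inverse CDF of x
    y_nodes = np.quantile(y, q)            # empirical inverse CDF of y
    return x_nodes, y_nodes


def predict(lut, x_new):
    """Evaluate the map by linear interpolation between table nodes."""
    x_nodes, y_nodes = lut
    return np.interp(x_new, x_nodes, y_nodes)


# Synthetic illustration: y depends monotonically on x.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = np.exp(x)

lut = fit_statistic_matching_lut(x, y)
y_hat = predict(lut, 0.5)
```

Once the table is built, evaluating the model needs only sorting-type operations and interpolation, which is the kind of purely numerical-algebraic computation the abstract refers to. Inputs outside the range of the training data are clamped to the end values of the table by `np.interp`.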

References

  1. Beckers JM, Rixen M (2003) EOF calculations and data filling from incomplete oceanographic data sets. J Atmos Ocean Technol 20(12):1839–1856

  2. Biagiotti J, Fiori S, Torre L, López-Manchado MA, Kenny JM (2004) Mechanical properties of polypropylene matrix composites reinforced with natural fibers: a statistical approach. Polym Compos 25(1):26–36

  3. Cook NR (2006) Imputation strategies for blood pressure data nonignorably missing due to medication use. Clin Trials 3(5):411–420

  4. Dargahi-Noubary GR, Razzaghi M (1994) Earthquake hazard assessment based on bivariate exponential distribution. Reliab Eng Syst Saf 44:135–166

  5. Dupačová J, Hurt J, Štěpán J (2002) Stochastic modeling in economics and finance. Kluwer (Applied Optimization Series), Dordrecht

  6. Enders CK (2006) A primer on the use of modern missing-data methods in psychosomatic medicine research. Psychosom Med 68(3):427–436

  7. Fiori S (2003) Non-symmetric PDF estimation by artificial neurons: application to statistical characterization of reinforced composites. IEEE Trans Neural Netw 14(4):959–962

  8. Fiori S, Rossi R (2004) Statistical characterization of some electrical and mechanical phenomena by a neural probability density function estimation technique. Neural Netw World 2:153–176

  9. Fiori S (2006) Neural systems with numerically-matched input-output statistic: variate generation. Neural Process Lett 23(2):143–170

  10. Fiori S (2011) Statistical nonparametric bivariate isotonic regression by look-up-table-based neural networks. In: Lu B-L, Zhang L, Kwok J (eds) Proceedings of the 2011 international conference on neural information processing (ICONIP 2011, Shanghai, China, November 14–17, 2011), Part III, LNCS 7064. Springer, Heidelberg, pp 365–372

  11. Frank A, Asuncion A (2010) UCI Machine learning repository [http://archive.ics.uci.edu/ml], University of California at Irvine, School of Information and Computer Science

  12. Greve HR, Tuma NB, Strang D (2001) Estimation of diffusion processes from incomplete data (a simulation study). Sociol Methods Res 29(4):435–467

  13. Härdle W (1992) Applied nonparametric regression. Cambridge University Press, Cambridge

  14. Katch F, McArdle W (1977) Nutrition, weight control, and exercise. Houghton Mifflin Co., Boston

  15. Little RJA, Rubin DB (1987) Statistical analysis with missing data. Wiley, New York

  16. Luchinsky DG, Millonas MM, Smelyanskiy VN, Pershakova A, Stefanovska A, McClintock PVE (2005) Nonlinear statistical modeling and model discovery for cardiorespiratory data. Phys Rev E 72:021905

  17. Nikoloulopoulos AK, Karlis D (2010) Regression in a copula model for bivariate count data. J Appl Stat 37(9):1555–1568

  18. Peugh JL, Enders CK (2004) Missing data in educational research: a review of reporting practices and suggestions for improvement. Rev Educ Res 74(4):525–556

  19. Rosenblum M, Cimponeriu L, Pikovsky A (2006) Coupled oscillators approach in analysis of bivariate data. In: Schelter B, Winterhalder M, Timmer J (eds) Handbook of time series analysis, Wiley, New York, pp 159–180

  20. Salanti G (2003) The isotonic regression framework: estimating and testing under order restrictions. PhD Dissertation, Fakultät für Mathematik, Ludwig-Maximilians-Universität München

  21. Schafer JL, Graham JW (2002) Missing data: our view of the state of the art. Psychol Methods 7(2):147–177

  22. Schneider T (2001) Analysis of incomplete climate data: estimation of mean values and covariance matrices and imputation of missing values. J Clim 14:853–871

  23. SOCR data BMI regression (2012) [http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_BMI_Regression], University of California at Los Angeles

  24. Torgo L (2007) Regression datasets [http://www.liaad.up.pt/~ltorgo/Regression/DataSets.html], Artificial Intelligence and Computer Science Laboratory, University of Porto (Portugal)

  25. Velikova MV (2006) Monotone models for prediction in data mining. PhD Dissertation, Dutch Graduate School for Information and Knowledge Systems and Graduate School of the Faculty of Economics and Business Administration of Tilburg University

  26. Verde PE (2010) Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach. Stat Med 29:3088–3102

  27. Yeh I-C (1998) Modeling of strength of high performance concrete using artificial neural networks. Cem Concr Res 28(12):1797–1808

Acknowledgments

The present paper is an extended version of the conference paper [10]. The author wishes to thank Andrew Leung for the invitation to submit the present extended version to the special issue of Neural Computing and Applications dedicated to the ICONIP’2011 conference.

Author information

Corresponding author

Correspondence to Simone Fiori.

About this article

Cite this article

Fiori, S. Fast statistical regression in presence of a dominant independent variable. Neural Comput & Applic 22, 1367–1378 (2013). https://doi.org/10.1007/s00521-012-0958-6

  • DOI: https://doi.org/10.1007/s00521-012-0958-6
