Abstract
Time series (longitudinal data) are now ubiquitous in the social sciences, where they are used in nearly every analytic process, mainly to estimate or predict future values. The main difficulties are the wide variety of time series (sometimes of unknown size), the identification of outliers, and the impossibility of estimating the error or numerical stability of a statistical analysis. This paper proposes a matrix-based model for predictive analytics and, using statistical estimation on different finite samples extracted from time series, estimates the residual and factorial variance for a group of samples. The proposed methods are applied to different samples of social data: number of births in a community, number of inhabitants, natural mobility of the population, life expectancy (by sex and area), life expectancy at birth, fertility rate, and infant mortality rate.
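To make the terms "factorial variance" and "residual variance" concrete, the following minimal sketch splits the variation in a group of samples into a between-sample (factorial) part and a within-sample (residual) part, in the manner of a standard one-way ANOVA. It is an illustration only and is not taken from the paper: the function name variance_decomposition, the windowing of the series, and the birth counts are all hypothetical, and the paper's matrix-based model may differ.

```python
import numpy as np


def variance_decomposition(samples):
    """Return (factorial variance, residual variance, F ratio).

    Standard one-way ANOVA decomposition, assumed here for illustration;
    `samples` is a list of finite samples drawn from a time series.
    """
    groups = [np.asarray(g, dtype=float) for g in samples]
    k = len(groups)                       # number of samples (groups)
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = np.concatenate(groups).mean()

    # Between-sample (factorial) sum of squares: group means vs. grand mean.
    ss_factorial = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-sample (residual) sum of squares: observations vs. own group mean.
    ss_residual = sum(((g - g.mean()) ** 2).sum() for g in groups)

    factorial_var = ss_factorial / (k - 1)
    residual_var = ss_residual / (n - k)
    return factorial_var, residual_var, factorial_var / residual_var


# Hypothetical yearly birth counts, split into three consecutive windows.
series = [510, 498, 505, 520, 515, 530, 541, 538, 550]
windows = [series[0:3], series[3:6], series[6:9]]
factorial_var, residual_var, f_ratio = variance_decomposition(windows)
print(factorial_var, residual_var, f_ratio)
```

A large ratio of factorial to residual variance suggests that the samples differ from one another by more than their internal scatter would explain; how the paper itself uses the two quantities is not shown in the abstract.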
Acknowledgments
We are grateful to the referees for their constructive and valuable input.
Ethics declarations
Conflict of interest
The authors of this paper (Cristina Serbanescu and Cosmina-Elena Pop) declare that they have no conflict of interest.
Human and animal studies
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
This paper was prepared and published under the aegis of the Research Institute for Quality of Life, Romanian Academy, as part of a programme co-funded by the European Union within the Operational Sectorial Programme for Human Resources Development, through the project for Pluri and interdisciplinary in doctoral and post-doctoral programmes, Project Code: POSDRU/159/1.5/S/141086.
About this article
Cite this article
Şerbănescu, C., Pop, CE. Data analysis and statistical estimation for time series: improving presentation and interpretation. Soft Comput 21, 3919–3930 (2017). https://doi.org/10.1007/s00500-016-2041-1
DOI: https://doi.org/10.1007/s00500-016-2041-1