Time series prediction using sparse regression ensemble based on \(\ell _2\)-\(\ell _1\) problem

  • Methodologies and Application
Soft Computing

Abstract

A sparse regression ensemble (SRE) combines the outputs of multiple learners using a sparse weight vector, so that only a subset of the learners contributes to the final prediction. This paper studies SRE based on the \(\ell _2\)-\(\ell _1\) problem and applies it to time series prediction. The \(\ell _2\)-\(\ell _1\) objective consists of an \(\ell _2\)-norm term and an \(\ell _1\)-norm regularization term: the former measures the total empirical risk of the ensemble, and the latter represents the ensemble complexity. The goal is therefore to minimize the total ensemble training error while controlling the ensemble complexity. Experiments on real-world data sets for regression and time series prediction are reported.
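
To make the trade-off described in the abstract concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the common form of the \(\ell _2\)-\(\ell _1\) objective, \(\min_w \tfrac{1}{2}\|Hw - y\|_2^2 + \lambda \|w\|_1\), where the matrix H (one column of predictions per learner), the target vector y, the weight vector w, and the parameter \(\lambda\) are illustrative names that do not come from the abstract. The solver is plain iterative soft-thresholding (ISTA), chosen for brevity; the paper may use a different algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(H, y, w, lam):
    """Assumed l2-l1 objective: 0.5 * ||H w - y||_2^2 + lam * ||w||_1."""
    r = H @ w - y
    return 0.5 * (r @ r) + lam * np.sum(np.abs(w))

def sparse_ensemble_weights(H, y, lam, n_iter=500):
    """Run ISTA from w = 0 on the assumed l2-l1 objective.

    H   : (n_samples, n_learners) matrix of learner outputs (hypothetical name)
    y   : (n_samples,) target values
    lam : weight of the l1 term; larger values give sparser weight vectors
    """
    # Step size 1/L, where L = ||H||_2^2 is the Lipschitz constant of the gradient.
    step = 1.0 / (np.linalg.norm(H, 2) ** 2)
    w = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = H.T @ (H @ w - y)          # gradient of the l2 (training error) term
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Synthetic illustration: ten "learners" that predict a sine wave with
# increasing amounts of noise. The data are made up for this sketch.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 200)
y = np.sin(t)
H = np.column_stack([y + s * rng.normal(size=t.size)
                     for s in np.linspace(0.05, 1.0, 10)])

w = sparse_ensemble_weights(H, y, lam=5.0)
print("nonzero weights:", np.count_nonzero(w), "of", w.size)
print("objective value:", objective(H, y, w, lam=5.0))
```

Raising `lam` puts more emphasis on the \(\ell _1\) term and drives more weights to exactly zero; once `lam` exceeds the largest absolute entry of `H.T @ y`, this iteration returns the all-zero weight vector.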

Acknowledgments

We would like to thank two anonymous reviewers and Editor A. Castiglione for their valuable comments and suggestions, which have significantly improved this paper. This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61373093, 61033013, and 61271301, by the Natural Science Foundation of Jiangsu Province of China under Grant Nos. BK2011284 and BK201222725, by the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant No. 13KJA520001, and by the Qing Lan Project.

Author information

Corresponding author

Correspondence to Li Zhang.

Additional information

Communicated by A. Castiglione.

About this article

Cite this article

Zhang, L., Zhou, WD. Time series prediction using sparse regression ensemble based on \(\ell _2\)-\(\ell _1\) problem. Soft Comput 19, 781–792 (2015). https://doi.org/10.1007/s00500-014-1304-y

  • DOI: https://doi.org/10.1007/s00500-014-1304-y
