Automatic lag selection in time series forecasting using multiple kernel learning

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

This paper examines the feasibility of using multiple kernel learning (MKL), a recent kernel-learning approach, to automatically select the optimal lag length, that is, the size of the sliding window, in time series forecasting. MKL chooses suitable kernels from a given pool by learning a combination of multiple kernels. We extend MKL to select the optimal sliding-window size in the time series domain by adopting the data integration approach previously studied in image processing: each kernel represents a different lag length. We also examine the feasibility of MKL on decomposed time series. Data from previous time series competitions serve as our benchmark. The experimental results indicate that our approaches perform competitively with previous methods on the same data. Furthermore, MKL may predict the detrended time series without explicitly computing the seasonality. The advantage of our method lies in its ability to automatically select the optimal sliding-window size and to find the pattern in the time series.
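To make the idea of "one kernel per lag length" concrete, the following is a minimal sketch and not the authors' implementation. It builds one RBF kernel per candidate window size and weights the kernels by a simple centered kernel-target alignment score, which stands in here for a real MKL optimizer. The function names (`lagged_windows`, `rbf_kernel`, `centered_alignment`, `lag_kernel_weights`), the `gamma / lag` bandwidth scaling, and the alignment heuristic are all assumptions made for illustration.

```python
import numpy as np


def lagged_windows(series, lag, max_lag):
    """Build sliding windows of length `lag`; every lag shares the same targets."""
    series = np.asarray(series, dtype=float)
    n = len(series) - max_lag
    # Row i holds the `lag` values immediately preceding target series[max_lag + i].
    X = np.stack([series[max_lag - lag + i: max_lag + i] for i in range(n)])
    y = series[max_lag:]
    return X, y


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def centered_alignment(K, y):
    """Centered kernel-target alignment (assumed stand-in for MKL weights)."""
    n = len(y)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H
    yc = y - y.mean()
    Y = np.outer(yc, yc)
    return np.sum(Kc * Y) / (np.linalg.norm(Kc) * np.linalg.norm(Y))


def lag_kernel_weights(series, lags, gamma=1.0):
    """Return one weight per candidate lag, the combined kernel, and the targets."""
    max_lag = max(lags)
    kernels, scores = [], []
    for lag in lags:
        X, y = lagged_windows(series, lag, max_lag)
        K = rbf_kernel(X, gamma / lag)  # assumed bandwidth scaling by window size
        kernels.append(K)
        scores.append(max(centered_alignment(K, y), 0.0))
    scores = np.array(scores)
    if scores.sum() > 0:
        weights = scores / scores.sum()
    else:
        # Fall back to uniform weights when no kernel aligns with the targets.
        weights = np.full(len(lags), 1.0 / len(lags))
    combined = sum(w * K for w, K in zip(weights, kernels))
    return weights, combined, y


if __name__ == "__main__":
    # Hypothetical monthly-style series with a period of 12 observations.
    demo = np.sin(np.arange(60) * 2 * np.pi / 12)
    w, K, y = lag_kernel_weights(demo, lags=[2, 4, 12])
    print(dict(zip([2, 4, 12], np.round(w, 3))))
```

In this sketch the lag with the largest weight plays the role of the selected window size, and the combined kernel could be passed to any kernel regressor. A proper MKL solver would instead learn the weights jointly with the regression model.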


Notes

  1. http://www.neural-forecasting-competition.com/downloads/NN3/datasets/download.htm.

  2. www.forecasters.org/data/m3comp/m3comp.htm.

Author information

Corresponding author

Correspondence to Agus Widodo.

About this article

Cite this article

Widodo, A., Budi, I. & Widjaja, B. Automatic lag selection in time series forecasting using multiple kernel learning. Int. J. Mach. Learn. & Cyber. 7, 95–110 (2016). https://doi.org/10.1007/s13042-015-0409-7


  • DOI: https://doi.org/10.1007/s13042-015-0409-7
