Divide and Conquer Ensemble Method for Time Series Forecasting

Kostrzewa, Jan; Mazzocco, Giovanni; Plewczynski, Dariusz

doi:10.1007/978-3-662-53525-7_8

Part of the book series: Lecture Notes in Computer Science (TCCI, volume 9770)

Abstract

Time series forecasting has attracted a great deal of attention from various research communities. Many methods divide a time series into subseries; information granules, fuzzy clustering and data segmentation are among the most popular in this field. However, all of these methods are designed to recognize dependencies between adjacent points, and to do so they divide the time series into time intervals. This limits their ability to find strongly non-local dependencies between points scattered across the whole time series. The Divide and Conquer ensemble algorithm presented here was designed to address this limitation. The model samples the series into many subseries, searches them for possible patterns and finally chooses the most significant subseries for further investigation. Since the prediction error evaluated on the subseries is lower than the one evaluated on the original time series, the proposed strategy can significantly reduce the overall prediction error. To evaluate the efficiency of our approach we performed the analysis on various artificial datasets. In a real-world example our algorithm showed a 3-fold improvement in accuracy with respect to other state-of-the-art methods. Although the algorithm was designed for time series forecasting, it can easily be used for noise filtering. Simulations reported in the present work illustrate the potential of the method in this field of application.
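
The idea of sampling a series into subseries and keeping the ones that are easier to predict can be illustrated with a minimal sketch. The abstract does not specify the sampling scheme, the predictor or the significance criterion, so everything below is an assumption made for illustration: subseries are taken with a fixed stride, the predictor is a naive persistence forecast, and the function names (strided_subseries, one_step_error, best_split) are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def strided_subseries(x, step):
    """Split x into `step` interleaved subseries: x[0::step], x[1::step], ...

    Assumption: fixed-stride sampling; the chapter may sample differently.
    """
    return [x[offset::step] for offset in range(step)]

def one_step_error(series):
    """Mean absolute error of a persistence forecast (next value = current value).

    Assumption: a stand-in for whatever predictor the method actually uses.
    """
    series = np.asarray(series, dtype=float)
    if series.size < 2:
        return float("inf")  # too short to evaluate a one-step forecast
    return float(np.mean(np.abs(np.diff(series))))

def best_split(x, max_step):
    """Score each stride by the mean error over its subseries; lower is better."""
    scores = {}
    for step in range(1, max_step + 1):
        subs = strided_subseries(x, step)
        scores[step] = float(np.mean([one_step_error(s) for s in subs]))
    best = min(scores, key=scores.get)
    return best, scores

# A series with a non-local pattern: values repeat every third point.
x = np.tile([0.0, 5.0, 10.0], 10)
best, scores = best_split(x, max_step=4)
print(best, scores)
```

In this toy series, adjacent points differ strongly, while points three steps apart are identical, so the stride-3 subseries are trivially predictable and the stride of 1 (the original series) is not. This mirrors the abstract's claim that the error on well-chosen subseries can be lower than on the original series, under the simplifying assumptions stated above.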

Acknowledgements

This research was supported by the European Union from financial resources of the European Social Fund, Project PO KL Information technologies: Research and their interdisciplinary applications and by the Polish National Science Centre with the grants 2014/15/B/ST6/05082 and 2013/09/B/NZ2/00121.

Author information

Authors and Affiliations

Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248, Warsaw, Poland
Jan Kostrzewa & Giovanni Mazzocco
Centre of New Technologies, University of Warsaw, ul. Banacha 2c, 02-097, Warsaw, Poland
Jan Kostrzewa, Giovanni Mazzocco & Dariusz Plewczynski

Corresponding author

Correspondence to Giovanni Mazzocco.

Editor information

Editors and Affiliations

Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Ngoc Thanh Nguyen
Swinburne University of Technology, Hawthorn, Australia
Ryszard Kowalczyk
Escola Superior de Tecnologia de Setúbal, Setúbal, Portugal
Joaquim Filipe

About this chapter

Cite this chapter

Kostrzewa, J., Mazzocco, G., Plewczynski, D. (2016). Divide and Conquer Ensemble Method for Time Series Forecasting. In: Nguyen, N., Kowalczyk, R., Filipe, J. (eds) Transactions on Computational Collective Intelligence XXIV. Lecture Notes in Computer Science, vol 9770. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53525-7_8

DOI: https://doi.org/10.1007/978-3-662-53525-7_8
Published: 30 September 2016
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53524-0
Online ISBN: 978-3-662-53525-7
eBook Packages: Computer Science, Computer Science (R0)