Abstract
This paper compares the impact of several unsupervised ensemble learning methods on electricity load forecasting. The electricity load of consumers is either simply aggregated or optimally clustered into more predictable groups by cluster analysis. The clustering approach consists of efficient preprocessing of smart meter data by a model-based representation, followed by the K-means method. We implemented two types of unsupervised ensemble learning methods to investigate forecasting performance on clustered or simply aggregated load: bootstrap aggregating (bagging) based methods and newly proposed density-clustering based methods. Three new bootstrapping methods for time series analysis are proposed in order to handle the noisy behaviour of the time series. The smart meter datasets used in our experiments come from Australia, London, and Ireland, where data from residential consumers were available. The results suggest that for extremely fluctuating and noisy time series, improving forecasting accuracy through bagging can be challenging. However, our experimental evaluation shows that in most cases the density-based unsupervised ensemble learning methods significantly improve the forecasting accuracy of aggregated or clustered electricity load.
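As a rough illustration of the bootstrap aggregating idea mentioned in the abstract, the following Python sketch resamples a load series, forecasts each resampled series, and takes the pointwise median of the forecasts. It is not one of the three bootstrapping methods proposed in the paper: it assumes a plain moving block bootstrap of residuals and uses a seasonal mean profile as a stand-in forecaster. All function and parameter names are hypothetical, and the series length is assumed to be a whole number of seasonal periods.

```python
import random
import statistics


def seasonal_profile(series, period):
    """Mean value at each position of the seasonal cycle."""
    return [statistics.fmean(series[i::period]) for i in range(period)]


def moving_block_bootstrap(residuals, block_len, rng):
    """Resample residuals in contiguous blocks to keep short-range dependence."""
    n = len(residuals)
    out = []
    while len(out) < n:
        start = rng.randrange(0, n - block_len + 1)
        out.extend(residuals[start:start + block_len])
    return out[:n]


def bagged_forecast(series, period, n_boot=50, block_len=None, seed=0):
    """Forecast one period ahead as the median over bootstrapped forecasts.

    Assumption: len(series) is a multiple of `period`, so the next
    period starts at seasonal position 0.
    """
    rng = random.Random(seed)
    # Assumed default: blocks of two seasonal periods, capped at the series length.
    block_len = min(block_len or 2 * period, len(series))
    profile = seasonal_profile(series, period)
    fitted = [profile[i % period] for i in range(len(series))]
    residuals = [y - f for y, f in zip(series, fitted)]

    forecasts = []
    for _ in range(n_boot):
        boot_res = moving_block_bootstrap(residuals, block_len, rng)
        boot_series = [f + r for f, r in zip(fitted, boot_res)]
        # Stand-in base forecaster: the seasonal mean profile of the resampled series.
        forecasts.append(seasonal_profile(boot_series, period))

    # Aggregate the ensemble pointwise with the median.
    return [statistics.median(f[h] for f in forecasts) for h in range(period)]
```

Calling `bagged_forecast(load, period=48)` on half-hourly readings would return 48 values for the next day under these assumptions. The median is used for aggregation here because it is less sensitive to outlying bootstrap forecasts than the mean; other aggregation rules are equally possible.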
Acknowledgments
This work was supported by the Slovak Research and Development Agency under contracts No. APVV-16-0484 and No. APVV-16-0213, and by the Research and Development Operational Programme for the project International centre of excellence for research of intelligent and secure information-communication technologies and systems, ITMS 26240120039, co-funded by the ERDF.
About this article
Cite this article
Laurinec, P., Lóderer, M., Lucká, M. et al. Density-based unsupervised ensemble learning methods for time series forecasting of aggregated or clustered electricity consumption. J Intell Inf Syst 53, 219–239 (2019). https://doi.org/10.1007/s10844-019-00550-3
DOI: https://doi.org/10.1007/s10844-019-00550-3