Correlation and instance based feature selection for electricity load forecasting

doi:10.1016/j.knosys.2015.02.017

Knowledge-Based Systems

Volume 82, July 2015, Pages 29-40

https://doi.org/10.1016/j.knosys.2015.02.017

Abstract

Appropriate feature (variable) selection is crucial for accurate forecasting. In this paper we consider the task of forecasting the future electricity load from a time series of previous electricity loads, recorded every 5 min. We propose a two-step approach that identifies a set of candidate features based on the data characteristics and then selects a subset of them using correlation and instance-based feature selection methods, applied in a systematic way. We evaluate the performance of four feature selection methods – one traditional (autocorrelation) and three advanced machine learning methods (mutual information, RReliefF and correlation-based) – in conjunction with state-of-the-art prediction algorithms (neural networks, linear regression and model tree rules), using two years of Australian electricity load data. Our results show that all feature selection methods were able to identify small subsets of highly relevant features. The best two prediction models used instance-based and autocorrelation-based feature selectors together with an efficient neural network prediction algorithm. They were more accurate than advanced exponential smoothing prediction models, a typical industry model and other baselines used for comparison.

Introduction

Forecasting the future electricity load is an important task in the management of modern energy systems. It is used to make decisions about committing generators, setting reserve requirements for security and scheduling maintenance. Its goal is to ensure a reliable electricity supply while minimizing the operating cost.

Electricity load forecasting is classified into four types based on the forecasting horizon: long-term (years ahead), medium-term (months to a year ahead), short-term (1 day to weeks ahead) and very short-term (minutes and hours ahead). In this paper we consider Very Short-Term Load Forecasting (VSTLF), in particular 5 min ahead forecasting. VSTLF plays an important role in competitive energy markets such as the Australian national electricity market. It is used by the market operator to set the required demand and its price, and by the market participants to prepare bids. The importance of VSTLF is increasing with the emergence of smart grid technology, as demand response mechanisms and real-time pricing require predictions at very short intervals [1].

Predicting the electricity load with high accuracy is a challenging task. The electricity load time series is complex and non-linear, with daily, weekly and annual cycles. It also contains random components due to fluctuations in the electricity usage of individual users, large industrial units with irregular hours of operation, special events and holidays and sudden weather changes.

Various approaches for VSTLF have been proposed; the most successful are based on Holt–Winters exponential smoothing and Autoregressive Integrated Moving Average (ARIMA) [2], Linear Regression (LR) and Neural Networks (NNs) trained with the backpropagation algorithm [3], [4], [5], [6], [7]. The problem of feature selection for VSTLF, however, has not received enough attention, and it is the focus of this paper.

Feature (variable) selection is the process of selecting a set of representative features (variables) that are relevant and sufficient for building a prediction model. It has been an active research area in machine learning [8], [9], [10]. Good feature selection improves predictive accuracy and leads to faster training and a less complex prediction model. It is considered one of the key factors for successful prediction.

Most of the existing approaches for VSTLF identify features in a non-systematic way or use standard autocorrelation analysis, which only captures linear dependencies between the predictor variables and the output variable that is predicted. The main goal of this paper is to show how advanced machine learning feature selection methods can be applied for electricity load forecasting, and more generally to energy time series forecasting. In particular, our contribution can be summarized as follows:

• We adapt and apply three advanced machine learning feature selection algorithms – Mutual Information (MI), RReliefF (RF) and Correlation-Based Selection (CFS) – to the task of load forecasting. We chose these methods as they are appropriate for the nature of the electricity load data – they can identify both linear and non-linear relationships (MI and RF) and capture both relevant and redundant features (CFS, RF), see Section 3. For comparison we also apply a method based on Autocorrelation (AC). We show how these feature selection methods can be applied in a systematic way to energy time series.
• We propose a two-step approach for feature selection. In the first step we form a set of candidate features by applying a 1 week sliding window. A 1 week sliding window greatly reduces dimensionality while still capturing the main characteristics of the data. In the second step we use a feature selection method to evaluate the quality of the candidate features and select a final subset of features.
• We use the selected features with state-of-the-art prediction algorithms: NN, LR and Model Tree Rules (MTR). Hippert et al. [11] reviewed the application of NNs for electricity load forecasting and noted the need for systematic and fair comparison between NNs, standard linear statistical methods such as LR and other prediction algorithms.
• We conduct a comprehensive evaluation using two years of Australian electricity data. This includes a comparison with exponential smoothing (one of the most successful methods for load forecasting), a typical prediction model used by industry forecasters and several other benchmarks.
• We investigate additional aspects of the feature selection algorithms, such as the effect of the number of neighbors in AC and the number of features in MI and RF.
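
The two-step approach described in the list above can be illustrated with a minimal sketch. It assumes a plain Python list of loads; the function names (autocorrelation, top_lags, make_examples) are hypothetical, and ranking lags by absolute autocorrelation is a deliberate simplification, not necessarily the AC procedure used in the paper.

```python
from statistics import mean

STEPS_PER_DAY = 24 * 12        # number of 5 min readings in one day
WINDOW = 7 * STEPS_PER_DAY     # 1 week sliding window = 2016 candidate lags


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag."""
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    num = sum((series[t] - mu) * (series[t - lag] - mu)
              for t in range(lag, len(series)))
    return num / denom


def top_lags(series, window, k):
    """Step 1: candidate lags 1..window. Step 2: keep the k lags with the
    largest absolute autocorrelation (an assumed, simplified criterion)."""
    scores = {lag: abs(autocorrelation(series, lag))
              for lag in range(1, window + 1)}
    best = sorted(scores, key=scores.get, reverse=True)[:k]
    return sorted(best)


def make_examples(series, lags):
    """Build (features, target) pairs: the features are the loads at
    t - lag for each selected lag, the target is the load at time t."""
    start = max(lags)
    return [([series[t - lag] for lag in lags], series[t])
            for t in range(start, len(series))]


# Toy usage on a short periodic series; real data would use window=WINDOW.
toy = [1, 2, 3, 2] * 10
selected = top_lags(toy, window=8, k=2)
examples = make_examples(toy, selected)
```

With real 5 min data the same calls would use window=WINDOW, so that every candidate feature is a load value from the previous week.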

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 analyses the data characteristics. Section 4 describes the proposed feature selection methods and how they were applied to our task. Section 5 presents the prediction algorithms we used and their parameters. Section 6 describes the methods used for comparison. Section 7 summarizes the experimental setup. Section 8 presents and discusses the results. Finally, Section 9 concludes the paper.

Section snippets

Previous work

VSTLF is a relatively new area that has become important with the introduction of competitive electricity markets, and more recently, with the arrival of the smart grid. In contrast, short-term load forecasting has been widely studied, e.g. see [11], [12], [13], [14].

There are two main groups of approaches for VSTLF: traditional statistical and computational intelligence. Prominent examples of the first group are exponential smoothing and ARIMA; these methods are linear and model-based. The

Data analysis

We use electricity load data measured at 5 min intervals for a period of two years: from 1st January 2006 until 31st December 2007. Each measurement represents the total electricity load for the state of New South Wales (NSW) in Australia. The data was provided by the Australian Energy Market Operator (AEMO) [18].

In order to build accurate prediction models, it is important to understand the data characteristics and the external variables affecting the forecasting.

Feature selection

Feature selection is the process of removing irrelevant and redundant features and selecting a small set of informative features that are necessary and sufficient for good prediction. Feature selection has been an active area of research in machine learning and statistics [8], [9], [10], [20]. Feature selection increases predictive accuracy by reducing overfitting and addressing the curse of dimensionality problem. It also affects the speed of the prediction algorithm – smaller feature set
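
To illustrate why a selector that only measures linear dependence can miss a relevant feature, the following sketch compares the Pearson correlation with a simple histogram-based mutual information estimate on a purely non-linear relationship. The binning estimator, the number of bins and the function names are assumptions made for illustration; they are not the MI estimator used in the paper.

```python
import math
from collections import Counter
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def binned(values, bins):
    """Map each value to an equal-width bin index in 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # guard against a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys, bins=8):
    """Histogram estimate of the mutual information (in nats)."""
    bx, by = binned(xs, bins), binned(ys, bins)
    n = len(xs)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log((c * n) / (px[i] * py[j]))
               for (i, j), c in pxy.items())


# y depends on x only through x squared: no linear trend, strong dependence.
xs = [i / 100 for i in range(-100, 101)]
ys = [x * x for x in xs]
linear_score = pearson(xs, ys)
mi_score = mutual_information(xs, ys)
```

Here linear_score is numerically close to zero while mi_score is clearly positive, which is the kind of relationship that MI and RF can pick up and a purely correlation-based ranking would discard.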

Prediction algorithms

We applied three state-of-the-art machine learning algorithms, representing different learning paradigms: NN, LR and MTR.

Prediction methods used for comparison

We compare the performance of our approach with four baselines, a typical industry model and three different versions of the exponential smoothing method. Exponential smoothing is one of the most popular and successful econometric methods used for electricity forecasting.

Data

The available data is a time series of 5 min electricity loads for two years, 2006 and 2007. The total number of samples is 210,240 (2 years × 365 days × 24 h × 12 measurements). There were 272 missing data points (0.1% of all data) that were replaced with the average of the previous 3 load values. The data has been normalized between −1 and 1. For our prediction task, one example is a 2016-dimensional feature vector after the initial feature selection and a 35–50-dimensional vector after the secondary
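
The figures and preprocessing steps in this paragraph can be checked with a short sketch. It assumes that missing readings are represented as None, that consecutive gaps are filled using already-filled values, and that the normalization is a min-max scaling; the source only states the target range of −1 to 1, so these details and the function names are assumptions.

```python
STEPS_PER_DAY = 24 * 12                    # 5 min readings per day
total_samples = 2 * 365 * STEPS_PER_DAY    # two years of data
window_lags = 7 * STEPS_PER_DAY            # 1 week of candidate features
missing_share = 100 * 272 / total_samples  # percentage of missing points


def fill_missing(loads, n_prev=3):
    """Replace each None with the mean of the previous n_prev values."""
    filled = []
    for value in loads:
        if value is None:
            recent = filled[-n_prev:]
            if not recent:
                raise ValueError("no earlier values available to average")
            value = sum(recent) / len(recent)
        filled.append(value)
    return filled


def scale_to_unit_range(values):
    """Min-max scale values to the interval [-1, 1] (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]


# Toy usage: one missing reading, then scaling to [-1, 1].
raw = [6.0, 9.0, 12.0, None, 15.0]
clean = fill_missing(raw)
scaled = scale_to_unit_range(clean)
```

In this sketch total_samples evaluates to 210,240 and window_lags to 2016, matching the sample count and the candidate feature dimensionality quoted in the paragraph.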

Results and discussion

Table 4 shows the performance of the four proposed feature sets with NN, LR and MTR. Table 5 shows the performance of the baselines and the methods used for comparison.

Conclusions

We considered the task of predicting the electricity load one step ahead from a time series of previous electricity loads measured every 5 min. We evaluated the performance of four feature selection methods – three advanced machine learning (MI, RF and CFS) and one traditional statistical method (AC). These methods differ in the type of relationships they detect (both linear and non-linear), ability to capture relationships between features and the generation of the feature subset (explicit or

References (40)

J.W. Taylor, An evaluation of methods for very short-term load forecasting using minute-by-minute British data, Int. J. Forecast. (2008)
R. Kohavi et al., Wrappers for feature selection, Artif. Intell. (1997)
J.W. Taylor et al., A comparison of univariate methods for forecasting electricity demand up to a day ahead, Int. J. Forecast. (2006)
G.A. Darbellay et al., Forecasting the short-term demand for electricity – do neural networks stand a better chance?, Int. J. Forecast. (2000)
G.P. Zhang et al., Neural network forecasting for seasonal and trend time series, Euro. J. Oper. Res. (2005)
P.H. Franses et al., Recognising changing seasonal patterns using artificial neural networks, J. Economet. (1997)
C. Chen et al., Online 24 h solar power forecasting based on weather type classification using artificial neural networks, Solar Energy (2011)
S.C. Chan et al., Load/price forecasting and managing demand response for smart grids, IEEE Signal Proc. Mag. (2012)
K. Liu et al., Comparison of very short-term load forecasting techniques, IEEE Trans. Power Syst. (1996)
W. Charytoniuk et al., Very short-term load forecasting using artificial neural networks, IEEE Trans. Power Syst. (2000)
P. Shamsollahi, K.W. Cheung, Q. Chen, E.H. Germain, A neural network based very short term load forecaster for the...
D. Chen, M. York, Neural network based very short term load prediction, in: Proceedings of the IEEE Power and Energy...
I. Koprinska, M. Rana, V.G. Agelidis, Yearly and seasonal models for electricity load forecasting, in: Proceedings of...
I. Guyon et al., An introduction to variable and feature selection, J. Mach. Learn. Res. (2003)
L. Yu et al., Efficient feature selection via analysis of relevance and redundancy, J. Mach. Learn. Res. (2004)
H.S. Hippert et al., Neural Networks for short-term load forecasting: a review and evaluation, IEEE Trans. Power Syst. (2001)
E.A. Feinberg et al., Load forecasting
S. Fan et al., Short-term load forecasting based on a semi-parametric additive model, IEEE Trans. Power Syst. (2012)
F. Martínez-Álvarez et al., Energy time series forecasting based on pattern sequence similarity, IEEE Trans. Knowl. Data Eng. (2011)
A.J.R. Reis et al., Feature extraction via multiresolution analysis for short-term load forecasting, IEEE Trans. Power Syst. (2005)

