Elsevier

Knowledge-Based Systems

Volume 82, July 2015, Pages 29-40

Correlation and instance based feature selection for electricity load forecasting

https://doi.org/10.1016/j.knosys.2015.02.017

Abstract

Appropriate feature (variable) selection is crucial for accurate forecasting. In this paper we consider the task of forecasting the future electricity load from a time series of previous electricity loads, recorded every 5 min. We propose a two-step approach that identifies a set of candidate features based on the data characteristics and then selects a subset of them using correlation and instance-based feature selection methods, applied in a systematic way. We evaluate the performance of four feature selection methods – one traditional (autocorrelation) and three advanced machine learning (mutual information, RReliefF and correlation-based), in conjunction with state-of-the-art prediction algorithms (neural networks, linear regression and model tree rules), using two years of Australian electricity load data. Our results show that all feature selection methods were able to identify small subsets of highly relevant features. The best two prediction models utilized instance and autocorrelation based feature selectors and an efficient neural network prediction algorithm. They were more accurate than advanced exponential smoothing prediction models, a typical industry model and other baselines used for comparison.

Introduction

Forecasting the future electricity load is an important task in the management of modern energy systems. It is used to make decisions about committing generators, setting reserve requirements for security and scheduling maintenance. Its goal is to ensure reliable electricity supply while minimizing the operating cost.

Electricity load forecasting is classified into four types based on the forecasting horizon: long-term (years ahead), medium-term (months to a year ahead), short-term (1 day to weeks ahead) and very short-term (minutes and hours ahead). In this paper we consider Very Short-Term Load Forecasting (VSTLF), in particular 5 min ahead forecasting. VSTLF plays an important role in competitive energy markets such as the Australian national electricity market. It is used by the market operator to set the required demand and its price, and by the market participants to prepare bids. The importance of VSTLF is growing with the emergence of smart grid technology, because demand response mechanisms and real-time pricing require predictions at very short intervals [1].

Predicting the electricity load with high accuracy is a challenging task. The electricity load time series is complex and non-linear, with daily, weekly and annual cycles. It also contains random components due to fluctuations in the electricity usage of individual users, large industrial units with irregular hours of operation, special events and holidays and sudden weather changes.

Various approaches for VSTLF have been proposed; the most successful are based on Holt–Winters exponential smoothing and Autoregressive Integrated Moving Average (ARIMA) [2], Linear Regression (LR) and Neural Networks (NNs) trained with the backpropagation algorithm [3], [4], [5], [6], [7]. The problem of feature selection for VSTLF, however, has not received enough attention, and it is the focus of this paper.

Feature (variable) selection is the process of selecting a set of representative features (variables) that are relevant and sufficient for building a prediction model. It has been an active research area in machine learning [8], [9], [10]. Good feature selection improves the predictive accuracy, speeds up training and reduces the complexity of the prediction model. It is considered one of the key factors for successful prediction.

Most of the existing approaches for VSTLF identify features in a non-systematic way or use standard autocorrelation analysis, which only captures linear dependencies between the predictor variables and the output variable that is predicted. The main goal of this paper is to show how advanced machine learning feature selection methods can be applied for electricity load forecasting, and more generally to energy time series forecasting. In particular, our contribution can be summarized as follows:

  • We adapt and apply three advanced machine learning feature selection algorithms – Mutual Information (MI), RReliefF (RF) and Correlation-Based Selection (CFS) – to the task of load forecasting. We chose these methods as they are appropriate for the nature of the electricity load data – they can identify both linear and non-linear relationships (MI and RF) and capture both relevant and redundant features (CFS, RF), see Section 3. For comparison we also apply a method based on Autocorrelation (AC). We show how these feature selection methods can be applied in a systematic way to energy time series.

  • We propose a two-step approach for feature selection. In the first step we form a set of candidate features by applying a 1 week sliding window. A 1 week sliding window greatly reduces dimensionality while still capturing the main characteristics of the data. In the second step we use a feature selection method to evaluate the quality of the candidate features and select a final subset of features.

  • We use the selected features with state-of-the-art prediction algorithms: NN, LR and Model Tree Rules (MTR). Hippert et al. [11] reviewed the application of NNs for electricity load forecasting and noted the need for systematic and fair comparison between NNs, standard linear statistical methods such as LR and other prediction algorithms.

  • We conduct a comprehensive evaluation using two years of Australian electricity data. This includes a comparison with exponential smoothing (one of the most successful methods for load forecasting), a typical prediction model used by industry forecasters and several other benchmarks.

  • We investigate additional aspects of the feature selection algorithms, such as the effect of the number of neighbors in AC and the number of features in MI and RF.
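The two-step approach in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the second step uses a plain ranking by absolute Pearson correlation as a stand-in for an autocorrelation-style selector, and the toy window of 8 lags replaces the 1 week window (7 days × 24 h × 12 samples per hour = 2016 lags).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(vx * vy)


def candidate_examples(series, window):
    """Step 1: slide a window over the series.

    Each example pairs the `window` previous loads (lag 1 = most recent)
    with the next load, which is the value to predict.
    """
    rows, targets = [], []
    for t in range(window, len(series)):
        rows.append([series[t - lag] for lag in range(1, window + 1)])
        targets.append(series[t])
    return rows, targets


def select_lags(series, window, k):
    """Step 2 (assumed selector): keep the k lags with the largest
    absolute correlation with the target; ties go to the shorter lag."""
    rows, targets = candidate_examples(series, window)
    scores = []
    for j in range(window):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, targets)), j + 1))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(lag for _, lag in scores[:k])


# Made-up toy series with a period of 4 samples.
toy = [0.0, 1.0, 0.0, -1.0] * 10
selected = select_lags(toy, window=8, k=2)
```

On this toy series the even lags are perfectly (anti-)correlated with the target and the odd lags are uncorrelated, so the two shortest even lags are kept. A selector that also accounts for redundancy between features, as CFS does, would need an extra term that this sketch omits.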

The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 analyses the data characteristics. Section 4 describes the proposed feature selection methods and how they were applied to our task. Section 5 presents the prediction algorithms we used and their parameters. Section 6 describes the methods used for comparison. Section 7 summarizes the experimental setup. Section 8 presents and discusses the results. Finally, Section 9 concludes the paper.

Section snippets

Previous work

VSTLF is a relatively new area that has become important with the introduction of competitive electricity markets, and more recently, with the arrival of the smart grid. In contrast, short-term load forecasting has been widely studied, e.g. see [11], [12], [13], [14].

There are two main groups of approaches for VSTLF: traditional statistical and computational intelligence. Prominent examples of the first group are exponential smoothing and ARIMA; these methods are linear and model-based. The

Data analysis

We use electricity load data measured at 5 min intervals for a period of two years: from 1st January 2006 until 31st December 2007. Each measurement represents the total electricity load for the state of New South Wales (NSW) in Australia. The data was provided by the Australian Energy Market Operator (AEMO) [18].

In order to build accurate prediction models, it is important to understand the data characteristics and the external variables affecting the forecasting.

Feature selection

Feature selection is the process of removing irrelevant and redundant features and selecting a small set of informative features that are necessary and sufficient for good prediction. Feature selection has been an active area of research in machine learning and statistics [8], [9], [10], [20]. Feature selection increases predictive accuracy by reducing overfitting and addressing the curse of dimensionality problem. It also affects the speed of the prediction algorithm – smaller feature set
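As a rough illustration of why a relevance score that goes beyond linear correlation can matter, the sketch below compares Pearson correlation with a histogram estimate of mutual information on a made-up quadratic relationship. The estimator used in the paper is not given in this excerpt; the equal-width binning, the bin count and the function names here are assumptions.

```python
from collections import Counter
from math import log, sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient (linear dependence only)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx and vy else 0.0


def mutual_information(xs, ys, bins=4):
    """Histogram estimate of I(X; Y) in nats from paired samples."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant input falls into one bin
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # sum over occupied cells of p(i, j) * log(p(i, j) / (p(i) * p(j)))
    return sum(
        (c / n) * log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


# Made-up data: y depends on x, but not linearly.
xs = list(range(-10, 11))
ys = [x * x for x in xs]
linear_score = pearson(xs, ys)
mi_score = mutual_information(xs, ys)
```

Here the correlation is zero because the relationship is symmetric around zero, while the mutual information estimate is clearly positive. Histogram estimates are sensitive to the bin count and sample size, so the absolute value should not be over-interpreted.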

Prediction algorithms

We applied three state-of-the-art machine learning algorithms, representing different learning paradigms: NN, LR and MTR.

Prediction methods used for comparison

We compare the performance of our approach with four baselines, a typical industry model and three different versions of the exponential smoothing method. Exponential smoothing is one of the most popular and successful econometric methods used for electricity forecasting.
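For orientation, the simplest member of the exponential smoothing family can be sketched as below. The three versions used in the paper are not specified in this excerpt, so this non-seasonal form, the smoothing parameter `alpha` and the mean absolute error used for scoring are assumptions chosen only to show the mechanics of a one-step-ahead baseline.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    forecasts[t] predicts series[t] using only series[:t]. The level is
    initialised with the first observation, so forecasts[0] is None.
    """
    level = series[0]
    forecasts = [None]
    for value in series[1:]:
        forecasts.append(level)  # forecast made before seeing `value`
        level = alpha * value + (1 - alpha) * level
    return forecasts


def mean_absolute_error(actual, predicted):
    """Average absolute error over the positions that have a forecast."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if p is not None]
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


loads = [10.0, 20.0, 30.0]  # made-up values
smoothed = ses_forecasts(loads, alpha=0.5)
naive = ses_forecasts(loads, alpha=1.0)  # alpha = 1 repeats the last value
```

With `alpha = 1.0` the forecast is simply the previous load, which gives a natural naive baseline to compare against.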

Data

The available data is a time series of 5 min electricity loads for two years, 2006 and 2007. The total number of samples is 210,240 (2 years × 365 days × 24 h × 12 measurements). There were 272 missing data points (0.1% of all data) that were replaced with the average of the previous 3 load values. The data has been normalized between −1 and 1. For our prediction task, one example is a 2016-dimensional feature vector after the initial feature selection and a 35–50-dimensional vector after the secondary
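The sample counts and preprocessing steps described above can be checked with a short sketch. The function names and the example loads are hypothetical, and linear min-max scaling is an assumption, since the excerpt states only the target range of −1 to 1.

```python
def fill_missing(loads, history=3):
    """Replace None entries with the mean of the previous `history` values.

    Assumes the series does not start with a gap. Values filled earlier
    are reused if several gaps occur close together.
    """
    filled = []
    for value in loads:
        if value is None:
            recent = filled[-history:]
            value = sum(recent) / len(recent)
        filled.append(value)
    return filled


def normalize(loads, low=-1.0, high=1.0):
    """Linearly rescale so the minimum maps to `low` and the maximum to `high`."""
    lo, hi = min(loads), max(loads)
    if hi == lo:
        raise ValueError("cannot rescale a constant series")
    scale = (high - low) / (hi - lo)
    return [low + (v - lo) * scale for v in loads]


SAMPLES_PER_DAY = 24 * 12                   # one reading every 5 min
TOTAL_SAMPLES = 2 * 365 * SAMPLES_PER_DAY   # two years of data
WEEK_WINDOW = 7 * SAMPLES_PER_DAY           # candidate lags from a 1 week window

raw = [8000.0, 8200.0, 8400.0, None, 9000.0]  # made-up loads
clean = fill_missing(raw)
scaled = normalize(clean)
```

The constants reproduce the figures in the text: 210,240 samples in total and 2016 candidate features per example.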

Results and discussion

Table 4 shows the performance of the four proposed feature sets with NN, LR and MTR. Table 5 shows the performance of the baselines and the methods used for comparison.

Conclusions

We considered the task of predicting the electricity load one step ahead from a time series of previous electricity loads measured every 5 min. We evaluated the performance of four feature selection methods – three advanced machine learning (MI, RF and CFS) and one traditional statistical method (AC). These methods differ in the type of relationships they detect (both linear and non-linear), ability to capture relationships between features and the generation of the feature subset (explicit or

References (40)

  • P. Shamsollahi, K.W. Cheung, Q. Chen, E.H. Germain, A neural network based very short term load forecaster for the...
  • D. Chen, M. York, Neural network based very short term load prediction, in: Proceedings of the IEEE Power and Energy...
  • I. Koprinska, M. Rana, V.G. Agelidis, Yearly and seasonal models for electricity load forecasting, in: Proceedings of...
  • I. Guyon et al., An introduction to variable and feature selection, J. Mach. Learn. Res. (2003)
  • L. Yu et al., Efficient feature selection via analysis of relevance and redundancy, J. Mach. Learn. Res. (2004)
  • H.S. Hippert et al., Neural Networks for short-term load forecasting: a review and evaluation, IEEE Trans. Power Syst. (2001)
  • E.A. Feinberg et al., Load forecasting
  • S. Fan et al., Short-term load forecasting based on a semi-parametric additive model, IEEE Trans. Power Syst. (2012)
  • F. Martínez-Álvarez et al., Energy time series forecasting based on pattern sequence similarity, IEEE Trans. Knowl. Data Eng. (2011)
  • A.J.R. Reis et al., Feature extraction via multiresolution analysis for short-term load forecasting, IEEE Trans. Power Syst. (2005)