Data-driven symbolic ensemble models for wind speed forecasting through evolutionary algorithms

doi:10.1016/j.asoc.2019.105976

Applied Soft Computing

Volume 87, February 2020, 105976

https://doi.org/10.1016/j.asoc.2019.105976

Highlights

• An evolutionary algorithm (EA) is proposed for postprocessing wind speed forecasts.
• EA uses past and future data from the target location and its neighbors as inputs.
• A single solution for the entire year for each forecast range, time and location.
• EA outperformed the benchmarks, with median forecast errors roughly 7–56% lower.
• The human-interpretable solutions automatically inferred by EA were analyzed.

Abstract

Non-linear data-driven symbolic models have been gaining traction in many fields due to their distinctive combination of modeling expressiveness and interpretability. Despite that, they are still rather unexplored for ensemble wind speed forecasting, leaving behind new promising avenues for advancing the development of more accurate models which impact the efficiency of energy production. In this work, we develop a methodology based on the evolutionary algorithm known as grammatical evolution, and apply it to build forecasting models of near-surface wind speed over five locations in northeastern Brazil. Taking advantage of the symbolic nature of the models built, we conducted an extensive series of post-analyses. Overall, our models reduced the forecasting errors by 7%–56% when compared with other techniques, including a real-world operational ensemble model used in Brazil.

Introduction

The worst energy crisis in Brazil’s history, which occurred in 2001, together with the drought that has affected the northeast region for the last seven years, has led to substantial changes in the Brazilian energy matrix, which now includes renewable energy sources other than hydropower. In northeastern Brazil, the main source of energy is the wind power available in the region, which currently provides more than 50% of the daily energy production. The northeast of Brazil is among the best regions in the world for wind farming, with strong, persistent, and well-behaved winds [1], [2], [3].

The cubic relationship between the wind speed and the amount of energy produced by a wind turbine makes wind speed forecasting important for decision-making in the operations management of the wind power system, in order to avoid failures on the power grid. One of the major research challenges in wind power generation is therefore to provide more accurate and reliable wind speed and power forecasts. Wind direction forecasts have received much less attention because, in practice, the turbines are able to automatically align their blades to the prevailing wind direction.
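To make the cubic relationship concrete, the following minimal sketch uses the standard idealized turbine power formula P = 0.5 · ρ · A · Cp · v³ to show how a small error in forecast wind speed is amplified in the power estimate. The rotor diameter, air density, and power coefficient are illustrative assumptions, not values from this paper, and the formula ignores cut-in, rated, and cut-out speeds.

```python
import math


def wind_power_watts(speed_m_s, rotor_diameter_m=100.0,
                     air_density=1.225, power_coefficient=0.4):
    """Idealized power output, P = 0.5 * rho * A * Cp * v**3.

    All default parameter values are illustrative assumptions.
    """
    swept_area = math.pi * (rotor_diameter_m / 2.0) ** 2
    return 0.5 * air_density * swept_area * power_coefficient * speed_m_s ** 3


def relative_power_error(relative_speed_error):
    """Relative error in power caused by a relative error in wind speed."""
    return (1.0 + relative_speed_error) ** 3 - 1.0


# Doubling the wind speed multiplies the idealized power by 2**3 = 8.
ratio = wind_power_watts(10.0) / wind_power_watts(5.0)
print(f"power ratio for doubled speed: {ratio:.1f}")

# A 10% overestimate of wind speed becomes a 33.1% overestimate of power.
print(f"power error for +10% speed error: {relative_power_error(0.10):.1%}")
```

Under these assumptions, a forecast that is off by 10% in wind speed is off by about a third in power, which is why small gains in wind speed accuracy matter for grid operations.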

An extensive literature review on wind speed and power forecasting is given by Giebel et al. [4], who discuss a collection of published research relevant to the wind power management system. The statistical methods currently adopted include autoregressive models, moving average models, autoregressive moving average models, autoregressive integrated moving average models, and Kalman filters [5], [6], [7]. From the machine learning area, artificial neural networks [8], [9], [10] are the most widely used approach, although fuzzy logic, support vector machines, neuro-fuzzy networks [11], and hybrid models [5], [12], [13] have also been tested. Ensemble weather forecast techniques for post-processing wind speed forecasts consist of combining multiple numerical weather prediction (NWP) models using fixed linear approaches based on a simple average or a performance-based weighted average, such as the MASTER super model ensemble system (MSMES) [14] and Bayesian model averaging [15], [16].
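The two fixed linear combinations mentioned above can be sketched as follows. This is a minimal illustration only: the inverse mean-absolute-error weighting is an assumed, generic performance-based scheme, not the actual weighting used by MSMES or by Bayesian model averaging, and all function names are hypothetical.

```python
def simple_average(forecasts):
    """Average several NWP forecasts; `forecasts` is a list of equal-length lists."""
    n_models = len(forecasts)
    return [sum(values) / n_models for values in zip(*forecasts)]


def mean_absolute_error(predicted, observed):
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def inverse_error_weights(past_forecasts, past_observations, eps=1e-9):
    """Assumed scheme: weight each model by 1 / MAE on a past period, normalized to 1."""
    inverse = [1.0 / (mean_absolute_error(f, past_observations) + eps)
               for f in past_forecasts]
    total = sum(inverse)
    return [w / total for w in inverse]


def weighted_average(forecasts, weights):
    """Fixed linear combination of the model forecasts."""
    return [sum(w * v for w, v in zip(weights, values))
            for values in zip(*forecasts)]


# Two hypothetical models evaluated on a short training period (wind speed in m/s).
observed = [4.0, 4.0]
model_a = [5.0, 5.0]   # MAE = 1.0
model_b = [7.0, 7.0]   # MAE = 3.0
weights = inverse_error_weights([model_a, model_b], observed)

# New forecasts from the same two models, combined both ways.
new_forecasts = [[4.0, 6.0], [8.0, 10.0]]
print(simple_average(new_forecasts))
print(weighted_average(new_forecasts, weights))
```

Both combinations are linear with a fixed structure: only the weights change, never the form of the expression.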

However, the aforementioned techniques have some well-known limitations, such as linear representations with fixed structures, black-box frameworks, and the improvement of only a single solution during the optimization process. Despite these limitations, little has been explored beyond them. For instance, there have been very few studies concerning nature-inspired algorithms in the wind-forecasting literature. Until recently, nature-inspired algorithms such as the artificial bee colony algorithm [17], particle swarm optimization [18], [19], [20], and differential evolution [20] have been used exclusively for parameter optimization rather than for solving the wind-forecasting problem directly. To the best of our knowledge, the powerful evolutionary algorithm known as grammatical evolution has never been explored as an alternative tool for tackling the challenges related to wind power generation systems. A successful application of grammatical evolution to ensemble forecast problems may be found in Dufek et al. [21], where the authors focused on ensemble forecasts of the daily rainfall amount at 317 locations in Brazil.

In contrast to single-solution algorithms, which tend to get stuck in local optima, grammatical evolution (GE) is a population-based, stochastic, global optimization algorithm. Together with its high degree of parallelism, this allows for a better exploration of the whole search space, which in turn increases the probability of finding the global optimum. The white-box nature of the GE models provides advantages and further applications over black-box approaches, such as (i) the direct extraction of knowledge from the resulting symbolic solutions and (ii) the validation of the GE solutions’ structures by experts. In theory, GE-based modeling can be defined as a general-purpose technique, as opposed to specific-purpose techniques such as the conventional statistical methods and the spatial correlation models. In other words, the search space of GE-based modeling extends those of the specific-purpose techniques, as it enables the use of not only standard arithmetic operators and NWP models, but also several other linear and non-linear operators (e.g. mathematical, logical, relational and conditional) as well as input attributes chosen according to the problem at hand. GE-based modeling can evolve expressions of arbitrary structure guided by formal grammars. In the context of wind speed forecasting, it is capable of, for example, (i) correcting an NWP model; (ii) optimizing a multi-model ensemble forecast; or (iii) potentially recovering the mathematical expressions associated with spatial–temporal models. An introduction to GE is given in Section 2.1. For more details the reader is referred to [22], [23].
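The idea of evolving expressions guided by a formal grammar can be illustrated with the genotype-to-phenotype mapping commonly used in GE: each integer codon (usually decoded from a bit string) selects, modulo the number of alternatives, the production used to expand the leftmost non-terminal. The sketch below is a minimal illustration under stated assumptions: the toy grammar, the variable names `nwp_a`, `nwp_b` and `obs_lag1`, and the wrapping limit are hypothetical and are not the grammar or settings used in this paper.

```python
# Hypothetical toy grammar: every non-terminal maps to a list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"], ["<const>"]],
    "<op>": [["+"], ["-"], ["*"]],
    "<var>": [["nwp_a"], ["nwp_b"], ["obs_lag1"]],
    "<const>": [["0.5"], ["1.0"], ["2.0"]],
}


def map_genotype(codons, grammar=GRAMMAR, start="<expr>", max_wraps=2):
    """Map a list of integer codons to an expression string (the phenotype).

    One codon is consumed per expansion of the leftmost non-terminal. The
    codon list is reused ("wrapped") up to `max_wraps` times; if non-terminals
    remain after that, the individual is invalid and None is returned.
    """
    symbols = [start]
    output = []
    steps = 0
    max_steps = len(codons) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:        # terminal: emit it
            output.append(symbol)
            continue
        if steps >= max_steps:           # ran out of codons
            return None
        productions = grammar[symbol]
        codon = codons[steps % len(codons)]
        choice = productions[codon % len(productions)]
        steps += 1
        symbols = list(choice) + symbols
    return " ".join(output)


def evaluate(phenotype, inputs):
    """Evaluate the symbolic model on named inputs (toy grammar only)."""
    return eval(phenotype, {"__builtins__": {}}, dict(inputs))


phenotype = map_genotype([0, 1, 0, 1, 2, 1])
print(phenotype)                             # nwp_a - 1.0
print(evaluate(phenotype, {"nwp_a": 6.0}))   # 5.0
```

The resulting phenotype is a readable expression, here a bias correction of a single hypothetical NWP input, which is what makes the evolved models open to inspection by experts.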

The main contributions of this paper to the strategic planning of the wind power sector are (i) the proposal of a GE-based data-driven modeling approach to increase the accuracy and confidence of near-surface hourly wind speed point forecasts; (ii) a case study for one- to three-day-ahead forecasts at five locations in northeastern Brazil; (iii) an investigation into the influence of feature selection and execution time on the GE forecast accuracy; (iv) information about the spatial and temporal variability of the GE forecast accuracy; (v) a comparison of the accuracy of the GE solutions with those obtained from five other approaches; and (vi) a way of extracting knowledge and insights from the human-interpretable (“white-box”) solutions given by GE through sensitivity analysis. The methodology designed to deal with the wind-forecasting problem is based on three steps. The first step is the feature selection from a pool of possible predictors, containing individual and ensemble numerical weather forecasts, and historical data from the target location and its vicinity. The second step is the execution of the GE algorithm, making use of multi- and many-core parallelism. In the third step, the best solutions are chosen and analyzed.

Section snippets

Grammatical evolution

Grammatical evolution (GE) is an algorithm that automatically builds functional structures (“programs”) by means of an iterative optimization process inspired by the evolutionary principle of natural selection [22]. It is essentially a genetic algorithm [24], sharing the same representation and breeding process carried out over a number of generations, but equipped with a more sophisticated mechanism to map the genotype space (population of individuals encoded as bit-arrays) into the phenotype

Results and discussion

In this section several GE experiments are analyzed from two perspectives: technique- and problem-centered analyses.

The technique-centered analysis studies the influence of feature selection (Section 3.1) and execution time (Section 3.2) on the accuracy of the 24-, 48- and 72-h forecasts of near-surface hourly wind speed given by GE over northeastern Brazil. Thereafter, in Section 3.3 the GE-based data-driven modeling is compared in terms of forecast error with MSMES and four other approaches:

Conclusions and future work

This paper presented a GE-based data-driven modeling of the 24-, 48- and 72-h forecasts of near-surface hourly wind speed at five locations over northeastern Brazil. Several GE experiments were conducted under two different points of view: (i) a technique-centered analysis which concerns the efficiency, effectiveness, scalability, and robustness of the GE-based data-driven modeling; and (ii) a problem-centered analysis which focuses on understanding the regression problem of predicting wind

Declaration of Competing Interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2019.105976.

Acknowledgments

The authors would like to thank the support provided by CNPq, Brazil (grants 312337/2017-5, 502836/2014-8 and 300458/2017-7), FAPEMIG, Brazil (grant APQ-03414-15), EU H2020 Programme and MCTI/RNP–Brazil under the HPC4E Project (grant 689772).

References (36)

de Araujo Lima L. et al. Wind energy assessment and wind farm simulation in Triunfo – Pernambuco, Brazil. Renew. Energy (2010)
Liu H. et al. Comparison of two new ARIMA-ANN and ARIMA-Kalman hybrid methods for wind speed prediction. Appl. Energy (2012)
Erdem E. et al. ARMA based approaches for forecasting the tuple of wind speed and direction. Appl. Energy (2011)
Qureshi A.S. et al. Wind power prediction using deep neural network based meta regression and transfer learning. Appl. Soft Comput. (2017)
Carolin Mabel M. et al. Analysis of wind power generation and prediction using ANN: A case study. Renew. Energy (2008)
Kalogirou S.A. Artificial neural networks in renewable energy systems applications: A review. Renew. Sustain. Energy Rev. (2001)
Ma X. et al. A generalized dynamic fuzzy neural network based on singular spectrum analysis optimized by brain storm optimization for short-term wind speed forecasting. Appl. Soft Comput. (2017)
Wang J. et al. An innovative hybrid approach for multi-step ahead wind speed prediction. Appl. Soft Comput. (2019)
Zhang W. et al. Short-term wind speed forecasting based on a hybrid model. Appl. Soft Comput. (2013)
Osório G. et al. Short-term wind power forecasting using adaptive neuro-fuzzy inference system combined with evolutionary particle swarm optimization, wavelet transform and mutual information. Renew. Energy (2015)
Quan H. et al. Particle swarm optimization for construction of neural network-based prediction intervals. Neurocomputing (2014)
Jursa R. et al. Short-term wind power forecasting using evolutionary algorithms for the automated specification of artificial intelligence models. Int. J. Forecast. (2008)
Dufek A.S. et al. Application of evolutionary computation on ensemble forecast of quantitative precipitation. Comput. Geosci. (2017)
Silva dos Santos A.T. et al. Assessment of wind resources in two parts of northeast Brazil with the use of numerical models. Meteorol. Appl. (2016)
de Araujo Lima L. et al. Wind resource evaluation in São João do Cariri (SJC) – Paraíba, Brazil. Renew. Sustain. Energy Rev. (2012)
Giebel G. et al. The State-Of-The-Art in Short-Term Prediction of Wind Power: A Literature Overview (2011)
Caporin M. et al. Modelling and forecasting wind speed intensity for weather risk management. Comput. Statist. Data Anal. (2010)
Silva Dias P.L. et al. The Master Super Model Ensemble system (MSMES) (2006)
