Elsevier

Applied Soft Computing

Volume 54, May 2017, Pages 150-163

A new class of MODWT-SVM-DE hybrid model emphasizing on simplification structure in data pre-processing: A case study of annual electricity consumptions

https://doi.org/10.1016/j.asoc.2017.01.022

Highlights

  • The proposed model is developed to forecast thirty-two electricity consumption datasets.

  • Both data pre-processing and optimal parameter selection techniques are used.

  • A simplified structure for data pre-processing is introduced in the proposed model.

  • The proposed model achieves significantly higher accuracy than the compared models and nearly the highest precision.

  • The proposed model is a promising tool for forecasting electricity consumption.

Abstract

In recent years, electricity crises have remained noticeable in some countries because of a widening gap between demand and supply. Consequently, knowledge of future demand plays a significant role in the efficient management and utilization of electricity. For handling supply efficiently and increasing power system reliability, electricity demand forecasting is one of the most crucial tools. Decision makers all over the world use forecasting techniques to predict future demand as key information for sound policy. In this research, a hybrid model consisting of the maximal overlap discrete wavelet transform (MODWT), support vector machine (SVM), and differential evolution (DE) optimization, with an emphasis on simplifying the complex structure of data pre-processing, is proposed to forecast thirty-two annual electricity consumption datasets. It is compared with traditional forecasting models, a hybrid model of MODWT and SVM, and a combined model of SVM and DE optimization, using the mean absolute error (MAE), mean absolute percentage error (MAPE), and symmetric mean absolute percentage error (sMAPE) measures as well as the Friedman test and a post hoc test. The empirical results indicate that the proposed model outperforms the other forecasting models, providing more accurate forecasts than the other candidate models at the 0.05 significance level and nearly the highest precision. Consequently, the proposed model reduces the limitations of the individual models on annual electricity consumption data and can be used as a promising tool for forecasting annual electricity consumption.

Introduction

Driven by economic and population growth in recent years, countries generate large demands for energy to sustain their economic growth and development. At the same time, energy resources are not only scarce and depleting but have also been exploited exponentially over the past decades. As a result, an energy crisis has become noticeable in some countries (e.g., Pakistan) because of a widening gap between demand and supply. Electrical energy is a highly versatile form of energy that is used in almost every sector of an economy. Consequently, an electricity crisis is an important energy issue for the economic growth of any country unless electricity resources are managed efficiently. In Pakistan, for example, [15] describes the failure of the country's energy policy, which has left the country with an acute electricity crisis and caused its economic underperformance. In addition, the power shortage has become a major political issue and reflects the hardships faced by individuals and businesses. Iraq is another country that faces an electricity crisis [21]. To handle supply efficiently and increase power system reliability, accurate estimates of future electricity demand are needed so that electricity resources can be managed efficiently [14]. Consequently, the net consumption of electrical energy has to be estimated with the most suitable tools. One of the most crucial tools is electricity load forecasting, which decision makers all over the world use to predict future demand as key information for sound policy. Although there are arguments both against and in favor of using forecasts for policy analysis (e.g., [34], [43]), many researchers [15] support their use, arguing that forecasting provides guidelines for policy makers to take steps for the future based on past experience. In other words, the information provided by electricity load forecasts supports both the effective utilization and the efficient management of electricity resources.

In 2014, Ahmad et al. published a review of building energy consumption prediction in which the prediction methods are classified into three categories: engineering methods, statistical methods, and artificial intelligence methods. Among statistical methods, time series models are the simplest; they employ time series trend analysis to extrapolate future energy demand. One family of time series models is the autoregressive integrated moving average (ARIMA) models (e.g., [15], [39], [42], [7], [24], [10]), which are widely used in the literature and have been extended to many fields of science because of their good performance on linear problems. Nevertheless, their major disadvantage is the assumption that the time series has a linear form, so approximation by ARIMA models may not be sufficient for complex nonlinear real-world problems. Furthermore, it is often difficult to determine whether a time series is generated by a linear or a nonlinear underlying process. Therefore, ARIMA models can still be fitted and used to predict future values, but they are not guaranteed to be the best model in all circumstances.

Accordingly, artificial intelligence methods are the most widely implemented methods in energy forecasting (e.g., [9], [22], [29], [32], [23]). Among them, artificial neural networks (ANNs) and SVMs are supervised learning models that are widely implemented because of their accuracy and their ability to forecast nonlinear and complex problems. Several advantages over statistical models make them attractive for forecasting tasks. First, they have a flexible nonlinear function mapping ability. Second, as data-driven models, they impose few prior assumptions on the model formulation. Third, they are adaptive in nature. These advantages explain why ANN and SVM models have attracted so much attention in time series forecasting. Since each of these artificial intelligence models has its own advantages and disadvantages, it is hard to decide which is best for forecasting. Nevertheless, SVM models have attracted more attention in time series extrapolation because they implement the structural risk minimization principle rather than the empirical risk minimization principle implemented by most ANN models [30], [5], [14]. Based on this principle, SVM models achieve an optimum network structure and always provide a unique and globally optimal solution because the optimization problem is convex [3], [30]. Consequently, SVM models have been applied successfully in time series forecasting, where they have been reported to perform better than ANN models and other conventional statistical models [30].
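
For reference, the convexity mentioned above can be seen in the standard ε-insensitive support vector regression problem. The formulation below is the textbook form, not taken from this excerpt; the symbols w, b, φ, C, ε, ξ_i and ξ_i^* are the usual ones and are assumptions here, since the paper's own notation is not shown.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad & \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right) \\
\text{subject to} \quad & y_i - w^{\top}\phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{\top}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \qquad i = 1, \dots, n.
\end{aligned}
```

The term ½‖w‖² penalizes model complexity while the slack terms penalize training errors larger than ε, which is how the structural risk trade-off appears in practice. The objective is a convex quadratic and the constraints are linear, so any local minimum is a global one.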

Despite the growing popularity of SVM models, their performance depends on an appropriate selection of the SVM hyper-parameters and of the feature set of the input data [25]. Consequently, the main concern for researchers is to find either proper parameter values for a given data set or a data pre-processing procedure that reduces irrelevant and redundant features in SVM regression, so as to ensure good generalization performance. Single forecasting models struggle to reach the higher accuracy demanded in recent years and cannot give satisfactory results in all situations. Many studies have therefore turned to combined forecasting models in order to reduce the risk of using an improper model and to achieve higher accuracy. In 2014, Tascikaraoglu and Uzunoglu published a review of combined models for short-term wind speed and power prediction in which combined models are classified into four categories: weighting-based combined approaches, combined approaches including data pre-processing techniques, combined approaches including parameter selection and optimization techniques, and combined approaches including error processing techniques. The summary of combined approaches is presented in Table 1.

Among these combined approaches, those involving data pre-processing and parameter selection are of particular interest because of their high performance and the ready availability of literature on them.

Combined approaches that include data pre-processing emphasize a preliminary step in which the original time series is decomposed into more stationary and regular subseries, which are generally easier to analyze because irrelevant and redundant features of the data set are filtered out. The resulting more stable subseries and more informative training data improve the quality of the data. One of numerous data pre-processing techniques is the discrete wavelet transform (DWT), a signal processing algorithm developed from the Fourier transform that is widely used in the literature. Mathematically, the DWT decomposes a time series signal into sub-components at different frequencies. One of its advantages over the Fourier transform is that the decomposed components can be analyzed at a well-scaled resolution. Because the DWT captures the requisite information at various levels, it enhances the capacity of the model under study. Additionally, the DWT is suitable for analyzing data in both the frequency and time domains because it can extract information from non-periodic and transient signals, which makes it valuable for time-frequency localization. In this regard, the DWT can improve the performance of SVM models with respect to the feature set of the input data (e.g., [31], [37], [16], [30], [36]). Although the DWT can decompose the original time series into a more informative set of series by filtering out irrelevant and redundant features, a disadvantage of using it for direct prediction is its complex structure: the number of subseries components depends on the decomposition level, and a forecasting model must be fitted to each subseries before reconstruction, which makes the approach inflexible. Besides, the coupled models may not yield the most suitable model unless optimal parameter selection techniques are also employed. The basic flowchart of combined approaches including data pre-processing techniques is illustrated in Fig. 1.

Combined approaches that include parameter selection and optimization techniques take advantage of search algorithms to find the most suitable parameters of the forecasting models. Metaheuristic algorithms are one class of such search algorithms; they are becoming powerful tools for solving optimization problems and have a substantial history in fine-tuning machine learning algorithms [12]. The DE algorithm is a metaheuristic: a stochastic, population-based search method proposed by Storn and Price for solving nonlinear, high-dimensional, and complex computational optimization problems. It is considered one of the more recent evolutionary algorithms (EAs) for solving real-parameter optimization problems. It has many advantages, including simplicity of implementation, reliability, and robustness, and is generally regarded as an effective global optimization algorithm. In this regard, the DE algorithm can enhance the forecast performance of SVM models by selecting the most suitable hyper-parameters (e.g., [45], [40], [41]). Although the DE algorithm can improve the forecast performance of SVM models, the hybrid models may not yield the most suitable model unless data pre-processing techniques are also adopted efficiently.
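
As an illustration of how a population-based DE search proceeds, the following is a minimal sketch of the common DE/rand/1/bin variant in plain Python. It is not the paper's implementation: the control parameters (pop_size, f, cr, generations) are arbitrary choices, and toy_validation_error is a hypothetical stand-in for the validation error of an SVM as a function of two log-scaled hyper-parameters.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` over the box `bounds` with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initial population: uniform random vectors inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: pick three distinct members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for k in range(dim):
                if rng.random() < cr or k == forced:
                    # Binomial crossover takes this gene from the mutant vector.
                    value = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    lo, hi = bounds[k]
                    value = min(max(value, lo), hi)  # clip into the search box
                else:
                    value = pop[i][k]
                trial.append(value)
            # Greedy selection: keep the trial only if it is at least as good.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def toy_validation_error(params):
    # Hypothetical surrogate for an SVM validation error over
    # (log10 C, log10 gamma); its minimum is placed at (1.0, -2.0) by design.
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = differential_evolution(
    toy_validation_error, bounds=[(-3.0, 3.0), (-5.0, 1.0)]
)
```

In a real setting the objective would train an SVM with the candidate hyper-parameters and return its error on held-out data, which makes each evaluation far more expensive than in this toy case.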

To cover both aspects of improving the performance of SVM models, a few studies in the literature have exploited both data pre-processing and parameter selection techniques. Their results (e.g., [17], [44], [2], [25]) demonstrated that such combined approaches outperform the use of either a data pre-processing technique or a parameter selection technique alone.

Many measures of forecast accuracy are in use for assessing forecast performance. A research article [18] noted that many measures of the accuracy of univariate time series forecasts have been proposed in the past, and that they can be classified into scale-dependent measures, measures based on percentage errors, measures based on relative errors, and relative measures. Among the existing measures, the most commonly used are MAE and MAPE. Historically, the root mean square error (RMSE) has been popular, largely because of its theoretical relevance in statistical modeling. However, it is more sensitive to outliers than MAE, which has led some authors to recommend against its use in forecast accuracy evaluation; MAE may therefore still be preferred for evaluating forecast accuracy on data sets with the same scale. With regard to measures based on percentage errors, most textbooks (e.g., [13] p.120; [4] p.18) recommend MAPE, and it was the primary measure [26] in the M-competition. An advantage of MAPE is that percentage errors are scale-independent, so it is frequently used to compare forecast performance across different data sets in place of scale-dependent measures (e.g., MAE and RMSE). Although there are arguments (e.g., [27,28] p.45) against using MAPE in some circumstances (e.g., data lacking a meaningful zero, or its heavier penalty on positive errors than on negative errors), it may still be preferred because it is simple to explain. To reduce this disadvantage of MAPE, sMAPE was proposed to handle the problem of the heavier penalty on positive errors than on negative errors.
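
The three measures used in this study can be sketched in a few lines of Python. The definitions below are the commonly used ones and are an assumption here, because the paper's exact formulas are not shown in this excerpt; sMAPE in particular has several variants, and this sketch uses the mean of the absolute actual and forecast values as the denominator. The sample values are made up.

```python
def mae(actual, forecast):
    """Mean absolute error, in the units of the data (scale-dependent)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE with the mean of |actual| and |forecast| as denominator."""
    total = sum(
        abs(a - f) / ((abs(a) + abs(f)) / 2.0) for a, f in zip(actual, forecast)
    )
    return 100.0 * total / len(actual)


# Made-up values, for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

error_mae = mae(actual, forecast)
error_mape = mape(actual, forecast)
error_smape = smape(actual, forecast)

# Swapping the roles of actual and forecast changes MAPE but not sMAPE.
mape_a = mape([100.0], [150.0])
mape_b = mape([150.0], [100.0])
smape_a = smape([100.0], [150.0])
smape_b = smape([150.0], [100.0])
```

The last four lines show the asymmetry that motivates sMAPE: the same absolute miss of 50 units gives two different MAPE values depending on which quantity is the actual one, while sMAPE treats both cases identically.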

In this research, a new class of MODWT-SVM-DE hybrid model based on feature extraction and optimal parameter selection techniques is proposed to forecast thirty-two datasets of electricity consumption. The proposed model combines the strengths of its individual components so that each overcomes the limitations of the others. First, MODWT is employed to extract the meaningful features of the original time series into a single series of scaling coefficients. This feature extraction series is the most crucial part: it gives the signal its identity and reduces the noise of the non-stationary original time series. The feature extraction series is then used as the input series and combined with the original time series as the target series to form a new series. This technique differs from the data pre-processing techniques in the literature, which decompose the original time series into several subseries, apply forecasting models to predict the future value of each subseries, and reconstruct the forecast value from these predictions. The motivation for building the new series is to simplify the complex structure, which relies heavily on the number of subseries determined by the decomposition level, and to eliminate the reconstruction process. Second, SVM models are used to formulate the prediction function because of their good capability [30], [5], [14]. Third, DE is employed to search for the most suitable parameters of the SVM models and kernel functions in order to reduce the risk of using improper parameters. Furthermore, the proposed model is compared with ARIMA, SVM, a hybrid model of MODWT and SVM that emphasizes the data pre-processing technique, and a hybrid model of SVM and DE that emphasizes the optimal parameter selection technique. The forecasting performance of all models is evaluated using three accuracy measures: MAE, MAPE, and sMAPE. To compare forecast performance across different data sets, the MAPE and sMAPE measures are used with the Friedman test and a post hoc test in order to identify significant differences between the forecast performances of the models.
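
The first step described above, reducing the original series to a single series of scaling coefficients and pairing it with the original series, might look like the sketch below. Several points are assumptions rather than details from the paper: the Haar filter, the decomposition level of 2, circular boundary handling, and the one-step lag used to pair inputs with targets. The function names and the sample values are hypothetical.

```python
def haar_modwt_scaling(series, level):
    """Level-`level` MODWT scaling coefficients with the Haar filter (1/2, 1/2).

    Uses the pyramid algorithm with circular (periodic) boundary handling, so
    the output has the same length as the input.
    """
    n = len(series)
    smooth = list(series)
    for j in range(1, level + 1):
        shift = 2 ** (j - 1)  # filter taps sit 2**(j-1) samples apart at level j
        smooth = [0.5 * (smooth[t] + smooth[(t - shift) % n]) for t in range(n)]
    return smooth


def make_training_pairs(series, level=2, lag=1):
    """Pair lagged scaling coefficients (input) with original values (target)."""
    smooth = haar_modwt_scaling(series, level)
    return [([smooth[t - lag]], series[t]) for t in range(lag, len(series))]


# Made-up annual consumption values, for illustration only.
consumption = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 17.0, 21.0]
smooth = haar_modwt_scaling(consumption, level=2)
pairs = make_training_pairs(consumption, level=2, lag=1)
```

Because only one smoothed series is kept, a single regression model is trained on the pairs and no per-subseries forecasts need to be reconstructed. Note that the circular boundary makes the first few coefficients depend on values from the end of the series, so a careful implementation would discard or otherwise treat those boundary coefficients before training.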

The remainder of this paper is organized as follows. First, the methodologies are presented. Then, the cross-validation and the comparison of forecasting performance are discussed, followed by the results and discussion. Finally, the conclusions and the highlighted findings of the proposed model are summarized.

Section snippets

Autoregressive integrated moving average

The ARIMA model has dominated many linear problems in time series analysis and is a generalization of the autoregressive moving average model to non-stationary time series data. The model is generally referred to as an ARIMA(p, d, q) model with mean μ and has the form of Eq. (1):

\[ \left(1 - \sum_{i=1}^{p} \varphi_i B^i\right)(1 - B)^d (y_t - \mu) = \left(1 - \sum_{j=1}^{q} \theta_j B^j\right)\varepsilon_t \qquad (1) \]

where y_t and ε_t are the actual value and the random error at time period t, respectively. B is the backward shift operator; p and q are referred to as the orders of autoregressive

Cross-validation

The forecasting performance of all models is evaluated on thirty-two annual electricity consumption datasets using three measures of forecast accuracy, together with statistical analyses, namely the Friedman test and a post hoc test, in order to identify significant differences between the forecast performances of the models. The thirty-two annual datasets, covering 1980 to 2012, were obtained from the U.S. Energy Information Administration (EIA), which provides the information online at

Results and discussion

To support the superior capability of the proposed model, all of its key performance indicators are evaluated under many situations. First, its forecast accuracy is compared with that of the other forecast models using three accuracy measures, comprising a scale-dependent measure and measures based on percentage errors. Under these three measures, a forecast model that gives the lowest error on the most measures is considered to have greater capability than

Conclusion

All the evidence indicates that the proposed model has superior capability, with higher accuracy than the compared models at the 0.05 significance level and nearly the highest precision. It can therefore be concluded that the simplified structure proposed for the data pre-processing of the proposed model improves forecast accuracy compared with the SVM-DE models, which emphasize only the optimal parameter selection technique. Moreover, both techniques

Acknowledgements

The author gratefully acknowledges the referees of this article for their valuable and constructive comments, which helped to clarify and improve the presentation. The author would also like to express gratitude to the University of the Thai Chamber of Commerce.
