Hourly temperature forecasting using abductive networks

https://doi.org/10.1016/j.engappai.2004.04.002

Abstract

Hourly temperature forecasts are important for electrical load forecasting and other applications in industry, agriculture, and the environment. Modern machine learning techniques including neural networks have been used for this purpose. We propose using the alternative abductive networks approach, which offers the advantages of simplified and more automated model synthesis and transparent analytical input–output models. Dedicated hourly models were developed for next-day and next-hour temperature forecasting, both with and without extreme temperature forecasts for the forecasting day, by training on hourly temperature data for 5 years and evaluation on data for the 6th year. Next-day and next-hour models using extreme temperature forecasts give an overall mean absolute error (MAE) of 1.68 °F and 1.05 °F, respectively. Next-hour models may also be used sequentially to provide next-day forecasts. Performance compares favourably with neural network models developed using the same data, and with more complex neural networks, reported in the literature, that require daily training. Performance is significantly superior to naive forecasts based on persistence and climatology.

Introduction

Accurate forecasting of hourly air temperatures has a number of important applications in industry, agriculture, and the environment. Many short-term load forecasting (STLF) schemes for power utilities require hourly temperature forecasts (Fan and McDonald, 1994; Hwang et al., 1998; Khotanzad et al., 1998; Sharif and Taylor, 2000; Xu and Chen, 1999). Such forecasts are also used for predicting the gas send-out for a gas utility (e.g., Pattern Recognition Technologies ANNGSF) and for forecasting the 1 h-ahead heat load for a district heat load network (Seppälä et al., 2000). In agriculture, hourly air temperature forecasts can be used by disease warning systems and pest management schemes to predict conditions that are favourable for disease development in crops and to schedule appropriate actions such as spraying protective fungicides (Francis, 2000; Kim et al., 2002). Road weather information systems utilize forecasted hourly air temperatures for predicting road surface temperatures (Bogren and Gustavsson, 1994).

Temperature is the most important weather parameter affecting electric load generated by power utilities in many parts of the world, and therefore forecasted temperatures constitute a basic ingredient in load forecasting schemes. Forecasts for extreme (minimum and maximum) daily temperatures are provided by many weather services, but these alone are useful only for predicting the daily peak load. However, forecasting the full 24-h load curve is important for many scheduling and network analysis functions in power utilities. Because high–low temperature forecasts are usually provided without the times at which they occur, they cannot be used to generate the hourly load curve through regression and interpolation (Hippert et al., 2000). Schemes for hourly temperature forecasting have been developed in the context of short-term load forecasting and in some cases form an integrated part of the load forecaster (e.g., Fan and McDonald, 1994; Khotanzad et al., 1998). In other agricultural and environmental applications, even high–low temperature forecasts that are specific to the site of interest may not be available, and it is often preferred that temperature forecasts rely only on parameters that are available or can be measured on-site.

Schneider et al. (1985) obtained hourly temperature forecasts by first fitting a two-harmonics Fourier model to the temperature data of the past 21 days to produce a temperature day profile. Hourly forecasts were then obtained by stretching/contracting this profile so that its minimum and maximum points coincide with those forecasted for the day by the weather service. Fan and McDonald (1994) adopt a similar approach but use a day profile that is initialized with historical average data and updated by exponential smoothing with a time constant of 28 days. The complex and nonlinear nature of temperature variations and the abundance of historical data suggest that computational intelligence data-based modeling techniques would be good candidates for solving the temperature forecasting problem. In the load forecasting arena, the use of neural networks for temperature forecasting has been a natural extension to their use in load forecasting. The ANNSTLF neural network load forecasting system (Khotanzad et al., 1998) embodies a 7-day-ahead neural network hourly temperature forecaster that uses an adaptive daily update of the weights (Khotanzad et al., 1996). The day module of this forecaster uses back propagation neural networks to forecast the 24 hourly temperatures for day (d) using 28 inputs which include the minimum and maximum temperatures measured for day (d−2), the 24 hourly temperatures for day (d−1), and the forecasted minimum and maximum temperatures for day (d). One drawback to this approach is the large size of the neural networks involved, which implies a large number of weights to be estimated. The large input dimensionality relative to the number of training records may cause the estimation problem to be ill-posed, resulting in unstable networks for typical sizes of training sets (Hippert et al., 2000). The resulting over-fitting may also degrade model generalization, thus yielding poor out-of-sample forecasts (Hippert et al., 2001).
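
The profile-based idea attributed above to Schneider et al. (1985) can be illustrated with a short sketch. It is not their implementation: the function names, the synthetic data, and the choice of a plain linear rescaling for the "stretching/contracting" step are all assumptions made here for illustration.

```python
import numpy as np

def fit_two_harmonic_profile(hours, temps):
    """Least-squares fit of a mean plus two daily harmonics.

    Returns the fitted profile evaluated at hours 0..23.
    """
    def design(h):
        w = 2.0 * np.pi * np.asarray(h, dtype=float) / 24.0
        return np.column_stack(
            [np.ones_like(w), np.cos(w), np.sin(w), np.cos(2 * w), np.sin(2 * w)]
        )

    coef, *_ = np.linalg.lstsq(design(hours), np.asarray(temps, dtype=float), rcond=None)
    return design(np.arange(24)) @ coef

def rescale_profile(profile, tmin_forecast, tmax_forecast):
    """Linearly stretch/contract a day profile so that its extremes match
    the forecasted minimum and maximum (an assumed form of the stretching step)."""
    p = np.asarray(profile, dtype=float)
    span = p.max() - p.min()
    if span == 0:
        raise ValueError("a flat profile cannot be rescaled")
    return tmin_forecast + (p - p.min()) * (tmax_forecast - tmin_forecast) / span

# Synthetic stand-in for 21 days of hourly temperatures (not real data).
rng = np.random.default_rng(0)
hours = np.tile(np.arange(24), 21)
temps = 50 + 8 * np.cos(2 * np.pi * (hours - 15) / 24) + rng.normal(0, 1, hours.size)

profile = fit_two_harmonic_profile(hours, temps)
forecast = rescale_profile(profile, tmin_forecast=42.0, tmax_forecast=61.0)
```

The rescaled curve keeps the shape and timing of the fitted profile while its lowest and highest points take the forecasted extreme values, which is why the times of the extremes do not need to be supplied by the weather service.
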
To overcome these problems, smaller networks have been proposed that tackle the simpler problem of forecasting only the next-hour temperature. Lanza and Cosme (2001) used a radial basis functions neural network to forecast the temperature on hour (h) from only three inputs including the temperature at hour (h−1) and two hour indices of the sine/cosine form. With the fewer inputs, smaller databases can be used to train the network, and in this case a sliding window of 28 days was found sufficient. In many situations, however, full forecasts of the 24 next-day temperatures are required. This may be achieved through iterative use of the same next-hour forecaster, with the output forecasted for hour (h) being recycled as input for forecasting hour (h+1). This approach, however, may lead to the neural network forecaster behaving chaotically (Hippert et al., 2000). To reduce this risk, these latter authors feed the neural network instead with crude forecasts estimated using an autoregressive (AR) model. Tassadduq et al. (2002) describe a back propagation neural network that uses only the temperature at a given hour to forecast the temperature at the same hour of the following day.
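
The iterative use of a next-hour forecaster described above can be sketched as follows. This is only an illustration under stated assumptions: a linear least-squares model stands in for the radial basis function network of Lanza and Cosme (2001), the inputs are the previous-hour temperature plus sine/cosine hour indices, and the data and function names are hypothetical.

```python
import numpy as np

def hour_features(prev_temp, hour):
    """Inputs for hour `hour`: bias, temperature at the previous hour,
    and sine/cosine hour indices."""
    w = 2.0 * np.pi * hour / 24.0
    return np.array([1.0, prev_temp, np.sin(w), np.cos(w)])

def fit_next_hour_model(temps):
    """Fit a linear next-hour model (a stand-in for a neural network).

    `temps` is an hourly series whose first sample is hour 0 of a day.
    """
    temps = np.asarray(temps, dtype=float)
    X = np.array([hour_features(temps[t - 1], t % 24) for t in range(1, temps.size)])
    coef, *_ = np.linalg.lstsq(X, temps[1:], rcond=None)
    return coef

def forecast_next_day(coef, last_temp, first_hour=0, steps=24):
    """Apply the next-hour model recursively: each forecast is recycled
    as the previous-hour input for the following hour."""
    forecasts = []
    prev = last_temp
    for k in range(steps):
        hour = (first_hour + k) % 24
        prev = float(hour_features(prev, hour) @ coef)
        forecasts.append(prev)
    return np.array(forecasts)

# Synthetic 28-day window of hourly temperatures (not real data).
rng = np.random.default_rng(0)
t = np.arange(24 * 28)
history = 50 + 8 * np.cos(2 * np.pi * (t % 24 - 15) / 24) + rng.normal(0, 1, t.size)

coef = fit_next_hour_model(history)
next_day = forecast_next_day(coef, last_temp=history[-1])
```

Because every step after the first consumes a forecast rather than a measurement, errors can accumulate along the 24-step chain, which is the risk the text raises for recursive use of a next-hour model.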

In general, the neural network approach also suffers from a number of limitations, including difficulty in determining optimum network topology and training parameters (Alves da Silva et al., 2001). There are many choices to be made in determining numerous critical design parameters with little guidance available (Hippert et al., 2001), and designers often resort to trial and error approaches (Charytoniuk and Chen, 2000; Tassadduq et al., 2002) which can be tedious and time consuming. Such design parameters include the number and size of the hidden layers, the type of neuron transfer functions for the various layers, the learning rate and momentum coefficient, and training stopping criteria to avoid over-fitting and ensure adequate generalization with new data. Another limitation is the black box nature of neural network models that give little insight into the modeled relationship and the relative significance of various inputs, thus providing poor explanation facilities (Matsui et al., 2001). The acceptability of, and confidence in, automated forecasting tools in operational environments appear to be related to their transparency and their ability to justify results to human experts (Lewis, III, 2001).

To overcome such limitations, we propose using abductive networks (Montgomery and Drake, 1990) as an alternative machine learning approach to hourly temperature forecasting. We have previously used this approach to model and forecast the monthly domestic energy consumption (Abdel-Aal et al., 1997) and in forecasting the minimum (Abdel-Aal and Elhadidy, 1994) and maximum (Abdel-Aal and Elhadidy, 1995) daily temperatures. The method has also been used by Fulcher and Brown (1994) in predicting temperature distributions at data-deficient sites based on detected similarities with data-rich sites. Compared to neural networks, abductive networks offer the advantages of faster model development requiring little or no user intervention, faster convergence during model synthesis without the problem of getting stuck in local minima, automatic selection of effective input variables, and automatic configuration of the model structure (Alves da Silva et al., 2001). Using the approach on a time series problem led to lower mean square errors and simpler models as compared to back propagation neural networks (Tenorio and Lee, 1989). With the model represented as a hierarchy of polynomial expressions, resulting analytical model relationships can provide insight into the modeled phenomena, highlight contributions of various inputs, and allow comparison with previously used empirical or statistical models. The technique automatically avoids over-fitting by using a proven regularization criterion based on penalizing model complexity (Montgomery and Drake, 1990) without requiring a dedicated validation dataset during training, as is the case with many neural network paradigms.
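
To make the idea of a polynomial building block and automatic input selection more concrete, here is a minimal GMDH-style sketch. It is not the AIM tool's algorithm: the two-input quadratic node, the exhaustive search over input pairs, the penalty of the form CPM·(2K/N)·σ² with the output variance used for σ², and all function names are assumptions chosen for illustration.

```python
import itertools
import numpy as np

def quad_design(x1, x2):
    """Design matrix of a two-input quadratic polynomial node."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

def fit_node(x1, x2, y, cpm=1.0):
    """Fit one node by least squares and score it with a complexity penalty.

    score = fitting squared error + cpm * (2K/N) * var(y)   (assumed form)
    """
    A = quad_design(x1, x2)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fse = np.mean((A @ coef - y) ** 2)
    k, n = A.shape[1], y.size
    return coef, fse + cpm * (2.0 * k / n) * np.var(y)

def best_pair(X, y, cpm=1.0):
    """Try every pair of candidate inputs and keep the lowest-scoring node."""
    scores = {}
    for i, j in itertools.combinations(range(X.shape[1]), 2):
        _, scores[(i, j)] = fit_node(X[:, i], X[:, j], y, cpm)
    return min(scores, key=scores.get), scores

# Synthetic example: the output depends only on inputs 0 and 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = 2.0 + X[:, 0] * X[:, 2] + 0.5 * X[:, 2] ** 2 + rng.normal(0, 0.1, 400)

pair, scores = best_pair(X, y)
```

In this sketch all candidate nodes have the same number of coefficients, so the penalty does not change their ranking; it becomes decisive only when nodes or networks of different sizes are compared, where a larger structure must reduce the fitting error enough to justify its extra terms. The fitted coefficients of the selected node remain available as an explicit polynomial, which is the transparency advantage the text refers to.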

Following a brief description of abductive network modeling in Section 2, the temperature dataset used is described in Section 3. Next-day hourly temperature forecasters, which predict the full 24-h temperature curve for a full day in one step at the end of the preceding day, are described in Section 4. Abductive network models were developed and evaluated both with and without extreme temperature forecasts for the forecasting day. Performance of representative models was compared with that of the corresponding neural network models developed using the same data. Next-hour temperature forecasters that predict the temperature hour-by-hour utilizing all data available up to the forecasting hour are presented in Section 5. Results are also given when such models are used sequentially to forecast the full next-day temperature curve.

Section snippets

AIM abductive networks

Abductory inductive mechanism (AIM) (AbTech, 1990) is a supervised inductive machine-learning tool for automatically synthesizing abductive network models from a database of inputs and outputs representing a training set of solved examples. As a group method of data handling (GMDH) algorithm (Farlow, 1984), the tool can automatically synthesize adequate models that embody the inherent structure of complex and highly nonlinear systems. The automation of model synthesis not only lessens the

The dataset

The dataset used consists of measured hourly temperature data for the Puget power utility, Seattle, USA, as integer values over the period 1 January 1985 to 12 October 1992. The set is made available in the public domain by Professor A. M. El-Sharkawi, University of Washington, Seattle, USA (El-Sharkawi, 2002). We used the data for the first 5 years (1985–1989) for model synthesis and those of the following year (1990) for model evaluation. Data for 5 years were considered sufficient for

Using forecasts for next day extreme temperatures

We have developed 24 models for forecasting the hourly temperatures for the following day (d) in one step at the end of the preceding day (d−1). A model is dedicated for forecasting the temperature, ET (d,h), for each hour of the day. Each of the 24 models was trained using 1825 data records for 5 years (1985–1989) and evaluated on 365 records for the year 1990. Unless specified otherwise, training was performed with the default value CPM=1 for the complexity penalty multiplier. All models use

Using forecasts for next day extreme temperatures

We have developed 24 models for forecasting the temperature at the next hour (h) during day (d) using the full hourly temperature data on day (d−1) (T1,T2,…,T24) together with all available hourly temperatures on day (d) up to the preceding hour (h−1) (NT1,NT2,…,NT(h−1)), in addition to the measured minimum (Tmin) and maximum (Tmax) air temperatures on day (d−1) and the forecasted minimum (ETmin) and maximum (ETmax) air temperatures on day (d), as described in Section 4.1 above. A record

Conclusions

Abductive network machine learning has been demonstrated as an alternative tool for next-day and next-hour hourly temperature forecasting. Models both with and without the requirement for extreme temperature forecasts have been developed. Compared to the neural networks approach, the proposed method simplifies model development, automatically selects effective inputs, gives better insight into the modeled function, and allows comparison with previously used analytical models. Represented as a

Acknowledgements

The author wishes to acknowledge the support of the Research Institute of King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia.

References (33)

  • R.E. Abdel-Aal et al.

    A machine-learning approach to modelling and forecasting the minimum temperature at Dhahran, Saudi Arabia

    Energy—The International Journal

    (1994)
  • R.E. Abdel-Aal et al.

    Modelling and forecasting monthly electric energy consumption in eastern Saudi Arabia using abductive networks

    Energy—The International Journal

    (1997)
  • R.E. Abdel-Aal et al.

    Modeling and forecasting the maximum temperature using abductive machine learning

    Weather and Forecasting

    (1995)
  • AbTech Corporation, Charlottesville, VA, USA, 1990. AIM User's...
  • Alves da Silva, A.P., Rodrigues, U.P., Rocha Reis, A.J., Moulin, L.S., 2001. NeuroDem—a neural network based short term...
  • A.R. Barron

    Predicted squared error: a criterion for automatic model selection. Self-organizing

  • Bogren, J., Gustavsson, T., 1994. A combined statistical and energy balance model for prediction of road surface...
  • Charytoniuk, W., Chen, M.S., 2000. Neural network design for short-term load forecasting. Proceedings of the...
  • El-Sharkawi, A.M., 2002. EE 559: Fundamentals of Intelligent Systems. University of Washington, Seattle, USA....
  • J.Y. Fan et al.

    A real-time implementation of short-term load forecasting for distribution power systems

    IEEE Transactions on Power Systems

    (1994)
  • S.J. Farlow

    The GMDH algorithm

  • Francis, R., 2000. Early Warning System, Commercial Vegetable Notes, Vol. 2, No. 2. Cooperative Extension Service,...
  • G.E. Fulcher et al.

    A polynomial network for predicting temperature distributions

    IEEE Transactions on Neural Networks

    (1994)
  • Hippert, H.S., Pedreira, C.E., Souza, R.C., 2000. Combining neural networks and ARIMA models for hourly temperature...
  • H.S. Hippert et al.

    Neural networks for short-term load forecasting: a review and evaluation

    IEEE Transactions on Power Systems

    (2001)
  • Hwang, R.-C., Huang, H.-C., Chen, Y.-J., Hsieh, J.-G., 1998. Power load forecasting by neural network with a new...