Elsevier

Applied Soft Computing

Volume 82, September 2019, 105566

A hybrid method for crude oil price direction forecasting using multiple timeframes dynamic time wrapping and genetic algorithm

https://doi.org/10.1016/j.asoc.2019.105566

Highlights

  • A novel hybrid method combining dynamic time warping and a genetic algorithm is proposed.

  • Return predictions from multiple timeframes are considered jointly.

  • A sophisticated trading strategy is designed for simulated trading.

  • The proposed method achieved significantly higher forecasting accuracy and return.

Abstract

This study proposes a hybrid method based on similarity measurement of time series from multiple timeframes to predict direction changes of the crude oil price and to execute simulated trading. Because different representations of the same data source carry complementary information, weekly data are used in addition to daily data. First, the method uses Multiple Dynamic Time Warping (MDTW) to collect similar time series from the daily and weekly data, together with their direction changes and returns one week later. Next, it calculates a comprehensive expected return from the expected returns of the two timeframes and their weights. Then, it predicts the direction change of the current time series one week ahead and executes simulated trading based on the prediction. Lastly, it uses a genetic algorithm to optimize several model parameters of the trading strategy. Experimental results show that the proposed method achieves excellent performance in terms of hit ratio, accumulated return and Sharpe ratio, and that the results are significantly superior to those of the benchmark methods. The proposed method can provide useful advice for investors, energy-related enterprises, and government officials engaged in policy decisions.

Introduction

With the rapid development of the world economy, crude oil has become a key resource for economic development worldwide. It is also closely tied to politics, military affairs, diplomacy and other national security matters in most countries. In recent decades, heavy energy consumption, especially in developing countries, has increasingly stimulated the demand for crude oil in many countries and fields. As a result, the scarcity of crude oil resources has become more serious, and fluctuations in crude oil prices have attracted growing attention from governments, energy-related enterprises, and individual and institutional traders. However, crude oil prices are strongly influenced by a combination of factors such as the international political situation, regional military conflicts, crude oil production, natural disasters and religious events. Therefore, it is extremely difficult to analyze or predict direction changes of crude oil prices.

The two most influential crude oil markets in the world are currently the West Texas Intermediate (WTI) crude oil market and the Brent crude oil market. WTI, traded on the New York Mercantile Exchange (NYMEX), is the crude oil price benchmark commonly used in the U.S. and other markets in the western hemisphere. Brent crude oil, traded on the International Petroleum Exchange (IPE) and produced in the North Sea off the UK, is the benchmark for crude oil prices in Europe, the Mediterranean and West Africa. Because WTI and Brent prices are the most influential in the world, the analysis and forecasting of their price direction have drawn considerable attention from researchers worldwide.

For crude oil price analysis, one approach generally accepted by researchers in related fields is technical analysis [1]. It forecasts the market prices of securities or commodities from past prices or trading volumes, on the assumption that history will repeat itself. In the last few decades, with the development of time series analysis and machine learning methods, numerous researchers have estimated and predicted the movements of crude oil prices from historical time series data. Many have used time series models such as ARIMA (Autoregressive Integrated Moving Average), a well-known time series prediction method proposed by Box and Jenkins in the 1970s [2]. For example, Yusof et al. applied an ARIMA-based method to predict monthly prices of crude oil produced in Malaysia [3]; Etuk developed an ARIMA-based method to identify the direction of monthly crude oil prices in Nigeria [4]. However, ARIMA-based methods produce satisfactory results only when the data sequence is linear or approximately linear, and may not be appropriate for forecasting future fluctuations of a nonlinear time series [2] such as crude oil prices, which are marked by nonlinearity and irregularity.

Because models based on a linearity assumption, such as ARIMA, may not suit nonlinear time series such as crude oil prices, some researchers have applied nonlinear models such as the support vector machine (SVM) to predict crude oil price movements. SVM, proposed by Vapnik [5], is based on the structural risk minimization principle; it models nonlinear functional relationships and thereby reduces the difficulty of solving problems in high-dimensional spaces. It offers a new approach to nonlinear combination forecasting, and researchers in many fields have applied SVM-based methods to prediction problems. In other fields, for instance, SVM has been applied to performance prediction of a spark ignition engine [6], weight factor calculation [7], parameter identification [8], and product quality measurement [9]. For crude oil price prediction, Qi and Zhang developed an SVM-based method [10]; Chiroma et al. proposed an SVM-based method for predicting the monthly price of WTI crude oil [11]; Zhou and Xu predicted crude oil prices from January to June 2011 with an SVM-based approach and obtained better results than several conventional methods [12]; Yu et al. also applied an SVM regression-based method to crude oil price forecasting [13]. Although these experimental results suggest that SVM may provide a better solution for forecasting crude oil price movements, it has some limitations. For instance, the high-dimensional mapping performed by its kernel functions is hard to interpret. In addition, it can be easily misled when forecasting the movements of financial time series; for example, its predictions may simply follow the preceding trend [14].

In addition to SVM, the neural network (NN) is an efficient nonlinear tool for predicting financial time series. An NN is a mathematical model inspired by the behavior of animal neural networks. It adjusts the connections among a large number of internal nodes in order to solve classification or regression problems. NN-based methods have been widely applied in many fields; for instance, an NN-based method has been applied to heat transfer performance prediction [15]. For crude oil price analysis, Jiliang et al. applied an NN-based method to crude oil price time series, and their experimental results showed high prediction accuracy [16]; Latif and Herawati proposed a prediction method based on a combination of NNs for crude oil price forecasting [17]; Suriya built an NN-based method to predict Brent crude oil prices [18]; Chiroma et al. proposed an evolutionary NN to predict WTI crude oil prices [19]; Huang and Wang applied an NN-based method to global crude oil price prediction [20]. Although NNs have been observed to provide better solutions for predicting nonlinear crude oil price movements, they have noticeable limitations. For instance, they are prone to overfitting and often get trapped in local minima, which leads to inaccurate predictions.

Dynamic Time Warping (DTW) is another well-known method widely used for time series analysis. DTW maps one time series onto another in order to measure the similarity between the two series and to find the minimum-distance mapping between them. DTW was first applied in speech recognition to determine whether two waveforms represent the same spoken phrase. It has improved speech recognition rates to some extent, for example in emotional speech recognition [21], speech identification in adverse conditions [22], recognition rate improvement and real-time control [23], and isolated word detection [24]. The DTW algorithm stretches and compresses two sequences along the time axis so that they reach their best degree of matching, or similarity. In finance, some researchers have also applied the DTW algorithm to match similar financial time series and reported strong prediction performance. For instance, Wang et al. used DTW to measure the similarity of time series for predicting the movement directions of exchange rates [25]; Chang et al. applied DTW to stock trading [26]; Lee and Jeong developed a DTW-based method for pattern recognition in futures markets [27]. Although DTW has shown great potential for financial time series prediction, little research has studied its application to predicting crude oil price movements. Therefore, applying DTW to analyze and predict crude oil price movements and returns holds great promise.
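The stretching and compressing described above can be illustrated with the textbook dynamic programming formulation of DTW. The following is a minimal sketch and not the implementation used in this study: the function name `dtw_distance`, the absolute-difference local cost, and the absence of any warping window are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) dynamic time warping distance.

    Assumptions of this sketch: absolute difference as the local cost
    and no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # b[j-1] is matched to several points of a
                cost[i][j - 1],      # a[i-1] is matched to several points of b
                cost[i - 1][j - 1],  # both series advance by one point
            )
    return cost[n][m]


# Toy series (not crude oil data): y is x with its first point repeated,
# so the two series can be aligned without any residual cost.
x = [0.0, 1.0, 2.0, 1.0, 0.0]
y = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
shifted_distance = dtw_distance(x, y)
```

Because the two toy series differ only by a local stretch along the time axis, DTW treats them as identical, whereas a point-by-point comparison is not even defined for series of unequal length.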

To forecast crude oil price movements or returns, researchers have often used historical data from only a single timeframe, for example using only daily data to make predictions and conduct simulated trading. For instance, Shin et al. used daily crude oil data to predict the rise and fall of crude oil prices one month later [28]; Chiroma et al. used daily WTI crude oil data to predict daily fluctuations in crude oil prices [19]; Suriya used daily crude oil data to predict changes in crude oil prices one day ahead [18]. In real crude oil trading, however, traders usually observe the movements of crude oil time series on multiple timeframes (daily, weekly, or monthly data), and direction predictions from different timeframes may sometimes contradict each other. For instance, at some time points the direction predicted from the daily timeframe is upward while the one predicted from the weekly timeframe is downward. Therefore, in this study, the direction predictions and the predicted returns from multiple timeframes are considered jointly: if the return predictions from multiple timeframes point in the same direction, the prediction can be trusted with more confidence. Otherwise, if they point in opposite directions, the magnitude and importance of the predictions from the different timeframes are weighed together, which is expected to improve prediction accuracy. Hence, a sophisticated trading strategy is proposed in addition to DTW-based return prediction.

Recently, many researchers have proposed hybrid methods to improve the accuracy of crude oil price prediction over that of individual approaches. The objective of a hybrid model is to improve prediction or trading performance by combining the advantages of several methods. For instance, Wang et al. proposed a novel hybrid method combining a data fluctuation network with several artificial intelligence algorithms for crude oil price forecasting [29]; Ding forecast crude oil prices with a novel hybrid method of ensemble empirical mode decomposition and a neural network [30]; Chen et al. combined random walk with a gray wave model for multi-step-ahead crude oil price forecasting [31]. Hybrid models thus offer a new way to improve crude oil direction prediction and trading performance. In this study, a hybrid method combining DTW and a genetic algorithm (GA) is proposed for direction forecasting and trading simulation. The proposed method uses DTW to match similar crude oil time series, and uses GA to integrate the prediction results from multiple timeframes and to optimize the trading strategy. First, the proposed method finds the most similar sequences in the daily data and in the weekly data, respectively, and then computes, separately for each timeframe, the average return of the matched sequences one week later. Then, it assigns weights to the daily and weekly average return predictions and calculates their weighted average, which is the comprehensive expected return of the daily and weekly predictions. Finally, it makes trading decisions by comparing the predicted return with a long threshold and a short threshold. If the predicted return is larger than the long threshold, a “rise” forecast is made and a “long” transaction is performed. Conversely, if the predicted return is lower than the short threshold, a “fall” forecast is made and a “short” transaction is performed. If the predicted return lies between the two thresholds, the forecast is considered too unreliable for traders, and no direction prediction or trade is made. The parameters mentioned above, namely (1) the daily and weekly DTW return prediction weights, (2) the long and short thresholds, and (3) the numbers of top similar sequences selected from the daily and weekly DTW results, are all optimized by GA.
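The decision rule described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the names `expected_return` and `decide` are hypothetical, the two weights are assumed to sum to one (so only `w_daily` is passed), and the match lists contain made-up numbers rather than crude oil data.

```python
def expected_return(matches, top_k):
    """Average one-week-ahead return of the top_k most similar sequences.

    `matches` is a list of (dtw_distance, return_one_week_later) pairs;
    a smaller distance means a more similar historical sequence.
    """
    best = sorted(matches, key=lambda m: m[0])[:top_k]
    return sum(r for _, r in best) / len(best)


def decide(daily_matches, weekly_matches, w_daily, k_daily, k_weekly,
           long_threshold, short_threshold):
    """Return (action, combined expected return).

    Assumption: the weekly weight is 1 - w_daily.
    """
    r_daily = expected_return(daily_matches, k_daily)
    r_weekly = expected_return(weekly_matches, k_weekly)
    combined = w_daily * r_daily + (1.0 - w_daily) * r_weekly
    if combined > long_threshold:
        return "long", combined    # "rise" forecast
    if combined < short_threshold:
        return "short", combined   # "fall" forecast
    return "hold", combined        # between thresholds: no forecast, no trade


# Made-up matches: (DTW distance, return one week later).
daily = [(0.2, 0.03), (0.5, 0.01), (0.9, -0.04)]
weekly = [(0.1, 0.02), (0.4, -0.01), (0.8, 0.05)]
action, combined = decide(daily, weekly, w_daily=0.6, k_daily=2, k_weekly=2,
                          long_threshold=0.01, short_threshold=-0.01)
```

With these toy inputs the daily and weekly expected returns are 0.02 and 0.005, so the weighted return of roughly 0.014 exceeds the long threshold and the rule goes long; widening the thresholds to ±0.02 would instead produce no trade.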

This study makes the following four contributions: (1) in addition to matching similar time series from daily movements, similar series from weekly movements are also used as return predictions for making a comprehensive judgment of the future movements and returns of crude oil prices; (2) a multiple-timeframe weighting method (rather than simple averaging) and a threshold method (rather than forecasting and trading at every time point) are used to design a more sophisticated trading strategy; (3) the proposed method matches several similar sequences from historical data rather than only the most similar one, since ranking and selecting several similar time series reduces the role of chance in similar time series matching; (4) the multiple-timeframe weights, the numbers of most similar time series taken from the daily and weekly DTW results, and the trading thresholds are not set by expert experience but derived by GA from historical data.
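To make contribution (4) concrete, the following sketch shows how a simple GA could search for a daily weight and the two thresholds on historical samples. It is only an illustration under several assumptions: the GA operators (tournament selection, uniform crossover, Gaussian mutation, elitism), the fitness function (accumulated return), the sign constraints on the thresholds, and the synthetic samples are all choices made here, and the numbers of top similar sequences are left out of the chromosome for brevity.

```python
import random


def fitness(chrom, samples):
    """Accumulated return of the threshold strategy on `samples`.

    chrom = (w_daily, long_threshold, short_threshold);
    samples = [(daily_prediction, weekly_prediction, realized_return), ...]
    """
    w, long_thr, short_thr = chrom
    total = 0.0
    for r_daily, r_weekly, realized in samples:
        combined = w * r_daily + (1.0 - w) * r_weekly
        if combined > long_thr:
            total += realized    # a long position earns the realized return
        elif combined < short_thr:
            total -= realized    # a short position earns its negative
    return total


def clip(chrom):
    """Assumed constraints: weight in [0, 1], long >= 0, short <= 0."""
    w, long_thr, short_thr = chrom
    return (min(1.0, max(0.0, w)), max(0.0, long_thr), min(0.0, short_thr))


def run_ga(samples, pop_size=30, generations=40, mutation_sd=0.01, seed=0):
    rng = random.Random(seed)
    pop = [clip((rng.random(), rng.uniform(0, 0.05), rng.uniform(-0.05, 0)))
           for _ in range(pop_size)]
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, samples), reverse=True)
        history.append(fitness(pop[0], samples))
        next_pop = pop[:2]  # elitism: carry over the two best chromosomes
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(pop, 3), key=lambda c: fitness(c, samples))
            p2 = max(rng.sample(pop, 3), key=lambda c: fitness(c, samples))
            # Uniform crossover followed by Gaussian mutation.
            child = tuple(rng.choice(pair) for pair in zip(p1, p2))
            child = tuple(g + rng.gauss(0.0, mutation_sd) for g in child)
            next_pop.append(clip(child))
        pop = next_pop
    best = max(pop, key=lambda c: fitness(c, samples))
    history.append(fitness(best, samples))
    return best, history


# Synthetic samples for demonstration only: noisy predictions of a
# random realized return. They are not crude oil data.
demo_rng = random.Random(1)
demo_samples = []
for _ in range(200):
    realized = demo_rng.gauss(0.0, 0.03)
    demo_samples.append((realized + demo_rng.gauss(0.0, 0.03),
                         realized + demo_rng.gauss(0.0, 0.03),
                         realized))
best_params, best_history = run_ga(demo_samples)
```

Because the two best chromosomes are copied unchanged into each new generation, the best fitness recorded in `best_history` never decreases. In practice the fitness would be evaluated on a training period only, so that the selected parameters can be checked on later, unseen data.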

The remainder of this paper is organized as follows: Section 2 describes the methods used in this study, including DTW and GA; Section 3 describes the structure of the proposed method; Section 4 explains the experimental design in detail; Section 5 reports and discusses the experimental results. Finally, Section 6 summarizes this study and suggests directions for future work.

Section snippets

Dynamic time warping

A distance measure is needed to determine the similarity between time series and to match them. In this research, the similarity of time series from multiple timeframes is calculated to predict the returns of crude oil in the near future (one week later). DTW is a technique that finds the optimal alignment between two time series when one of them may be “warped” non-linearly by stretching or shrinking it along its time axis. DTW is applied to construct the model to match the

Structure of proposed method

The proposed method consists mainly of four components:

  • Data Pre-processing (DP) component

  • Multiple Timeframes DTW prediction (MT-DTW) component

  • GA Parameters Optimization (GA-PO) component

  • Prediction, Trading and Evaluation (PTE) component

The structure of the proposed method is shown in Fig. 3, and the details of its working procedures are explained in Section 3.2.

(1) DP component: This component is for data pre-processing. WTI and Brent’s crude oil market data are

Experiment design

In the experimental design, spot price time series from the two major crude oil markets, WTI and Brent, are used for empirical analysis.

The daily WTI crude oil spot price data range from January 2, 1986 to June 29, 2018, and the daily Brent crude oil spot price data range from May 20, 1987 to June 29, 2018. Both datasets were downloaded from the U.S. Energy Information Administration website [34]. The overall dataset is divided into three subsets: (1) DTW matching

Hit ratio results

To verify the accuracy improvement of the proposed DTW-GA-DW method for crude oil price direction forecasting, the SVM-D, NN-D, DTW-D, DTW-W, and DTW-D+W methods are selected for comparison, with a one-week horizon for direction forecasting. Hit ratio results for the Brent and WTI crude oil forecasts are shown in Tables 5 and 6.
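For orientation, the three evaluation measures named in this paper (hit ratio, accumulated return and Sharpe ratio) can be computed as in the sketch below. The exact definitions used in the experiments are not given in this excerpt, so the ones here are assumptions: periods without a forecast are excluded from the hit ratio, returns are summed without compounding or transaction costs, and the Sharpe ratio uses a zero risk-free rate with no annualization.

```python
from statistics import mean, stdev


def hit_ratio(predicted, realized):
    """Share of correct direction calls among periods with a call.

    predicted: +1 (rise), -1 (fall) or 0 (no forecast); realized: returns.
    """
    calls = [(p, r) for p, r in zip(predicted, realized) if p != 0]
    if not calls:
        return float("nan")
    hits = sum(1 for p, r in calls if p * r > 0)
    return hits / len(calls)


def accumulated_return(predicted, realized):
    """Sum of per-period strategy returns (no compounding, no costs)."""
    return sum(p * r for p, r in zip(predicted, realized))


def sharpe_ratio(predicted, realized):
    """Mean over sample standard deviation of per-period strategy returns."""
    strategy = [p * r for p, r in zip(predicted, realized)]
    return mean(strategy) / stdev(strategy)


# Made-up forecasts and weekly returns, for illustration only.
predicted = [1, -1, 0, 1]
realized = [0.02, -0.01, 0.03, -0.02]
demo_hit_ratio = hit_ratio(predicted, realized)
demo_accumulated = accumulated_return(predicted, realized)
demo_sharpe = sharpe_ratio(predicted, realized)
```

In the toy data, two of the three direction calls are correct and the third period carries no forecast, so it affects neither the hit ratio nor the accumulated return.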

The hit ratio results in Tables 5 and 6 show, firstly, that for both the WTI and Brent crude oil markets, the hit ratio results of the proposed

Conclusion

In this research, a hybrid method based on multiple-timeframe dynamic time warping and GA is developed for direction forecasting and simulated trading. The proposed method consists mainly of four components: (1) Data Pre-processing component; (2) Multiple timeframes DTW prediction component; (3) GA parameters optimization component; (4) Prediction, trading and evaluation component. The experimental results show that the proposed method achieved an average hit ratio of

Acknowledgments

This work was partially funded by the Hubei Ministry of Education (Q20171208), the Science Foundation of China Three Gorges University (No. KJ2016A001), and the Starting Grant of China Three Gorges University (No. 20170907).

Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts that may be perceived as a competing interest in relation to this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2019.105566.

References (38)

  • Murphy J.J., Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications (1999)

  • Box G.E.P. et al., Time series analysis: forecasting and control, 4th edition, J. Mark. Res. (2008)

  • Yusof N.M. et al., Malaysia crude oil production estimation: an application of ARIMA model

  • Etuk E.H., Seasonal ARIMA modelling of Nigerian monthly crude oil prices, Asian. Econ. Financ. Rev. (2013)

  • Vapnik V., The Nature of Statistical Learning Theory (1995)

  • Zuo Q.S. et al., Prediction of the performance and emissions of a spark ignition engine fueled with butanol gasoline blends based on support vector regression, Environ. Prog. Sustain. (2018)

  • E J.Q. et al., Parameter-identification investigations on the hysteretic Preisach model improved by the fuzzy least square support vector machine based on adaptive variable chaos immune algorithm, J. Low Freq. Noise Vib. A (2017)

  • Zuo H.Y. et al., Identification on rock and soil parameters for vibration drilling rock in metal mine based on fuzzy least square support vector machine, J. Cent. South. Univ. (2014)

  • Wang T.S. et al., Fuzzy least squares support vector machines soft measurement model based on adaptive mutative scale chaos immune algorithm, J. Cent. South. Univ. (2014)