Abstract
This paper investigates three promising types of recurrent neural networks (RNNs), namely the long short-term memory (LSTM) network, the bidirectional LSTM (BiLSTM), and the gated recurrent unit (GRU), applied to the task of copper price prediction. These RNNs are deployed with suitable data slicing and augmentation procedures to extract useful information from both domestic and international copper-related market indices for forecasting out-of-sample copper price movements. The RNN models are then empirically tested under various input window lengths, with a memory-free artificial neural network (ANN) serving as the benchmark. The results show that the RNN models with memory units outperform the memory-free ANN, but a longer input window does not necessarily yield better prediction performance. To further optimize the predictions, the RNN models with relatively low forecasting errors are combined through ensemble averaging approaches (simple averaging, AVG, or ordinary least squares weighting, OLS, with the former being simpler) to integrate the different model forecasts. Empirical findings show that the best ensemble prediction model is formed by combining just two RNNs (LSTM and BiLSTM) with a shorter input window, using the simpler AVG approach. Our results therefore suggest that, in a sense, a "simpler is better" philosophy applies to the deployment of RNN models for copper price prediction.
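The two simplest ingredients described above can be sketched in a few lines: slicing a price series into fixed-length input windows, and the AVG approach of averaging two models' forecasts. This is a minimal illustrative sketch, not the authors' implementation; the function names, prices, and stand-in forecast values are all hypothetical.

```python
# Hypothetical sketch of two ingredients from the abstract:
# (1) slicing a price series into fixed-length input windows, and
# (2) AVG ensemble averaging of two model forecasts.
# All names and numbers are illustrative, not taken from the paper.

def make_windows(series, window_len):
    """Slice a series into (input_window, next_value) training pairs."""
    pairs = []
    for i in range(len(series) - window_len):
        pairs.append((series[i:i + window_len], series[i + window_len]))
    return pairs

def avg_ensemble(forecasts_a, forecasts_b):
    """Simple AVG combination: element-wise mean of two forecast lists."""
    return [(a + b) / 2 for a, b in zip(forecasts_a, forecasts_b)]

prices = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0]
windows = make_windows(prices, window_len=3)
print(len(windows))            # 3 (input, target) pairs

lstm_like = [102.0, 103.0]     # stand-in forecasts from an "LSTM"
bilstm_like = [102.8, 103.6]   # stand-in forecasts from a "BiLSTM"
print(avg_ensemble(lstm_like, bilstm_like))  # [102.4, 103.3]
```

The OLS alternative would instead regress realized prices on the individual model forecasts and use the fitted coefficients as combination weights; the paper's finding is that the unweighted average above already performs best.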
Data availability
Enquiries about data availability should be directed to the authors.
Funding
We would like to thank the Editor and the anonymous referees for their insightful comments and suggestions during the revision of this paper. Any remaining errors are our own. This research was supported by the National Social Science Fund of China (No. 19ZDA074).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Cite this article
Ni, J., Xu, Y., Li, Z. et al. Copper price movement prediction using recurrent neural networks and ensemble averaging. Soft Comput 26, 8145–8161 (2022). https://doi.org/10.1007/s00500-022-07201-w