Abstract
Financial institutions, investors, mining companies and related firms need an effective and accurate forecasting model of gold price fluctuations in order to make sound decisions. This paper proposes an innovative approach that both forecasts gold price movements accurately and interprets the resulting predictions. First, it compares six machine learning models, including two recent methods: eXtreme Gradient Boosting (XGBoost) and CatBoost. The empirical findings indicate that XGBoost outperforms the other advanced machine learning models. Second, it uses Shapley additive explanations (SHAP) to help policy makers interpret the predictions of complex machine learning models and to examine the importance of the various features that affect gold prices. Our results illustrate that combining XGBoost with the SHAP approach could significantly improve gold price forecasting performance.
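To make the idea of a Shapley additive explanation concrete, the following minimal sketch computes exact Shapley values for a single prediction of a toy pricing function. It is an illustration under stated assumptions, not the authors' implementation: the feature names (`oil`, `usd_index`, `silver`), the coefficients of `toy_model`, and the baseline values are all hypothetical, and the brute-force enumeration of coalitions stands in for the faster tree-specific algorithms that SHAP tooling typically applies to models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one prediction.

    Features outside a coalition are set to their baseline value,
    which is one common (assumed) way to represent a "missing" feature.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {
                    m: (x[m] if m in coalition else baseline[m]) for m in names
                }
                with_feature = dict(without)
                with_feature[name] = x[name]
                # Marginal contribution of `name` to this coalition
                total += weight * (predict(with_feature) - predict(without))
        phi[name] = total
    return phi


def toy_model(f):
    # Hypothetical stand-in for a fitted model; coefficients are invented.
    return (
        1200.0
        + 5.0 * f["oil"]
        - 8.0 * f["usd_index"]
        + 0.1 * f["oil"] * f["silver"]
    )


# Hypothetical reference point and the observation to explain
baseline = {"oil": 50.0, "usd_index": 95.0, "silver": 16.0}
x = {"oil": 60.0, "usd_index": 90.0, "silver": 18.0}

phi = shapley_values(toy_model, x, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
```

In this toy setting the attributions come out at roughly +67 for `oil`, +40 for `usd_index` and +11 for `silver`, and they sum to the gap between the prediction for `x` and the prediction at the baseline (about 118). That additivity is what allows feature contributions to be read as shares of a forecast, and the oil and silver values show how an interaction term is split between the two features involved.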
Notes
More detailed features of these algorithms will be presented in the methodology section.
We note that the terms variable and feature are semantically identical: the former tends to be used in statistics and the latter in computer science.
Cite this article
Jabeur, S.B., Mefteh-Wali, S. & Viviani, JL. Forecasting gold price with the XGBoost algorithm and SHAP interaction values. Ann Oper Res 334, 679–699 (2024). https://doi.org/10.1007/s10479-021-04187-w
DOI: https://doi.org/10.1007/s10479-021-04187-w