In recent years, deep learning has attracted substantial attention because of its outstanding forecasting performance. However, deep learning methods have rarely been applied to the problem of forecasting tourist arrivals. Governments and tourism enterprises need accurate forecasts of tourist arrivals to allocate tourism resources efficiently. In this study, a new hybrid deep learning approach is developed for tourist arrival forecasting. Random forest is used to reduce the dimensionality of the search query index data by selecting a small subset of informative features that are most closely related to tourist arrivals. A differential evolution algorithm is designed to choose the lag lengths of each search query index and of the historical tourist arrival data when reconstructing the forecasting input. Long short-term memory (LSTM) is used to model the nonlinear relationship between tourist arrivals and search query index data. Two comparative examples, namely, Beijing City and Jiuzhaigou Valley, are used to verify the forecasting accuracy of the proposed deep learning method. The results indicate that the proposed deep learning method outperforms some time series and machine learning methods.
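The input reconstruction step described in the abstract can be illustrated with a minimal sketch. It assumes that each series is an aligned list of observations and that the lag lengths are already known. The function name `build_lagged_inputs`, the keyword name `hotel`, and all numbers are hypothetical and do not come from the study.

```python
from typing import Dict, List, Tuple


def build_lagged_inputs(
    target: List[float],
    predictors: Dict[str, List[float]],
    target_lag: int,
    predictor_lags: Dict[str, int],
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) rows for one-step-ahead forecasting.

    The row for time t holds the previous `target_lag` arrivals followed by
    the previous `predictor_lags[name]` values of each search index, all
    observed strictly before t. Names are visited in sorted order.
    """
    max_lag = max([target_lag] + list(predictor_lags.values()))
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(max_lag, len(target)):
        row = list(target[t - target_lag:t])
        for name in sorted(predictor_lags):
            lag = predictor_lags[name]
            row.extend(predictors[name][t - lag:t])
        X.append(row)
        y.append(target[t])
    return X, y


# Toy data (illustrative only): six periods of arrivals and one search index.
arrivals = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0]
indices = {"hotel": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}
X, y = build_lagged_inputs(arrivals, indices, target_lag=2,
                           predictor_lags={"hotel": 3})
print(X[0], y[0])  # first row uses arrivals at t=1,2 and index at t=0,1,2
```

Each row of `X` could then be fed to a sequence model such as an LSTM. How the study reshapes these inputs for its network is not stated in the abstract.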
This study was funded by the Humanities and Social Sciences Foundation of the Chinese Ministry of Education, China (No. 18YJA630005), the National Natural Science Foundation of China (No. 71771095), and the Fundamental Research Funds for the Central Universities (HUST: 2019kfyRCPY038).
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Ethical Approval
This article does not contain any studies that used human participants or animals.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
• A deep learning method named RF–DE–LSTM is designed to predict tourist arrivals.
• Random forest is used to effectively measure the importance of each keyword.
• DE helps to select a suitable lag length of network input data.
• RF–DE–LSTM outperforms the existing best method for two comparative cases.
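The lag-selection idea in the highlights can be sketched with a basic DE/rand/1/bin loop over integer lag lengths. This is a minimal illustration, not the authors' implementation. The function `de_select_lags`, its default parameters, and the objective `toy_cost` are assumptions, and `toy_cost` stands in for the validation error that a trained forecasting model would supply.

```python
import random
from typing import Callable, List, Sequence, Tuple


def de_select_lags(
    cost: Callable[[List[int]], float],
    bounds: Sequence[Tuple[int, int]],
    pop_size: int = 12,      # must be at least 4 for DE/rand/1
    generations: int = 30,
    f: float = 0.6,          # differential weight (assumed value)
    cr: float = 0.9,         # crossover rate (assumed value)
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Search integer lag lengths with differential evolution (DE/rand/1/bin)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def decode(vec: List[float]) -> List[int]:
        # Round each continuous gene to an integer lag and clip it to its bounds.
        return [min(hi, max(lo, int(round(v)))) for v, (lo, hi) in zip(vec, bounds)]

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [cost(decode(p)) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    trial.append(pop[a][j] + f * (pop[b][j] - pop[c][j]))
                else:
                    trial.append(pop[i][j])
            # Greedy selection: keep the trial only if it is no worse.
            trial_fit = cost(decode(trial))
            if trial_fit <= fit[i]:
                pop[i], fit[i] = trial, trial_fit
    best = min(range(pop_size), key=fit.__getitem__)
    return decode(pop[best]), fit[best]


def toy_cost(lags: List[int]) -> float:
    # Hypothetical stand-in for a model's validation error; minimum at (3, 7).
    return float(sum((lag - t) ** 2 for lag, t in zip(lags, (3, 7))))


lag_bounds = [(1, 12), (1, 12)]  # one lag for arrivals, one for a search index
best_lags, best_cost = de_select_lags(toy_cost, lag_bounds)
print(best_lags, best_cost)
```

In practice the cost function would train and validate a forecasting model for each candidate lag vector, which is far more expensive than this toy objective, so the population size and generation count would need to be chosen with that budget in mind.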
Cite this article
Peng, L., Wang, L., Ai, XY. et al. Forecasting Tourist Arrivals via Random Forest and Long Short-term Memory. Cogn Comput 13, 125–138 (2021). https://doi.org/10.1007/s12559-020-09747-z
DOI: https://doi.org/10.1007/s12559-020-09747-z