Forecasting Tourist Arrivals via Random Forest and Long Short-term Memory

Cognitive Computation

Abstract

In recent years, deep learning has attracted substantial attention because of its strong forecasting performance. However, deep learning methods have rarely been applied to forecasting tourist arrivals. Governments and tourism enterprises need accurate predictions of tourist arrivals to allocate tourism resources efficiently. In this study, a new hybrid deep learning approach is developed for tourist arrival forecasting. Random forest is used to reduce the dimensionality of the search query index data by selecting a small subset of informative features that are most closely related to tourist arrivals. A differential evolution algorithm is designed to choose the lag lengths of each search query index and of the historical tourist arrival data, which are then used to reconstruct the forecasting input. Long short-term memory (LSTM) is used to model the nonlinear relationship between tourist arrivals and search query index data. Two comparative examples, namely Beijing City and Jiuzhaigou Valley, are used to verify the forecasting accuracy of the proposed deep learning method. The results indicate that the proposed method outperforms several time series and machine learning methods.
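To make the idea of reconstructing the forecasting input from lagged series concrete, the following is a minimal sketch in plain Python. It assumes one-step-ahead forecasting and a simple concatenation of lagged values; the function name `build_lagged_samples`, the keyword names "hotel" and "weather", and all numbers are hypothetical illustrations and are not taken from the article.

```python
from typing import Dict, List, Tuple


def build_lagged_samples(
    arrivals: List[float],
    indices: Dict[str, List[float]],
    arrival_lag: int,
    index_lags: Dict[str, int],
) -> Tuple[List[List[float]], List[float]]:
    """Build one-step-ahead samples from lagged arrivals and search indices.

    Each input row holds the last `arrival_lag` arrival values followed by
    the last `index_lags[name]` values of every selected search index.
    The target is the arrival value at the current time step.
    """
    # The longest lag decides the first time step with a complete history.
    max_lag = max([arrival_lag, *index_lags.values()])
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(max_lag, len(arrivals)):
        row = list(arrivals[t - arrival_lag:t])
        for name, lag in index_lags.items():
            row.extend(indices[name][t - lag:t])
        X.append(row)
        y.append(arrivals[t])
    return X, y


# Toy data (hypothetical): six periods of arrivals and two search indices.
arrivals = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0]
indices = {
    "hotel": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "weather": [9.0, 8.0, 7.0, 6.0, 5.0, 4.0],
}
X, y = build_lagged_samples(
    arrivals, indices, arrival_lag=2, index_lags={"hotel": 3, "weather": 1}
)
```

In this sketch each search index may use a different lag length, which is the quantity the abstract says the differential evolution algorithm selects. The resulting rows would then be fed to a sequence model such as an LSTM.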

Funding

This study was funded by the Humanities and Social Sciences Foundation of the Chinese Ministry of Education, China (No. 18YJA630005), the National Natural Science Foundation of China (No. 71771095), and the Fundamental Research Funds for the Central Universities (HUST: 2019kfyRCPY038).

Author information

Corresponding author

Correspondence to Yu-Rong Zeng.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies that used human participants or animals.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Highlights

• A deep learning method named RF–DE–LSTM is designed to predict tourist arrivals.

• Random forest is used to effectively measure the importance of each keyword.

• DE helps to select suitable lag lengths for the network input data.

• RF–DE–LSTM outperforms the existing best method for two comparative cases.
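As an illustration of how differential evolution (DE) can search over lag lengths, here is a minimal sketch of the classic DE/rand/1/bin scheme in plain Python. The article does not state which DE variant or settings it uses, so the variant, the parameters (`pop_size`, `f`, `cr`), the lag bounds, and the function name `de_select_lags` are all assumptions. The objective `toy_cost` is a hypothetical stand-in; in a real pipeline it would presumably be the validation error of a model trained with the candidate lags.

```python
import random
from typing import Callable, List, Tuple


def de_select_lags(
    cost: Callable[[List[int]], float],
    n_series: int,
    lag_min: int = 1,
    lag_max: int = 12,
    pop_size: int = 20,
    generations: int = 50,
    f: float = 0.8,
    cr: float = 0.9,
    seed: int = 0,
) -> Tuple[List[int], float, List[float]]:
    """Search integer lag lengths with DE/rand/1/bin (assumed variant)."""
    rng = random.Random(seed)

    def decode(vec: List[float]) -> List[int]:
        # Continuous genes are rounded and clipped to valid integer lags.
        return [min(lag_max, max(lag_min, round(v))) for v in vec]

    pop = [[rng.uniform(lag_min, lag_max) for _ in range(n_series)]
           for _ in range(pop_size)]
    costs = [cost(decode(v)) for v in pop]
    history = [min(costs)]  # best cost after initialization

    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_series)  # gene that always crosses over
            trial = []
            for d in range(n_series):
                if rng.random() < cr or d == j_rand:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(lag_max, max(lag_min, v))
                else:
                    v = pop[i][d]
                trial.append(v)
            # Greedy selection: keep the trial only if it is not worse.
            trial_cost = cost(decode(trial))
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
        history.append(min(costs))

    best = min(range(pop_size), key=costs.__getitem__)
    return decode(pop[best]), costs[best], history


def toy_cost(lags: List[int]) -> float:
    """Hypothetical objective; a real one would be a validation error."""
    return float(sum((lag - target) ** 2 for lag, target in zip(lags, (3, 7))))


best_lags, best_cost, history = de_select_lags(toy_cost, n_series=2)
```

Because selection is greedy, the best cost recorded in `history` never increases from one generation to the next. Evaluating a real forecasting model inside `cost` would be far more expensive than this toy function, so population size and generation count would need to be chosen with that budget in mind.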

About this article

Cite this article

Peng, L., Wang, L., Ai, XY. et al. Forecasting Tourist Arrivals via Random Forest and Long Short-term Memory. Cogn Comput 13, 125–138 (2021). https://doi.org/10.1007/s12559-020-09747-z

  • DOI: https://doi.org/10.1007/s12559-020-09747-z