Inbound tourism demand forecasting framework based on fuzzy time series and advanced optimization algorithm

https://doi.org/10.1016/j.asoc.2020.106320

Highlights

  • Develop a novel hybrid forecasting framework focusing on small sample forecasting.

  • Improve the system’s recognition ability by information optimization technology.

  • Search for the optimal parameters using the atom search optimization algorithm.

  • Identify the regional characteristics of inbound tourism demand based on FCM.

  • Analyze the inbound tourism demand based on out-of-sample forecasting results.

Abstract

The tourism industry has been integrated into the national strategic system in China. Thus, tourism demand forecasting has become a concern for the sustainable development of the tourism industry. Unfortunately, the sample size for tourism in China is always small and satisfies neither the hypothesis tests of an economic model nor the data volume required by a traditional time series model. In this study, a novel hybrid forecasting framework combining fuzzy time series (FTS) and an atom search optimization (ASO) algorithm is proposed for inbound tourism demand forecasting; this framework is particularly suitable for small sample sizes. Specifically, information optimization technology is applied in the FTS to improve the recognition ability of the system and effectively identify small sample information. The ASO algorithm is applied to search for the optimal parameters of the FTS, which further improves forecasting performance. All comparison experiments and tests verify the effectiveness and superiority of our proposed model, which provides excellent forecasting results for tourism demand and a basis for policymakers and managers to plan appropriately for the tourism market.

Introduction

Around the world, the development and promotion of tourism has become a source of personal income and government revenue [1]. In an increasingly competitive environment, it is important for governments and businesses to understand demand trends in the tourism industry; this is fundamental to developing appropriate strategies for resource allocation and business investment to guarantee sustainable development in tourism [2].

Forecasting tourist volume is becoming increasingly important for predicting future economic development. However, due to the complex and evolutionary nature of the tourism market, a tourist demand series tends to be inherently noisy, conditionally non-stationary, and in some cases, deterministically chaotic. Because modeling dynamic, non-stationary demand series is challenging, a system is needed that allows for more accurate forecasts with less noise and complexity [3]. Current forecasting methods for tourism demand can be divided into econometric models and time series models.

Econometric models attempt to use related variables to explain and forecast the dependent variable based on economic theory. Econometric models include static models such as traditional regression models [4], gravity models [5], linear almost ideal demand systems [6], [7], and dynamic econometric models such as vector autoregressive (VAR) models [8], [9], time varying parameter models [10], and error correction models [11].

Time series models are popular in recent research because they are based only on historical tourism demand data [12]. Time series models can be divided into linear and nonlinear models. The most popular linear methods that have been successfully applied in practice are the exponential smoothing (ES) model [13], [14] and autoregressive integrated moving average (ARIMA) models [15], [16]. Linear models have the important advantage of being easy to implement and interpret. However, when linear models perform poorly in both in-sample fitting and out-of-sample forecasting, more complex nonlinear models should be considered. Nonlinear time series models, such as artificial neural networks [17], [18], [19] and support vector regression [20], [21], [22], capture the nonlinear relationship between the historical series and the predicted indicator. It is widely believed that nonlinear methods are superior to linear methods for efficient and prudent decision-making when applied in modeling economic behavior.

Several combined models have also been proposed to improve forecasting accuracy [23]. Andrawis et al. [24] combined forecasts of different time lengths for tourism demand forecasting. Coshall [25] combined volatility and smoothing forecasts of UK tourism demand. Chen [26] combined linear and nonlinear models for tourism demand forecasting. Song et al. [27] combined statistical and judgmental forecasts to forecast demand for Hong Kong tourism. These studies indicate that combined forecasting models can achieve better performance than individual models.

Econometric models always rely on economic theory. A regression equation needs to pass a series of hypothesis tests, which is difficult to achieve in actual applications. Time series models always require sufficient historical data to explore the rules and patterns in the data. Linear time series models extrapolate poorly and have a narrow forecasting scale; nonlinear time series models are unstable and highly dependent on the data.

In practice, the sample size for tourism is always small, making it difficult to test the hypotheses. Thus, a traditional time series model may not be appropriate or accurate. Moreover, the data usually contain uncertainty and fuzziness due to limitations of resources and statistical techniques or inherent dataset characteristics. If a traditional model is applied to explain the stability and trend of such a time series, the model may incur causal judgment deviation and forecasting error, or increase the error between forecasting results and actual values.

Accordingly, fuzzy theory provides a powerful tool for handling uncertainty and incomplete or limited data. By combining fuzzy theory with fuzzy mathematics methods, scholars introduced the concept of fuzzy time series, which has been applied to time series forecasting in several fields [28]. Song and Chissom [29] proposed a fuzzy time series model for forecasting university enrollment. Yu [30] proposed weighted fuzzy time series models for Taiwan Stock Index forecasting. Li et al. [31] applied fuzzy time series to air quality forecasting with large fluctuations in the concentration of pollutants. Stefanakos [32] applied the fuzzy time series method to forecast non-stationary wind and wave data. There have also been advances that improve the accuracy of fuzzy time series methods. Cheng et al. [33] added the adaptive expectation model to fuzzy time series forecasting to improve forecasting performance for the Taiwan Stock Index. Sadaei et al. [34] combined fuzzy time series and a convolutional neural network to control over-fitting in short-term load forecasting. Dincer [35] applied the fuzzy k-medoid clustering algorithm to address outliers and abnormal observations in a fuzzy time series for air pollution forecasting.

In a fuzzy time series forecasting model, partitioning the interval length and establishing the fuzzy logic relationship are two main phases that can significantly affect forecasting performance. Huang et al. [36] applied particle swarm optimization to adjust the interval lengths in the fuzzy time series for forecasting student enrollment. Yang et al. [37] applied a multi-objective differential evolution algorithm to determine the interval lengths to balance the accuracy and stability for wind speed forecasting. Lu et al. [38] applied an interval information granules technique to optimize the interval partition in a fuzzy time series. Chen et al. [39], [40] used particle swarm optimization techniques to determine the optimal partitions of intervals used for the TAIEX, NTD/USD exchange rates, and the forecasting of enrollment at the University of Alabama. Cheng and Yang [41] applied a rule-based algorithm rather than fuzzy logical relationships to establish forecast rules in a fuzzy time series for stock price forecasting. Rubio et al. [42] proposed a new weighted fuzzy-trend method to assign weights in a fuzzy time series for forecasting stock market indices. Sadaei et al. [43] used a seasonal auto-regressive fractionally integrated moving average and particle swarm optimization to establish the fuzzy logical relationship for seasonal time series forecasting.
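The two phases named above, interval partitioning and fuzzy logical relationships, can be illustrated with a minimal sketch of a classical first-order fuzzy time series with equal-length intervals. It is not the model proposed in this paper: the function names, the equal-width partition, the mean-of-midpoints forecast rule, and the sample series are all assumptions chosen for illustration.

```python
from collections import defaultdict


def make_intervals(lo, hi, n):
    """Split the universe [lo, hi] into n equal-length intervals (left, right)."""
    width = (hi - lo) / n
    return [(lo + i * width, lo + (i + 1) * width) for i in range(n)]


def fuzzify(value, intervals):
    """Return the index i of the fuzzy set A_i whose interval contains value.

    Values outside the universe are clamped to the nearest interval.
    """
    for i, (_, right) in enumerate(intervals):
        if value < right:
            return i
    return len(intervals) - 1


def fit_rules(series, intervals):
    """Group first-order relationships A_i -> A_j by their left-hand side."""
    states = [fuzzify(v, intervals) for v in series]
    groups = defaultdict(set)
    for current, following in zip(states, states[1:]):
        groups[current].add(following)
    return groups


def forecast_next(last_value, intervals, groups):
    """Forecast the mean midpoint of the consequent sets.

    If the current state has no rule, its own midpoint is used.
    """
    state = fuzzify(last_value, intervals)
    targets = sorted(groups.get(state, {state}))
    midpoints = [(intervals[j][0] + intervals[j][1]) / 2 for j in targets]
    return sum(midpoints) / len(midpoints)


# Hypothetical annual series (illustrative numbers, not data from the paper).
series = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0, 20.0, 24.0]
intervals = make_intervals(8.0, 26.0, 6)  # six intervals of length 3.0
groups = fit_rules(series, intervals)
prediction = forecast_next(series[-1], intervals, groups)
print(prediction)
```

Changing the number of intervals changes both the fuzzified states and the rule groups, which is why the studies cited above treat the interval partition as a parameter worth optimizing.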

For tourism demand time series, the sample size is always small, and in some cases the information is insufficient to accurately capture the statistical regularities of the samples. Thus, information optimization technology, including information distribution and information diffusion, is applied within the fuzzy time series method. It discretely distributes and superposes the sample information to uncover the true structure of the overall sample and to extract as much useful information as possible. In most previous studies, the parameters of a fuzzy time series are selected by experience, which can significantly affect model performance in the forecasting process. Thus, in our study the ASO algorithm is applied to search for the optimal parameters of the fuzzy time series. Accordingly, the contributions of our study are as follows:

  • (1) A novel hybrid forecasting model focusing on small sample size for tourism demand forecasting is developed. The sample size of most economic data for tourism is too small to meet the requirements of traditional forecasting models that use large data samples. Thus, in this study, a hybrid forecasting method combining a fuzzy time series with an advanced optimization algorithm is proposed. It is appropriate for small sample forecasting and can provide excellent results for tourism demand forecasting.

  • (2) Information optimization technology including information distribution and diffusion is applied in a fuzzy time series method. Information distribution technology can allocate the information carried by sample points to fill information gaps in small sample sets. Accordingly, forecasting accuracy can be improved. Normal information diffusion can spread the information carried by sample points to multiple monitoring points, which both simplifies the operation and improves the recognition ability of the system, thereby improving forecasting accuracy.

  • (3) An advanced optimization algorithm, ASO, is used to search for the optimal parameters of the fuzzy time series model. In a fuzzy time series, the main parameters are the ambiguity, the fuzzy intervals, and the diffusion coefficient (h), which are selected by experience in most research and play a key role in the forecasting process. The ASO algorithm is applied to determine the optimal values of these parameters, which can significantly improve the forecasting accuracy.

  • (4) Tourism demand is analyzed based on previous tourism demand and out-of-sample forecasting obtained by the optimal model. According to the comprehensive evaluation indexes, tests, and forecasting effectiveness, the optimal model for tourism demand is selected and applied to forecast the annual number of inbound tourists from 2017 to 2020. Combined with historical data and forecasting results, the development trend of tourism demand is analyzed. Trend analysis is a prerequisite for policymakers and managers to enact plans and adopt measures to guarantee the balance of the tourism market.
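The information optimization step in contribution (2) can be sketched as follows. This is a minimal illustration of one-dimensional linear information distribution and normal information diffusion as they are commonly formulated, not the exact formulation used in this paper. The function names, the equally spaced monitoring points, the sample values, and the choice h = 2.0 are all assumptions.

```python
import math


def linear_distribution(samples, points):
    """Linear information distribution onto equally spaced monitoring points.

    Each sample shares one unit of information between its two nearest
    monitoring points, in proportion to how close it lies to each.
    Samples are assumed to lie within the range covered by points.
    """
    step = points[1] - points[0]
    totals = [0.0] * len(points)
    for x in samples:
        for j, u in enumerate(points):
            distance = abs(x - u)
            if distance < step:
                totals[j] += 1.0 - distance / step
    return totals


def normal_diffusion(samples, points, h):
    """Normal information diffusion with diffusion coefficient h.

    Each sample spreads its information over all monitoring points with
    a Gaussian kernel of width h. The kernel constant 1 / (h * sqrt(2 * pi))
    cancels in the per-sample normalization, so it is omitted. The result
    is an estimated probability for each monitoring point (sums to 1).
    Assumes h is not so small that every kernel weight underflows to zero.
    """
    totals = [0.0] * len(points)
    for x in samples:
        weights = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in points]
        norm = sum(weights)
        for j, w in enumerate(weights):
            totals[j] += w / norm
    return [t / len(samples) for t in totals]


# Hypothetical small sample and monitoring points (not data from the paper).
samples = [12.0, 15.0, 19.0]
points = [10.0, 14.0, 18.0, 22.0]

distributed = linear_distribution(samples, points)
diffused = normal_diffusion(samples, points, h=2.0)
print(distributed)
print(diffused)
```

In this sketch a larger h spreads each sample over more monitoring points and smooths the estimate, while a smaller h concentrates it near the observed values, which suggests why h is listed among the parameters to be optimized in contribution (3).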

The rest of the paper is organized as follows: Section 2 presents the clustering process for different provinces. Section 3 introduces the framework of our forecasting model. Section 4 describes and compares the empirical results. Section 5 discusses statistical tests, forecasting effectiveness, reproducibility, and universality of the proposed forecasting framework. In Section 6, out-of-sample forecasting is conducted and analyzed by the forecasting model with the best performance. Conclusions are presented in Section 7. The structure of our forecasting framework is illustrated in Fig. 1.

Section snippets

The clustering of province tourism based on fuzzy c-means clustering algorithm

Due to the diversity of the resource endowment, economic level, geographical location, cultural background, and infrastructure, the development of tourism in different provinces varies significantly. Therefore, to intuitively recognize the characteristics of tourism demand development from the perspectives of region distribution and economic development, the fuzzy c-means clustering (FCM) algorithm is applied to classify all 31 provinces [44] (data for Hong Kong, Macao, and Taiwan cannot be

Introduction of the methods

In practice, the data is insufficient and not always easily accessible. To address the small sample time series and avoid model causality deviation and forecasting error caused by traditional models, a forecasting system based on fuzzy time series (FTS) and the atom search optimization (ASO) algorithm is proposed in our study to forecast the number of annual inbound tourists.
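The idea of tuning fuzzy time series parameters with an optimizer can be sketched as follows. The ASO algorithm itself is not reproduced here: a plain exhaustive search over the number of intervals stands in for it, and the compact first-order fuzzy time series, the rolling one-step error measure, the warm-up length, and the sample series are all assumptions chosen for illustration.

```python
def fts_one_step(history, n_intervals, lo, hi):
    """One-step forecast from a simple first-order fuzzy time series."""
    width = (hi - lo) / n_intervals

    def state(v):
        # Index of the equal-length interval containing v, clamped to range.
        return min(max(int((v - lo) / width), 0), n_intervals - 1)

    # Fuzzy logical relationship groups: current state -> set of next states.
    rules = {}
    for a, b in zip(history, history[1:]):
        rules.setdefault(state(a), set()).add(state(b))

    current = state(history[-1])
    targets = rules.get(current, {current})
    # Forecast is the mean midpoint of the consequent intervals.
    return sum(lo + (j + 0.5) * width for j in targets) / len(targets)


def rolling_mape(series, n_intervals, lo, hi, warmup=3):
    """Mean absolute percentage error of rolling one-step forecasts."""
    errors = []
    for t in range(warmup, len(series)):
        pred = fts_one_step(series[:t], n_intervals, lo, hi)
        errors.append(abs(pred - series[t]) / abs(series[t]))
    return 100.0 * sum(errors) / len(errors)


# Hypothetical annual series (illustrative numbers, not data from the paper).
series = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0, 20.0, 24.0]
lo, hi = 8.0, 26.0

# Exhaustive search over candidate interval counts (stand-in for ASO).
candidates = range(3, 13)
scores = {n: rolling_mape(series, n, lo, hi) for n in candidates}
best_n = min(scores, key=scores.get)
print(best_n, scores[best_n])
```

A metaheuristic such as ASO becomes useful when several parameters are tuned jointly and some of them are continuous, such as the diffusion coefficient, so that an exhaustive search like this one is no longer practical.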

In the fuzzy time series, fuzzy information optimization technology is applied to excavate useful information from the

Empirical study

To verify the effectiveness and applicability of our proposed forecasting method, experiments and comparisons are presented in this section, including data description, evaluation criteria, parameter setting, experimental results, and analysis.

Discussion

This section discusses three aspects. A nonparametric test is used to determine whether there is a statistically significant difference between the models. The forecasting effectiveness is verified. The reproducibility and universality of the proposed forecasting framework are verified through repeated and supplementary experiments.

Out-of-sample forecasting and discussion

The experiments and tests indicate that our proposed ASO-NFTS model can achieve higher accuracy and more stable forecasting results than other models for the annual number of inbound tourists. The lower part of Fig. 4 illustrates the actual and forecasted values of ASO-NFTS; the scatter points lie close to the diagonal. Thus, in this section, ASO-NFTS is applied to produce out-of-sample forecasts of the annual number of inbound tourists for Beijing, Guangdong, and the nation from 2017 to 2020.

Conclusion

Excellent forecasting performance is a prerequisite and foundation of scheduling and management in economic and industrial fields. Some traditional models, such as the autoregressive integrated moving average, artificial neural network, and regression models, are theoretically sound only when the sample size is large. However, in most actual circumstances, it is difficult or impossible for researchers to collect complete information and many samples due to limited data. Thus,

CRediT authorship contribution statement

Ping Jiang: Conceptualization, Resources, Methodology. Hufang Yang: Formal analysis, Writing - original draft, Writing - review & editing. Ranran Li: Software, Writing - review & editing. Chen Li: Software.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This work was supported by the Major Program of the National Social Science Foundation of China (Grant No. 17ZDA093).

References (56)

  • Burger C.J.S.C. et al. A practitioners guide to time-series methods for tourism demand forecasting - a case study of Durban, South Africa. Tour. Manag. (2001)
  • Akin M. A novel approach to model selection in tourism demand modeling. Tour. Manag. (2015)
  • Lim C. et al. Time series forecasts of international travel demand for Australia. Tour. Manag. (2002)
  • Chen C.F. et al. Forecasting tourism demand based on empirical mode decomposition and neural network. Knowl.-Based Syst. (2012)
  • Yao Y. et al. A paired neural network model for tourist arrival forecasting. Expert Syst. Appl. (2018)
  • Li S. et al. Effective tourist volume forecasting supported by PCA and improved BPNN using Baidu index. Tour. Manag. (2018)
  • Chen K.Y. et al. Support vector regression with genetic algorithms in forecasting tourism demand. Tour. Manag. (2007)
  • Lijuan W. et al. Seasonal SVR with FOA algorithm for single-step and multi-step ahead forecasting in monthly inbound tourist flow. Knowl.-Based Syst. (2016)
  • Altan A. et al. Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques. Chaos Solitons Fractals (2019)
  • Andrawis R.R. et al. Combination of long term and short term forecasts, with application to tourism demand forecasting. Int. J. Forecast. (2011)
  • Coshall J.T. Combining volatility and smoothing forecasts of UK demand for international tourism. Tour. Manag. (2009)
  • Chen K.Y. Combining linear and nonlinear model in forecasting tourism demand. Expert Syst. Appl. (2011)
  • Song H. et al. Combining statistical and judgmental forecasts via a web-based tourism demand forecasting system. Int. J. Forecast. (2013)
  • Song Q. et al. Forecasting enrollments with fuzzy time series - part II. Fuzzy Sets and Systems (1994)
  • Yu H.-K. Weighted fuzzy time series models for TAIEX forecasting. Physica A (2005)
  • Wang J. et al. Application of a novel early warning system based on fuzzy time series in urban air quality forecasting in China. Appl. Soft Comput. J. (2018)
  • Stefanakos C. Fuzzy time series forecasting of nonstationary wind and wave data. Ocean Eng. (2016)
  • Cheng C.-H. et al. Fuzzy time-series based on adaptive expectation model for TAIEX forecasting. Expert Syst. Appl. (2008)