Inbound tourism demand forecasting framework based on fuzzy time series and advanced optimization algorithm
Introduction
Around the world, the development and promotion of tourism have become a source of personal income and government revenue [1]. In an increasingly competitive environment, it is important for governments and businesses to understand demand trends in the tourism industry; this is fundamental to developing appropriate strategies for resource allocation and business investment that guarantee sustainable development in tourism [2].
Forecasting tourist volume is becoming increasingly important for predicting future economic development. However, due to the complex and evolutionary nature of the tourism market, a tourist demand series tends to be inherently noisy, conditionally non-stationary, and in some cases, deterministically chaotic. Because modeling dynamic, non-stationary demand series is challenging, a system is needed that allows for more accurate forecasts with less noise and complexity [3]. Current forecasting methods for tourism demand can be divided into econometric models and time series models.
Econometric models attempt to use related variables to explain and forecast the dependent variable based on economic theory. Econometric models include static models such as traditional regression models [4], gravity models [5], linear almost ideal demand systems [6], [7], and dynamic econometric models such as vector autoregressive (VAR) models [8], [9], time varying parameter models [10], and error correction models [11].
Time series models are popular in recent research because they are based only on historical tourism demand data [12]. Time series models can be divided into linear and nonlinear models. The most popular linear methods that have been successfully applied in practice are the exponential smoothing (ES) model [13], [14] and autoregressive integrated moving average (ARIMA) models [15], [16]. Linear models have the important advantage of being easy to implement and interpret. However, when linear models perform poorly in both in-sample fitting and out-of-sample forecasting, more complex nonlinear models should be considered. Nonlinear time series models, such as artificial neural networks [17], [18], [19] and support vector regression [20], [21], [22], capture the nonlinear relationship between historical series and predictive indicators. It is widely believed that nonlinear methods are superior to linear methods for efficient and prudent decision-making when applied in modeling economic behavior.
Additionally, there are several combined models proposed to improve forecasting accuracy [23]. Robert R [24] combined different time length forecasts for application in tourism demand forecasting. John T. Coshall [25] combined volatility and smoothing forecasts in UK tourism demand. Kuan-Yu Chen [26] combined linear and nonlinear models for tourism demand forecasting. Haiyan Song [27] combined statistical and judgmental forecasts to forecast demand for Hong Kong tourism. It cannot be denied that the combined forecasting models can achieve better performance than individual models.
Econometric models always rely on economic theory. A regression equation needs to satisfy a series of hypotheses, which is difficult to achieve in actual applications. Time series models always require sufficient historical data to explore the rules and patterns in the data. Linear time series models have a poor extrapolation effect and a narrow forecasting scale; nonlinear time series models are unstable and highly dependent on data.
In practice, the sample size for tourism is always small, making it difficult to test the hypotheses. Thus, a traditional time series model may not be appropriate or accurate. Moreover, the data usually contain uncertainty and fuzziness due to limitations of resources and statistical techniques or inherent dataset characteristics. If a traditional model is applied to explain the stability and trend of a time series, the model may incur causal judgment deviation and forecasting error, or increase the error between forecasting results and actual values.
Accordingly, fuzzy theory provides a powerful tool for handling uncertainty and incomplete or limited data. Some scholars have presented the concept of fuzzy time series by combining fuzzy theory and fuzzy mathematics methods, and it has been applied in several fields for time series forecasting [28]. Song and Chissom [29] proposed a fuzzy time series model for forecasting university enrollment. Yu [30] proposed weighted fuzzy time series models for Taiwan Stock Index forecasting. Li et al. [31] applied fuzzy time series to air quality forecasting with large fluctuations in the concentration of pollutants. Stefanakos [32] applied the fuzzy time series method to forecast non-stationary wind and wave data. There have also been advances that improve the accuracy of fuzzy time series methods. Cheng et al. [33] added the adaptive expectation model to fuzzy time series forecasting to improve forecasting performance for the Taiwan Stock Index. Sadaei et al. [34] combined fuzzy time series and a convolutional neural network to control the over-fitting phenomenon in short-term load forecasting. Dincer [35] applied the fuzzy k-medoid clustering algorithm to address outliers and abnormal observations in a fuzzy time series for air pollution forecasting.
In a fuzzy time series forecasting model, partitioning the interval length and establishing the fuzzy logic relationship are two main phases that can significantly affect forecasting performance. Huang et al. [36] applied particle swarm optimization to adjust the interval lengths in the fuzzy time series for forecasting student enrollment. Yang et al. [37] applied a multi-objective differential evolution algorithm to determine the interval lengths to balance the accuracy and stability for wind speed forecasting. Lu et al. [38] applied an interval information granules technique to optimize the interval partition in a fuzzy time series. Chen et al. [39], [40] used particle swarm optimization techniques to determine the optimal partitions of intervals used for the TAIEX, NTD/USD exchange rates, and the forecasting of enrollment at the University of Alabama. Cheng and Yang [41] applied a rule-based algorithm rather than fuzzy logical relationships to establish forecast rules in a fuzzy time series for stock price forecasting. Rubio et al. [42] proposed a new weighted fuzzy-trend method to assign weights in a fuzzy time series for forecasting stock market indices. Sadaei et al. [43] used a seasonal auto-regressive fractionally integrated moving average and particle swarm optimization to establish the fuzzy logical relationship for seasonal time series forecasting.
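The two phases named above can be illustrated with a minimal sketch. It assumes a simplified first-order scheme of the kind common in the fuzzy time series literature: equal-length intervals, crisp assignment of each observation to one interval, and a forecast equal to the mean midpoint of the intervals in the matching relationship group. It is not the model proposed in this paper; the function names and the data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def make_intervals(low, high, n_intervals):
    """Split the universe [low, high] into n_intervals equal-length intervals."""
    width = (high - low) / n_intervals
    return [(low + i * width, low + (i + 1) * width) for i in range(n_intervals)]


def fuzzify(value, intervals):
    """Return the index of the interval (fuzzy set) that contains value."""
    for idx, (lo, hi) in enumerate(intervals):
        if lo <= value < hi:
            return idx
    return len(intervals) - 1  # value sits on (or above) the upper bound


def build_relationship_groups(states):
    """Group first-order relationships A_i -> A_j by their left-hand side."""
    groups = defaultdict(set)
    for current, following in zip(states, states[1:]):
        groups[current].add(following)
    return groups


def forecast_next(series, intervals):
    """One-step forecast: mean midpoint of the intervals in the matching group."""
    states = [fuzzify(v, intervals) for v in series]
    groups = build_relationship_groups(states)
    last = states[-1]
    # Assumption: if the last state has no recorded successor, stay in place.
    targets = sorted(groups.get(last, {last}))
    midpoints = [(intervals[j][0] + intervals[j][1]) / 2 for j in targets]
    return sum(midpoints) / len(midpoints)


# Made-up illustrative data, not taken from the paper.
series = [12, 18, 25, 22, 31, 38, 35]
intervals = make_intervals(10, 40, 3)
print(forecast_next(series, intervals))
```

Changing `n_intervals` changes both the fuzzified states and the relationship groups, which is why the interval partition is a natural target for the optimization approaches cited above.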
For the tourism demand time series, the sample size is always small. In some cases, the information is insufficient to accurately understand the statistical rules of the samples. Thus, information optimization technology, including information distribution and information diffusion, is applied within the fuzzy time series method. It discretely distributes and superposes the sample information to uncover the true structure of the overall sample and to extract as much useful information as possible, which further enhances the understanding of the sample. In most previous studies, the parameters of a fuzzy time series are selected by experience, which can significantly affect model performance in the forecasting process. Thus, in our study the atom search optimization (ASO) algorithm is applied to search for the optimal parameters of the fuzzy time series. Accordingly, the contributions of our study are as follows:
- (1) A novel hybrid forecasting model focusing on small sample sizes is developed for tourism demand forecasting. The sample size of most economic data for tourism is too small to meet the requirements of traditional forecasting models that rely on large data samples. Thus, in this study, a hybrid forecasting method that combines a fuzzy time series with an advanced optimization algorithm is proposed. It is appropriate for small sample forecasting and can provide excellent results for tourism demand forecasting.
- (2) Information optimization technology, including information distribution and diffusion, is applied in the fuzzy time series method. Information distribution technology can allocate the information carried by sample points to fill information gaps in small sample sets, which improves forecasting accuracy. Normal information diffusion can spread the information carried by sample points to multiple monitoring points, which both simplifies the operation and improves the recognition ability of the system, further improving forecasting accuracy.
- (3) An advanced optimization algorithm, ASO, is used to search for the optimal parameters of the fuzzy time series model. In a fuzzy time series, the main parameters are the ambiguity, fuzzy intervals, and diffusion coefficient (h), which are selected by experience in most research and play a key role in the forecasting process. The ASO algorithm is applied to determine the optimal values of these parameters, which can significantly improve the forecasting accuracy.
- (4) Tourism demand is analyzed based on previous tourism demand and on out-of-sample forecasts obtained by the optimal model. According to the comprehensive evaluation indexes, tests, and forecasting effectiveness, the optimal model for tourism demand is selected and applied to forecast the annual number of inbound tourists from 2017 to 2020. Combined with historical data and forecasting results, the development trend of tourism demand is analyzed. Trend analysis is a prerequisite for policymakers and managers to enact plans and adopt measures to guarantee the balance of the tourism market.
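As a rough illustration of contribution (2), the sketch below spreads each sample over a set of monitoring points with a Gaussian kernel whose width is the diffusion coefficient h, then normalizes so that every sample contributes one unit of information. The Gaussian form is the one usually associated with normal information diffusion; whether the paper uses exactly this form, and how the monitoring points are chosen, are assumptions here. The function name, the data, and the fixed value of h are hypothetical.

```python
import math


def normal_diffusion(samples, monitoring_points, h):
    """Spread each sample over the monitoring points with a Gaussian kernel.

    Returns one weight per monitoring point; the weights sum to 1.
    Assumes the monitoring points cover the sample range, so that the
    kernel values for a sample do not all underflow to zero.
    """
    totals = [0.0] * len(monitoring_points)
    for x in samples:
        # The constant 1 / (h * sqrt(2 * pi)) cancels in the normalization below.
        kernel = [math.exp(-((x - u) ** 2) / (2 * h ** 2)) for u in monitoring_points]
        norm = sum(kernel)
        for j, k in enumerate(kernel):
            totals[j] += k / norm  # each sample carries one unit of information
    n = len(samples)
    return [t / n for t in totals]


# Made-up illustrative values, not taken from the paper.
samples = [1.2, 2.9, 3.1]
monitoring_points = [1.0, 2.0, 3.0, 4.0]
weights = normal_diffusion(samples, monitoring_points, h=0.5)
print([round(w, 3) for w in weights])
```

A larger h spreads each sample over more monitoring points and a smaller h concentrates it near the closest point, which is one way to see why h is treated as a parameter worth optimizing rather than fixing by experience.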
The rest of the paper is organized as follows: Section 2 presents the clustering process for different provinces. Section 3 introduces the framework of our forecasting model. Section 4 describes and compares the empirical results. Section 5 discusses statistical tests, forecasting effectiveness, reproducibility, and universality of the proposed forecasting framework. In Section 6, out-of-sample forecasting is conducted and analyzed by the forecasting model with the best performance. Conclusions are presented in Section 7. The structure of our forecasting framework is illustrated in Fig. 1.
The clustering of province tourism based on fuzzy c-means clustering algorithm
Due to the diversity of the resource endowment, economic level, geographical location, cultural background, and infrastructure, the development of tourism in different provinces varies significantly. Therefore, to intuitively recognize the characteristics of tourism demand development from the perspectives of region distribution and economic development, the fuzzy c-means clustering (FCM) algorithm is applied to classify all 31 provinces [44] (data for Hong Kong, Macao, and Taiwan cannot be
Introduction of the methods
In practice, the data is insufficient and not always easily accessible. To address the small sample time series and avoid model causality deviation and forecasting error caused by traditional models, a forecasting system based on fuzzy time series (FTS) and the atom search optimization (ASO) algorithm is proposed in our study to forecast the number of annual inbound tourists.
In the fuzzy time series, fuzzy information optimization technology is applied to excavate useful information from the
Empirical study
To verify the effectiveness and applicability of our proposed forecasting method, experiments and comparisons are presented in this section, including data description, evaluation criteria, parameter setting, experimental results, and analysis.
Discussion
In this section, three parts are discussed. A nonparametric test is used to determine whether there is a statistically significant difference between the models. The forecasting effectiveness is verified. The reproducibility and universality of the proposed forecasting framework are verified through repeated and supplementary experiments.
Out-of-sample forecasting and discussion
The experiments and tests indicate that our proposed ASO-NFTS model can achieve higher accuracy and more stable forecasting results than other models for the annual number of inbound tourists. The lower part of Fig. 4 illustrates the actual and forecasted values of ASO-NFTS. It is observed that the scatter is close to the diagonal. Thus, in this section, ASO-NFTS is applied to produce out-of-sample forecasts of the annual number of inbound tourists for Beijing, Guangdong, and the nation from 2017 to 2020.
Conclusion
Excellent forecasting performance is a prerequisite and foundation of scheduling and management in economic and industrial fields. Some traditional models, such as the autoregressive integrated moving average, artificial neural network, and regression models, are theoretically justified only when the sample size is large. However, in most actual circumstances, it is a formidable or impossible task for researchers to collect complete information and many samples due to limited data. Thus,
CRediT authorship contribution statement
Ping Jiang: Conceptualization, Resources, Methodology. Hufang Yang: Formal analysis, Writing - original draft, Writing - review & editing. Ranran Li: Software, Writing - review & editing. Chen Li: Software.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This work was supported by the Major Program of the National Social Science Foundation of China (Grant No. 17ZDA093).
References (56)
- et al. Using a Grey-Markov model optimized by Cuckoo search algorithm to forecast the annual foreign tourist arrivals to China. Tour. Manag. (2016)
- et al. A meta-analysis of international tourism demand forecasting and implications for practice. Tour. Manag. (2014)
- et al. Modeling a combined forecast algorithm based on sequence patterns and near characteristics: an application for tourism demand forecasting. Chaos Solitons Fractals (2018)
- The advanced econometrics of tourism demand. Tour. Manag. (2010)
- et al. Gravity models for tourism demand: theory and use. Ann. Tour. Res. (2014)
- et al. Modelling US tourism demand for European destinations. Tour. Manag. (2006)
- et al. Modelling the interdependence of tourism demand: the global vector autoregressive approach. Ann. Tour. Res. (2017)
- et al. Big data analytics for forecasting tourism destination arrivals with the applied vector autoregression model. Technol. Forecast. Soc. Change (2018)
- et al. Forecasting tourist arrivals using time-varying parameter structural time series models. Int. J. Forecast. (2011)
- et al. A novel system based on neural networks with linear combination framework for wind speed forecasting. Energy Convers. Manage. (2019)
- A practitioners guide to time-series methods for tourism demand forecasting - a case study of Durban, South Africa. Tour. Manag.
- A novel approach to model selection in tourism demand modeling. Tour. Manag.
- Time series forecasts of international travel demand for Australia. Tour. Manag.
- Forecasting tourism demand based on empirical mode decomposition and neural network. Knowl.-Based Syst.
- A paired neural network model for tourist arrival forecasting. Expert Syst. Appl.
- Effective tourist volume forecasting supported by PCA and improved BPNN using Baidu index. Tour. Manag.
- Support vector regression with genetic algorithms in forecasting tourism demand. Tour. Manag.
- Seasonal svr with foa algorithm for single-step and multi-step ahead forecasting in monthly inbound tourist flow. Knowl.-Based Syst.
- Digital currency forecasting with chaotic meta-heuristic bio-inspired signal processing techniques. Chaos Solitons Fractals
- Combination of long term and short term forecasts, with application to tourism demand forecasting. Int. J. Forecast.
- Combining volatility and smoothing forecasts of UK demand for international tourism. Tour. Manag.
- Combining linear and nonlinear model in forecasting tourism demand. Expert Syst. Appl.
- Combining statistical and judgmental forecasts via a web-based tourism demand forecasting system. Int. J. Forecast.
- Forecasting enrollments with fuzzy time series — part II. Fuzzy Sets and Systems
- Weighted fuzzy time series models for TAIEX forecasting. Physica A
- Application of a novel early warning system based on fuzzy time series in urban air quality forecasting in China. Appl. Soft Comput. J.
- Fuzzy time series forecasting of nonstationary wind and wave data. Ocean Eng.
- Fuzzy time-series based on adaptive expectation model for TAIEX forecasting. Expert Syst. Appl.