1. Introduction
In recent decades, the tourism industry has become an important, fast-growing sector of many national economies. The tourism and passenger transportation industry contributes a major portion of economic growth in many countries and can directly affect factors such as the number of jobs created and foreign investment. At the end of the 20th century, mass tourism spread to emerging, middle-income economies. In 2007, global international tourist arrivals from January to April grew 6 percent compared with 2006, with Asia and the Pacific showing the strongest growth, according to a report by the UNWTO World Tourism Barometer (Chu, 2009). The improving economic conditions after the 2009 crisis allowed more people around the world to plan travel to other regions, a rebound that put world tourism back on its longer-term growth path (Croce, 2018). The World Travel and Tourism Council estimated that in 2014 the tourism industry contributed 10 percent of global gross domestic product (GDP), and one out of every 11 jobs was related to tourism. In the most recent report (year 2018) from the same organization, travel and tourism is shown to account for 10.4 percent of global GDP and 313 million jobs, or 9.9 percent of total employment (WTTC, 2018).
The economic impact of the tourism and passenger transportation industry is vital for many countries and regions in the world. For instance, in 1990, there were just 250,000 tourist arrivals in Vietnam, which increased to 6 million arrivals in 2011. In the same year, tourism created 1.3 million jobs in Vietnam, amounting to 5 percent of the country’s GDP (Nguyen et al., 2013). In 1990, 2.6 million tourists visited Egypt, a number that increased to more than 9 million visitors in 2006 (Zaki, 2008). In 2012, the Malaysian tourism industry contributed 5.6 percent of the country’s GDP (Borhan and Arsad, 2014).
The importance of the tourism industry in Catalonia, Spain, is such that 12 percent of the region’s GDP and 15 percent of the jobs of the working population were created by the tourism and passenger transportation industry (Claveria and Torra, 2014). In Singapore, approximately 150,000 jobs are related to the tourism and passenger transportation industry, and tourism makes up about 10 percent of the country’s annual $87 billion economy. Singapore had 8 million tourists in 2007, which was twice its population (Chu, 2008). The tourism and passenger transportation industry experienced 50 percent growth between 2000 and 2012 in New Zealand. In 2012, the total (direct and indirect) contribution of the tourism industry to New Zealand’s GDP was 14.9 percent, and 19.1 percent of the total jobs in the country were supported by tourism (Huang et al., 2014). With the exception of 2008, when a major economic crisis occurred, the tourism and passenger transportation industry has experienced steady growth in the recent decade (Loganathan and Ibrahim, 2010).
This growing trend for the tourism industry, and its importance in the economic development of many countries, makes it necessary for related government agencies and private sector to have an appropriate demand management plan. The miscalculation of future demand volume can be costly, both in cases of underestimation and overestimation. Underestimating the demand will lead to customer loss and dissatisfaction, crowd congestion in the facilities and stations and a fast depreciation of the facilities. Overestimating the demand will unreasonably increase the cost of the businesses in the form of creating idle capacity, maintaining the high quality of unsold seats and increasing the total overhead costs (Sharif Azadeh et al., 2013).
The tourism industry is logically dependent upon passenger transportation, and it is important to have a transportation requirement analysis when creating a tourism development strategy. The importance of transportation in the tourism industry necessitates forecasting studies for planning the transportation facilities. Having an accurate forecast of the tourism demand is essential for creating a development and operational plan to increase the profitability of the tourism and passenger transportation industry (Chen, Lai and Yeh, 2012).
The contributions of this paper are twofold. First, the state-of-the-art methodological advancements and studies in tourism and passenger transportation demand forecasting are reviewed. Second, several examples in which one method has outperformed the others are provided to illustrate that no single method can be universally preferred over the others. For this purpose, the most commonly used methods in the literature are chosen and their latest developments are reviewed. These methods range from less computationally expensive ones, such as regression models, to more sophisticated soft computing approaches, such as artificial neural networks (ANN) and support vector machines (SVM). By covering a wide range of methods and their recent advancements in tourism and passenger transportation, this study can provide researchers and practitioners, such as government agencies and private businesses, with a useful source for choosing a method in various situations.
Many methods and models have been developed to forecast demand for the tourism and transportation industries, the most noteworthy of which are reviewed in this paper. The organization of this study is as follows. In Section 2, tourism and passenger transportation demand forecasting are explained and the literature trend on these topics is shown. Section 3 reviews the methodological development in tourism and passenger transportation demand forecasting models in the last decade (2007–2017), and finally, Section 4 states the overall comparative results and concluding remarks.
2. Tourism and transportation demand forecasting
Tourism and passenger transportation demand forecasting aims at finding a near-accurate estimation of the demand for products or services, due to the fact that planning to supply a product or service requires an estimation of the purchase or consumption of that product or service. Both short-term and long-term forecasts may help to provide the required facilities to the tourists. Short-term forecasts may be useful in daily or weekly operations, like pricing a cruise tour which differs according to the high season or the low season. Long-term forecasts may be useful in developing infrastructures and expensive facilities, like extending a road according to the increasing demand for visiting an attraction. In another example, having a precise weather forecast (short-term) may help to better schedule the visits, or have appropriate clothing during a visit. On the other hand, having a precise forecast of the number of visitors in a year (long-term) may help to develop the required infrastructures such as hotels or transportation terminals. Most of the quantitative methods use historical data to make a forecast, and some of them use data from test markets. The results from forecasting the demand may be used for supply, workforce planning, pricing and defining marketing strategy.
Companies such as tour providers, airlines, hotels and resorts, cruise tour providers, and many other owners of recreational facilities and shops desire to have an accurate forecast of the tourist arrivals to their region. Tourism demand management is vital to the success of many of these businesses, and imprecise forecasts may lead to lost capacity for selling services (e.g. hotel rooms, airplane seats) and products (e.g. food, souvenirs). It must be considered that tourism is a perishable product, and not selling it at the right time will lead to lost sales for all related businesses (Cuhadar et al., 2014). The tourism characteristics that are usually forecasted in studies on the subject are: the number of trips from the origin country to the tourism destination; the amount of money spent by all tourists in a year, or the average amount of money that each tourist spends in the destination region; and the number of nights that each tourist stays at the destination (Fernandes and Teixeira, 2008).
Estimating the passenger transportation volume is an inseparable part of tourism demand forecasting and an essential factor in the strategic planning of the tourism industry (Andreoni and Postorino, 2006; Manrai et al., 2014; Milenkovic et al., 2013; Tsai et al., 2009). A look at the tourism literature shows the increasing importance of passenger transportation in this domain, as well as of tourism itself. Figure 1 depicts the number of studies with the topic of “forecast,” and the word “tourism,” “transport,” “travel” and “passenger” in their title using Thomson Reuters’ Web of Science search engine between 1998 and 2017.
As can be observed in Figure 1, there is an increasing trend in the number of publications that address demand forecasting in tourism and/or passenger transportation. Following the review structure of Witt and Witt (1995), this review focuses not only on the travel aspect of tourism but also on accommodation, attractions and so on. Studies that address “travel” in a non-tourism context are excluded; however, a study is considered where the authors believe it might have significant tourism-related implications. Note that the same definition of the tourism domain for the demand forecasting problem is adopted by many other review papers (Assaker et al., 2010; Song and Witt, 2012; Song and Li, 2008).
3. Forecasting methods
So far, demand forecasting methods in tourism and passenger transportation have been the subject of several review papers. Some of the early reviews in this domain include the studies by Brand (1973), Chan (1979) and Karlaftis (2010). One of the most important reviews on forecasting tourism demand was done by Witt and Witt (1995). They considered tourism as a perishable product, and believed that accurate forecasting was very important for products of this nature. They evaluated previous empirical studies with a focus on the methods available at the time, and concluded that no single method outperforms the rest in all cases. Among the studied methods, autoregression, exponential smoothing and econometrics were shown to be more accurate than the no-change model. Similarly, Li et al. (2005) examined 22 studies on econometric models for international demand forecasting and focused on the advanced econometric models available at the time of their study. They concluded that the time-varying parameter model and the structural time-series model generally perform well for short-term forecasts, and that when the forecasts are based on annual data, the econometric models outperform the univariate time-series models and the conventional no-change benchmark model.
Song and Li (2008) reviewed the tourism demand modeling and forecasting studies and showed that the number of forecasting methods increased after the year 2000. Moreover, they proposed some directions to improve the forecasting accuracy via combining different approaches. Goh and Law (2011) reviewed 155 research papers on tourism demand forecasting which were published between 1995 and 2009, and categorized them based on their analysis approach into the groups of econometric-based models, time series models, and artificial intelligence-based models. Focusing on one methodology, Song and Witt (2012) reviewed modern econometric approaches for tourism demand forecasting. Rasouli and Timmermans (2012) reviewed the literature on uncertainty in travel demand while focusing on different sources of uncertainty and the models applied to analysis of demand uncertainty. Some other notable studies that reviewed tourism and travel demand forecasting literature include papers by Juan et al. (2005), Polat (2005) and Assaker et al. (2010).
Demand forecasting methods can be classified based on factors such as the type of forecasting data, the number of variables under study, the type of the problem and the industry where the forecasting was performed, and the techniques used to make the forecast. In fact, most of the review papers have adopted the same classification scheme. Following the same classification approach, this study reviews the demand forecasting methods in the tourism industry and passenger transportation by focusing on the articles published between 2007 and 2017. Moreover, the most important papers and case studies for the most commonly used methods are summarized in tables.
3.1 Time series models
A time series is an ordered sequence of values of a random variable, recorded at constant time intervals. These observations can be treated as a realization of a stochastic process. For example, the average daily world prices of gold or crude oil, or the amount of rainfall on each day in a city, are time series. Time series models need only the historical values of a variable as a basis for predicting the future values of that variable. Time series forecasting is the process of using a forecasting model to predict the future values of a variable based upon its previously observed values.
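As a minimal illustration of forecasting from previously observed values only, the sketch below implements two simple baselines: the no-change forecast and a seasonal variant that repeats the value from one season earlier. The function names and the quarterly arrival figures are hypothetical and are not taken from any reviewed study.

```python
def naive_forecast(history, horizon):
    """No-change forecast: repeat the last observed value for every step."""
    if not history:
        raise ValueError("history must not be empty")
    return [history[-1]] * horizon


def seasonal_naive_forecast(history, horizon, season_length):
    """Repeat the values observed during the most recent full season."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


# Hypothetical quarterly tourist arrivals (thousands) over two years.
arrivals = [120, 180, 260, 150, 130, 195, 280, 160]

print(naive_forecast(arrivals, 2))
print(seasonal_naive_forecast(arrivals, 4, season_length=4))
```

Baselines of this kind are useful mainly as reference points against which more elaborate models are compared.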
Time series models are widely applied to tourism demand forecasting. Oh and Morzuch (2005) studied a number of time series models in order to predict the tourism demand for Singapore. The authors drew some useful conclusions: different performance statistics (e.g. MAPE, MAE, RMSE) and different forecast horizon lengths may each select a different model as the best; the model that performs best in the within-sample period may not perform best in the post-sample period; and changing the length of the prediction period may change which model is best. As a result of their study, they argued that structural models generally provide less accurate forecasts than univariate models because they have more sources of forecast error.
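The point that different performance statistics can favor different models can be seen in a small numerical sketch. The code below computes MAE, RMSE and MAPE using their standard definitions; the actual values and the two sets of predictions are invented for illustration only.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical observed demand and forecasts from two hypothetical models.
actual = [100, 200, 400]
model_a = [110, 190, 360]   # moderate errors everywhere
model_b = [100, 200, 350]   # exact twice, one large miss

for name, pred in (("A", model_a), ("B", model_b)):
    print(name, mae(actual, pred), rmse(actual, pred), mape(actual, pred))
```

In this example, model B has the lower MAE and MAPE, while model A has the lower RMSE, because RMSE penalizes the single large miss more heavily.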
Tourism arrival forecasting in Spain was studied by de Oliveira and Eduardo (2009). They modeled the tourism demand as disaggregated components according to variables such as country of residence, purpose of the trip, type of transport and accommodation. They believed that not considering these variables eliminates the independent behavior of the disaggregated components. Total international tourist arrivals in Spain were disaggregated into 12 different origins. They then summed the forecasts of the disaggregated components to obtain the forecast of total tourism demand. In multi-step forecasting, the disaggregated model gave more accurate results than the traditional time-series models.
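The idea of forecasting each component separately and then summing the component forecasts can be sketched as follows. This is only an illustration of the aggregation step: the moving-average forecaster is a simple stand-in, not the model used in the cited study, and the origin names and figures are hypothetical.

```python
def mean_forecast(history, horizon, window=3):
    """Stand-in forecaster: repeat the mean of the most recent observations."""
    recent = history[-window:]
    level = sum(recent) / len(recent)
    return [level] * horizon


def bottom_up_forecast(components, horizon, forecaster):
    """Forecast each component on its own, then sum the forecasts per step."""
    per_component = {
        name: forecaster(series, horizon) for name, series in components.items()
    }
    total = [
        sum(forecast[h] for forecast in per_component.values())
        for h in range(horizon)
    ]
    return per_component, total


# Hypothetical arrivals (thousands) from three origin markets.
arrivals_by_origin = {
    "origin_A": [10, 12, 14],
    "origin_B": [30, 30, 33],
    "origin_C": [5, 7, 6],
}

per_origin, total = bottom_up_forecast(arrivals_by_origin, 2, mean_forecast)
print(per_origin)
print(total)
```

Because each component has its own forecaster call, a different model could be chosen for each origin without changing the aggregation step.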
Abdelghany and Guzhva (2010) analyzed the local traffic in US airports, specifically the Philadelphia International Airport. They used a time series model to analyze the effects of seasonality, fuel price, airline strategies, incidents and financial factors on an airport’s level of activity. Their approach recommends disaggregating total demand into local and connecting flight demands. The proposed model is more appropriate for short-term prediction (less than two years), but it can be used for the longer term (two to five years) if some assumptions are made about the variables used in the model. Lin et al. (2011) forecasted the tourism demand in Taiwan by three methods, namely, time series, ANN and multivariate adaptive regression splines. Root mean square error, mean absolute deviation and mean absolute percentage error (MAPE) measures were used to compare the methods. In their study, time series models outperformed ANNs and multivariate adaptive regression splines. The number of tourist arrivals and overnight stays for Catalonia, Spain, was studied by Claveria and Torra (2014), who compared various time series models and ANNs. The study shows that for short horizons, time series models are more accurate than both ANNs and self-exciting threshold autoregression models.
Tang et al. (2015) proposed a combined model of time series with belief functions to forecast the Chinese international tourism demand using the data of tourist arrivals from 1991 to 2013. The authors believed that the proposed model was simple to perform and accurately forecasted the demand. Moreover, China’s monthly inbound tourism demand was studied by Wang (2015), who proposed a time series prediction model. Time series models have also been applied in other studies to model the tourism demand pattern in Madeira, Portugal (de Almeida, 2016), and to forecast the tourism demand in Malaysia (Nor et al., 2016). Table I lists the most important and highest-quality articles using time series for tourism and passenger demand forecasting.
3.2 Autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) and SARIMA
The ARMA models are a class of stationary stochastic models that combine an autoregressive component and a moving average component. Most of the development and applications of ARMA class models are related to the book by Box et al. (2015). The ARMA models are usually written as ARMA(p, q), where p is the order of the autoregressive part and q is the order of the moving average part. ARMA models require the data to be stationary. When the data are not stationary, ARIMA models must be applied instead of classic ARMA models. The seasonal ARIMA model, SARIMA(p, d, q)(P, D, Q)S, incorporates both seasonal and non-seasonal factors. Box et al. (2015) and Huang et al. (2014) are good references on ARMA, ARIMA and SARIMA models.
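For readers who prefer the notation written out, one common way of stating these models is given below. The symbols are assumptions introduced here for illustration: y_t is the observed series, \varepsilon_t is a white-noise error, c is a constant, B is the backshift operator (B y_t = y_{t-1}), d and D are the non-seasonal and seasonal differencing orders, and S is the season length. Sign conventions for the moving average coefficients differ between textbooks.

```latex
% ARMA(p, q)
y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j}

% Lag polynomials
\phi(B) = 1 - \sum_{i=1}^{p} \phi_i B^i , \qquad
\theta(B) = 1 + \sum_{j=1}^{q} \theta_j B^j

% ARIMA(p, d, q): ARMA applied to the d-times differenced series
\phi(B) \, (1 - B)^d \, y_t = c + \theta(B) \, \varepsilon_t

% SARIMA(p, d, q)(P, D, Q)_S: seasonal polynomials \Phi and \Theta act on B^S
\phi(B) \, \Phi(B^S) \, (1 - B)^d \, (1 - B^S)^D \, y_t = c + \theta(B) \, \Theta(B^S) \, \varepsilon_t
```

With d = 0 the ARIMA form reduces to the ARMA equation above, and with P = D = Q = 0 the SARIMA form reduces to ARIMA.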
Andreoni and Postorino (2006) used the multivariate ARIMA model to forecast air transport demand. Demand levels at the Reggio Calabria regional airport in Italy were forecasted with both univariate and multivariate time series models. The autoregressive fractionally integrated moving average (ARFIMA) approach was used by Chu (2008) to forecast the tourism demand in Singapore. Moreover, the impact of volatile data, such as economic and political shocks, was analyzed by the proposed model. Additionally, Chu (2009) used ARMA-based methods to forecast the tourism demand. Three univariate ARMA-based models were applied to predict the tourism demand, as shown by the number of international visitors to Korea, Hong Kong, Japan, Thailand, Taiwan, Singapore, Philippines, New Zealand and Australia. The maritime tourist demand of the Porec area in Croatia was forecasted by Krasić and Gatti (2009), who applied ARIMA in order to find the volume of tourist arrivals. Based upon historical data, the proposed model considered the interventions of war activities in Croatia between 1991 and 1995.
ARMA was combined with the General Regression Neural Network method by Gong (2010) in order to predict the passenger transport demand. The flight demand historical data in the Beijing-Shanghai corridor shows that the proposed model is capable of capturing both the nonlinear and linear perspectives of the demand.
The Box-Jenkins SARIMA model was used by Loganathan and Ibrahim (2010) to forecast international tourist arrivals to Malaysia. After comparing the ARMA-based models, the SARIMA (1,0,1) model was suggested for the considered case. Moreover, the SARIMA approach was used by Nanthakumar et al. (2012) to forecast the tourism demand for Malaysia from Association of Southeast Asian Nations (ASEAN) countries. The results of this study do not show any out-performance of SARIMA compared to ARIMA in terms of forecasting. The Box-Jenkins methodology was also used in SARIMA models by Bigović (2012) to forecast the Montenegrin tourism demand. The short-run flows of tourist arrivals and overnight stays in Montenegro were forecasted, and the quality of the proposed method was shown by Modified Box-Pierce and Jarque-Bera test statistics. The ARIMA model was also applied by Milenkovic et al. (2013) in order to predict the railway passenger demand. The authors showed that the SARIMA (0,1,1)(0,1,1)12 model better describes the monthly recurring pattern of the passenger flows when compared to the classic ARIMA models.
A comparison between the ARIMA and grey models to forecast the inbound tourism demand in Vietnam was studied by Nguyen et al. (2013). The Fourier series was applied to increase the accuracy of both models, and the results of the case study showed that the ARIMA model performed better than the grey model. In a recent study, forecasting the inbound tourism demand in New Zealand via Fourier residual-modified ARIMA models was studied by Huang et al. (2014). They found that a certain degree of Fourier-modification factors improved the forecasting performance of the model.
In a more recent study, the Box-Jenkins ARIMA model was developed by Anvari et al. (2016) in order to predict the passenger demand for the Istanbul metropolitan area in Turkey. The main contribution of the work was to apply statistical tests instead of human judgment, which simplifies the process of selecting an accurate model. The authors believed that the proposed model was applicable in different forecasting problems, regardless of the problem type. Moreover, Gunter (2018) used the autoregression framework to show that the combination of decreasing prices in the destination countries and increasing tourist income is increasing tourism demand for the EU-15 countries.
Table II lists the most important and highest-quality articles using ARMA, ARIMA and SARIMA for tourism and passenger demand forecasting.
3.3 Regression models
Regression models determine a forecasting function by calculating a dependent variable value based upon one or more independent variables (Mosteller and Tukey, 1977). The terms “response variable” and “predictor variable” are used for the dependent variable and the independent variable, respectively.
Varagouli et al. (2005) fitted a multiple linear regression to the travel demand of Xanthi in northern Greece. The authors tried to identify the influential variables and proposed a model based on those variables in order to predict travel demand. Similarly, multiple linear regression was used by Anderson et al. (2006) to forecast the travel demand for small urban areas. The authors noted that traffic forecasting was commonly done through multiple-step processes, but they studied traffic forecasting directly by examining a functional relationship between socioeconomic influences and roadway characteristics.
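The simplest case, one predictor variable and one response variable fitted by ordinary least squares, can be sketched as follows. The function names and the data are hypothetical; the data were constructed so that the fit is exact, which makes the result easy to check by hand.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Evaluate the fitted forecasting function at a new predictor value."""
    return intercept + slope * x_new


# Hypothetical predictor (income index) and response (trips, in millions).
income = [1, 2, 3, 4, 5]
trips = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_ols(income, trips)
print(intercept, slope)
print(predict(intercept, slope, 6))
```

Multiple linear regression follows the same principle with several predictor variables, at which point a linear algebra library is usually more convenient than hand-written sums.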
The problem of forecasting the tourism demand in Hong Kong was studied by Wu et al. (2012), using a Gaussian process regression model. The sparsification procedure of the proposed model improved the generalization abilities and also allowed decreased computational complexity. The sparse model showed that it was more effective in comparison to the ARIMA, v-SVM and g-SVM models. A geographically weighted regression was used by Blainey and Mulley (2013) to forecast the rail demand in New South Wales, Australia. The authors tested many variables in the models, such as catchment population and employment, income and age profile, household size, car ownership levels, etc., in order to have the most accurate forecast.
Al-Rukaibi and Al-Mutairi (2013) compared regression and artificial intelligence models in order to forecast the air travel demand for Kuwait. The authors identified the national income and labor force as the most influential factors on Kuwait air travel demand, and applied regression and neural network models to the data. The neural networks presented a better goodness-of-fit than the regression models. In another study, a semi-logarithmic regression model was applied by Sivrikaya and Tunç (2013) to predict the domestic air transportation of Turkey. The number of passengers who traveled between 42 cities in Turkey in 2011 was analyzed for each origin/destination city pair. The accuracy of the predictions showed the possible applicability of the proposed regression model for the aviation industry in Turkey. Moreover, Strelcova (2013) applied correlation analysis, doubly constrained gravity and multiple linear regression models in order to predict the transportation demand for transatlantic air travel routes. The data of flights from most international airports in Europe and the USA in 2011 were considered. The results show the applicability of the proposed model for demand prediction purposes in different airlines.
Forecasting the Las Vegas tourism demand by using a logistic growth regression model was studied by Chu (2014). The applied model outperformed Naïve 1 and SARIMA models, based upon both the root mean square percentage error and MAPE. Spatial regression models were applied by Lopes et al. (2014) to predict the transportation demand for the city of Porto Alegre, Brazil. Since spatial autocorrelation may have affected the forecasting power of the applied models, spatial dependence patterns were incorporated into the models. The results obtained with the spatial regression models were also compared with the results of a multiple linear regression model of the kind typically used in trip generation estimations. The results show that the spatial regression model outperformed the multiple linear regression model. A multiple linear regression model was compared to generalized linear modeling in a study by Semeida (2014), who used the models to predict the travel demand in low population areas of Northeast Egypt. This study showed that the generalized linear model was more accurate in predicting the number of trips.
A regression model was used by Manrai and Srinidhi to predict the demand for India’s international air transport. The study showed that the air travel demand was affected by the macroeconomic characteristics of the country, including parameters such as population and income. Their model projected that air travel demand would increase until the year 2020. Tica and Kožić (2015) applied regression models to find the most influential factors on the demand for Croatian inbound tourism. Their study showed that the gross wages in Slovakia and the Czech Republic, as well as the imports and GDP in Poland, had the highest effect on Croatian tourism demand. Moreover, Asrin et al. (2015) used the generalized Poisson regression model to forecast the international tourism demand of ASEAN countries. Their study showed that the money exchange and inflation rates negatively affect the tourism demand. Claveria et al. (2016) used the Gaussian process regression model to model the cross-dependencies between the regional tourism markets in Spain. The method comparison in this study shows that the Gaussian process regression outperforms the benchmark neural network in terms of forecasting accuracy. Table III lists the most important and highest-quality articles using regression models for tourism and passenger demand forecasting.
3.4 Support vector machines
Boser et al. (1992) developed the SVM as a machine learning method associated with learning tasks such as classification and regression. SVM is based on the idea that non-linear trends in input space can be mapped to linear trends in a higher-dimensional feature space, and recognizes the subtle patterns in complex data sets by using a learning algorithm (Vapnik, 2013).
SVMs have two main categories: support vector classification and support vector regression (SVR). SVR, which is the main focus of this section, tries to achieve good generalization performance by minimizing the generalization error bound. It is notable that SVR-produced models depend only on a subset of the training data (Basak et al., 2007).
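One way to see why an SVR model depends only on a subset of the training data is the epsilon-insensitive loss commonly used in SVR: residuals smaller than a tolerance epsilon incur no loss at all. The sketch below evaluates this loss only; it does not train an SVR, and the data, the predictions and the tolerance are hypothetical.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon):
    """Loss is zero inside the tolerance band and grows linearly outside it."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Hypothetical observed demand and model predictions.
actual = [100, 105, 98, 120]
predicted = [101, 100, 98, 110]
epsilon = 2

losses = [
    epsilon_insensitive_loss(a, p, epsilon) for a, p in zip(actual, predicted)
]
# Indices of observations whose residual exceeds the tolerance band.
outside_band = [i for i, loss in enumerate(losses) if loss > 0]

print(losses)
print(outside_band)
```

Only the second and fourth observations contribute to the loss here; observations strictly inside the band do not, which is the intuition behind the sparse set of support vectors.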
Pai and Hong (2005) proposed a multi-factor SVM in an attempt to forecast tourism demand. The authors used an SVM with a neural network to increase the prediction accuracy. A hybrid model of least squares SVM was developed by Samsudin et al. (2010) to predict the tourism demand in Johor, Malaysia. The authors used the group method of data handling to determine the inputs for the least squares SVM in order to get more accurate predictions. Moreover, SVR with a hybrid chaotic genetic algorithm (GA) was developed by Hong et al. (2011) to forecast tourism demand. The authors argued that SVR minimizes the upper bound of the generalization error and considered this an advantage over neural networks, which minimize the training error. They also argued that the chaotic GA overcomes problems of the classical GA, such as premature convergence, slow progress toward the global optimum and becoming trapped in a local optimum. Thus, the chaotic GA was applied to avoid premature convergence to a local optimum when finding the parameters of an SVR model.
Forecasting the tourism demand of Taiwan was studied by Lin and Lee (2013), who compared different forecasting methods. Multivariate adaptive regression splines, ANN and SVR were developed to forecast the number of tourist arrivals. The mean error rate showed that SVR outperforms the other models. Mei (2015) proposed an improved SVR model in order to predict the monthly tourism demand for China. The traditional SVR model was improved by the elitist non-dominated sorting GA in order to reduce the algorithm complexity, forecast more accurately and keep the population diversified. Table IV lists the most important and highest-quality articles using SVM for tourism and passenger demand forecasting.
3.5 ANN models
Neural networks were first used in the late 1990s in studies related to forecasting the tourism demand. The high number of publications using ANNs in the last decade shows the growing interest of researchers in applying this method to forecast tourism demand. Unlike the classic statistical methods, ANN models are data-driven and non-parametric, do not need strong assumptions and can learn nonlinear data trends.
A neural network generally consists of an input layer, a varying number of hidden layers and an output layer. Nodes of the input layer represent the input variables, such as economic and demographic data. Hidden layers are used for the network’s internal understanding of the nonlinear data trend, and the output layer represents the solution to the problem.
Each layer of a network has nodes called neurons, which are the processing units of the network and are connected to the nodes of their adjacent layers. Each neuron applies a transfer function to the weighted summation of its input variables in order to generate an output. Neural networks are divided into two major categories: feed-forward networks, which consider information flow in one direction only, and recurrent networks, which include feedback connections from a later layer to an earlier layer of neurons (Pattie and Snyder, 1996).
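The computation performed by each neuron, and its composition into a small feed-forward network, can be sketched as follows. The weights are arbitrary illustrative numbers rather than trained values, and the bias term, the tanh transfer function and the linear output layer are assumptions made for this example.

```python
import math


def identity(z):
    """Linear transfer function, used here for the output layer."""
    return z


def neuron_output(inputs, weights, bias, transfer):
    """Apply a transfer function to the weighted sum of the inputs."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(z)


def dense_layer(inputs, weight_rows, biases, transfer):
    """Compute the outputs of all neurons in one layer."""
    return [
        neuron_output(inputs, weights, bias, transfer)
        for weights, bias in zip(weight_rows, biases)
    ]


# Two inputs -> two hidden neurons (tanh) -> one linear output neuron.
inputs = [0.5, -1.0]
hidden = dense_layer(inputs, [[1.0, 0.5], [-1.0, 1.0]], [0.0, 0.5], math.tanh)
output = dense_layer(hidden, [[2.0, 1.0]], [0.1], identity)

print(hidden)
print(output)
```

Information flows strictly from the input to the output here, so this is a feed-forward network; a recurrent network would additionally feed some outputs back as inputs at the next time step.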
The problem of forecasting tourism demand in Northern Portugal was studied using ANN (Fernandes and Teixeira, 2008). The proposed model used the 12 preceding values to calculate each forecasted value. Celebi et al. (2009) applied the ANN to forecast the short-term passenger demand of light rail systems. The authors used the multi-layer perceptron (MLP) model. The historical daily data were used to train the model, and the mean square errors and MAPEs were used to evaluate the accuracy of the proposed models.
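Using a fixed number of preceding values as network inputs amounts to turning the series into input/target pairs with a sliding window. The sketch below shows only this data preparation step, under the assumption of a plain list of observations; the function name and the toy series are hypothetical, and no network is trained.

```python
def make_lagged_samples(series, n_lags=12):
    """Pair each value with the n_lags values that immediately precede it."""
    samples = []
    for t in range(n_lags, len(series)):
        inputs = series[t - n_lags:t]
        target = series[t]
        samples.append((inputs, target))
    return samples


# Toy series of 15 observations (for monthly data, 12 lags span one year).
series = list(range(1, 16))
samples = make_lagged_samples(series, n_lags=12)

print(len(samples))
print(samples[0])
```

Each pair can then serve as one training example, with the 12 lagged values feeding the input layer and the target compared against the network output.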
Tsai et al. (2009) applied two neural network structures, multiple-temporal-unit neural networks and parallel ensemble neural networks, to forecast the travel demand of railway passengers. Two categories of data, including the temporal features and level shifts, were used to construct the networks. A comparison of the two proposed networks with MLPs showed that both performed better on the mean squared error and MAPE measures. Liu (2011) used an ANN to predict the number of visitors to Weifang, China. Mukai and Yoden (2012) used an ANN to forecast the demand for a taxi transportation service in the city of Tokyo, Japan.
A model based upon both empirical mode decomposition and back-propagation neural (BPN) network was proposed by Chen, Lai and Yeh (2012) to predict tourism demand by the number of arrivals. They decomposed the raw data into a finite set of intrinsic mode functions and a residue, which they then modeled and forecasted using the BPN network. The final forecasting value, which was obtained by calculating the sum of the network prediction results, showed that the proposed model outperforms both the single BPN model and traditional ARIMA models. Chen, Kuo, Chang and Wang (2012) studied the specific problem of forecasting air passenger and air cargo demand from Japan to Taiwan using ANN. Sharif Azadeh et al. (2013) applied neural networks to forecast the total number of bookings and cancellations at a European rail operator.
Other valuable studies using ANNs to forecast tourism and travel demand include: applying three forecasting methods (including ANN) to predict the cruise tourism demand for the Izmir port in Turkey (Cuhadar et al., 2014); comparing the MLP, radial basis function and Elman network techniques for forecasting tourist demand (Claveria et al., 2015); applying ANN models to forecast the airline passenger demand in Australia’s domestic air travel industry (Srisaeng et al., 2015a); and proposing adaptive neuro-fuzzy inference system models to predict the demand for Australia’s domestic low-cost carriers (Srisaeng et al., 2015b). Li and Cao (2018) studied a more recent development of neural networks, long short-term memory, to predict tourism flow. They showed the capability of neural networks to model nonlinear and stochastic systems that cannot be captured by common linear models. Table V lists the most important and highest-quality articles using ANNs for tourism and passenger demand forecasting.
3.6 Other forecasting methods
In addition to the aforementioned methods, there are several other demand forecasting approaches for tourism in the literature which will be briefly discussed in this section.
3.6.1 Econometric models
Econometric models have a causal approach, and seek to find a cause-and-effect relationship between the output variable (tourism demand) and the input variables (economic, social, demographic, etc.). The main aim of econometric tourism demand forecasting is to find the most influential variables in tourism demand.
Profillidis and Botzoris (2006) forecasted the passenger demand in Greece using econometric modeling. The authors developed three econometric models to predict the total demand, rail demand and private car demand. Many statistical and diagnostic tests were used to assess the validity of each model. Hilaly and El-Shishiny (2008) evaluated various econometric models and analyzed the pros and cons of each. They also proposed some new models, such as the tourism technical analysis system model and artificial intelligence techniques. Econometric modeling was also used by Zaki (2008) to predict the international tourism demand for Egypt. The inefficient use of human capital, the lack of strategies to enhance Egypt’s image, and ineffective marketing campaigns were considered the factors affecting tourism demand in Egypt. In a recent study, Ur Rahman et al. (2017) developed an econometric model to study the impact of tourism and terrorism on Pakistan’s economic growth. The study confirms that terrorism and inflation have a significant negative impact on economic growth.
3.6.2 Grey system models
Grey systems are those dealing with both known and unknown information. More precisely, as white systems have completely known information, and black systems have completely unknown information, grey systems are defined as the systems with partially known and partially unknown information (Julong, 1989).
In the tourism-related industries, Zhang and Hui (2008) studied the problem of forecasting the demand for road passenger transportation using a grey prediction model. Key forecasting variables were identified by calculating the degree of grey incidence, and a multi-variable grey self-adapted model was constructed from the grey incidence result. Ren et al. (2009) forecasted the tourism travel demand of the Lushan scenic area using grey prediction. The proposed model predicts the number of vehicles and visitors to the area and assigns weights to the roads ending at the scenic spots of the area in order to predict the vehicle flow for each road.
Yinzhu et al. studied the forecasting of inbound tourism demand in the Fujian province of China using grey dynamic models. Wang et al. (2011) studied air passenger demand forecasting in China, combining the grey method with a multiple regression model to obtain a more accurate result. In a similar study, Cheng (2012) proposed a grey model to forecast the tourism demand for Linguin, China. Yu (2015) optimized the grey model to forecast the inbound tourism demand for China; his model obtains more accurate results than time series methods. Moreover, Rong et al. (2015) studied the application of grey relational analysis to forecast the passenger demand for high-speed railway. Another interesting application of grey models is the study by Zhang and Qin (2017), which forecasts the demand for Chinese ski tourism. It is noteworthy that, besides demand forecasting in the tourism industry, grey models are also widely used for demand forecasting in the freight transportation and logistics domains (Yin, 2007; Huang and Feng, 2009; Hua and Qi-hong, 2009; Luo and He, 2010; Fang-bing, 2011; Cao and Chen, 2012; Wang et al., 2015).
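As an illustration of how a grey prediction model works with a short, partially informative series, the Python sketch below implements the basic single-variable GM(1,1) model. This is an assumption made for illustration: the studies above use several grey model variants, and the function name and the minimum series length are hypothetical choices.

```python
import math

def gm11_forecast(series, horizon=1):
    """Fit a basic GM(1,1) grey model to a positive series and forecast ahead."""
    n = len(series)
    if n < 4:
        raise ValueError("this sketch expects at least four observations")
    # Accumulated generating operation (AGO): running sums smooth the raw data.
    ago, total = [], 0.0
    for value in series:
        total += value
        ago.append(total)
    # Background values: means of consecutive accumulated values.
    z = [0.5 * (ago[k] + ago[k - 1]) for k in range(1, n)]
    y = list(series[1:])
    # Least squares fit of y = -a * z + b (closed form for a single regressor).
    m = len(z)
    sum_z, sum_y = sum(z), sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(v * w for v, w in zip(z, y))
    slope = (m * sum_zy - sum_z * sum_y) / (m * sum_zz - sum_z ** 2)
    a = -slope                      # development coefficient
    b = (sum_y - slope * sum_z) / m  # grey input
    if abs(a) < 1e-12:
        raise ValueError("development coefficient is close to zero")

    def ago_hat(k):
        # Time response of the accumulated series; k = 0 is the first observation.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Inverse AGO: difference the accumulated estimates to recover forecasts.
    return [ago_hat(k) - ago_hat(k - 1) for k in range(n, n + horizon)]
```

For example, `gm11_forecast([100.0, 110.0, 121.0, 133.1, 146.41], horizon=2)` returns two forecasts that continue the roughly 10 percent growth of the series.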
3.6.3 System dynamics (SD)
SD was developed by Forrester (1997) at the Massachusetts Institute of Technology. SD is a computer-oriented approach that uses the inter-relation of variables in a complex setting. The main characteristics of SD include the existence of a complex system, time-to-time variations regarding the system behavior, and the existence of the feedback in a closed loop.
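The closed feedback loop that characterizes SD can be illustrated with a very small stock-and-flow simulation in Python. The model below is purely hypothetical: a single stock (for example, annual visitors) grows through an inflow that depends on the current stock and on the remaining capacity of the destination. The variable names, the logistic form of the inflow and the parameter values are assumptions and do not come from any reviewed study.

```python
def simulate_stock(initial_stock, growth_rate, capacity, steps, dt=1.0):
    """Euler simulation of one stock with a capacity-limited inflow."""
    stock = initial_stock
    history = [stock]
    for _ in range(steps):
        # Feedback loop: the inflow is computed from the stock it changes.
        inflow = growth_rate * stock * (1.0 - stock / capacity)
        stock += inflow * dt
        history.append(stock)
    return history

# Hypothetical scenario: 10 (thousand) visitors, capacity of 100 (thousand).
trajectory = simulate_stock(10.0, 0.5, 100.0, steps=50)
```

Changing `growth_rate` or `capacity` and re-running the simulation is the kind of scenario generation that SD studies typically report, here reduced to its simplest form.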
Ruutu (2008) used SD to forecast the long-term sea transport demand in Finland. His proposed model aimed to determine the system behavior by finding its causal mechanisms. Most of the focus in his study was on analyzing the effects of macroeconomic variables on the workload of the Finnish ports. In another study, Suryani et al. (2010) used the SD framework to predict air passenger demand. Because of its ability to represent physical and information flows, SD was used to model, analyze and generate the scenarios. The authors found the airfare impact, GDP, level-of-service impact, population, number of flights per day, and dwell time to be important variables in determining the runway utilization, air passenger volume, and additional area needed for terminal capacity expansion. Vetitnev et al. (2016) used SD modeling to predict the demand for health tourism in the Krasnodar region of Russia. The model used data on tourist services from 2006 to 2012, and some marketing conclusions were drawn from the outputs. Mai and Smith (2018) used SD modeling to develop scenarios for the future instead of forecasts. They studied the case of an island in Vietnam and showed that tourism development on that island cannot be sustainable because of the shortage of resources.
3.6.4 Fuzzy logic models
Fuzzy set theory was first introduced by Zadeh (1965). Fuzzy forecasting methods apply fuzzy numbers to account for uncertainties in the input data. A fuzzy system provides a nonlinear mapping from an input vector to a scalar output, and is able to deal with both numerical values and linguistic variables.
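The link between numerical values and linguistic variables can be illustrated with triangular fuzzy numbers, which are one common way to represent fuzzy quantities. The Python sketch below is a simplified illustration: the function names, the linguistic labels and the breakpoints are hypothetical, and each triangle is assumed to satisfy low < peak < high.

```python
def triangular_membership(x, low, peak, high):
    """Degree (0 to 1) to which x belongs to a triangular fuzzy number."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)

def fuzzify(value, fuzzy_sets):
    """Return the linguistic label with the highest membership degree."""
    degrees = {
        label: triangular_membership(value, *params)
        for label, params in fuzzy_sets.items()
    }
    return max(degrees, key=degrees.get)

# Hypothetical linguistic terms for a demand index scaled to 0-100.
DEMAND_SETS = {
    "low": (0.0, 25.0, 50.0),
    "medium": (25.0, 50.0, 75.0),
    "high": (50.0, 75.0, 100.0),
}
```

In a fuzzy time series, each observation would be fuzzified in this way, and the forecast would then be derived from the observed relationships between consecutive labels.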
Yu and Schwartz (2006) studied the prediction accuracy for USA tourist arrivals by applying a fuzzy time series and grey prediction. The authors believed that more complicated models do not necessarily generate more accurate forecasts than simple models. The fuzzy time series was also used by Chou et al. (2010) to forecast tourism demand. They used the k-means approach, triangular fuzzy demand numbers, and weighted fuzzy logical relationships. The numerical experiment, which used tourist demand data for Hong Kong, the USA and Germany as input, showed that the proposed approach outperformed the other models considered in the study.
Lee et al. (2012) studied the problem of forecasting the tourism demand in Bali and Soekarno-Hatta in Indonesia using fuzzy time series. The authors concluded that the fuzzy time series outperformed classical methods such as Box-Jenkins, seasonal ARIMA, Holt-Winters and time series regression. Zins et al. (2012) forecasted the tourism demand in Taiwan using the fuzzy time series model. The authors believed that the severe acute respiratory syndrome (SARS) outbreak had a negative impact on tourism demand in 2002 and 2003, and their proposed model outperformed other studies in forecasting the tourism demand during the SARS event period.
Dou et al. (2013) compared fuzzy logic and the ARIMA model for predicting the passenger demand of the Beijing-Shanghai high-speed railway in China. The results showed that fuzzy logic predicts the passenger demand more accurately than the ARIMA model. Tsaur and Kuo (2014) used a novel fuzzy time series model to forecast the number of Japanese tourists that visit Taiwan each year. A novel Fourier method was used to revise the analysis of the residual terms of the forecasts made by the fuzzy time series model. Xiao et al. (2014) used a neuro-fuzzy model to forecast the demand for air travel at a Hong Kong airport. They applied a combined method of singular spectrum analysis, adaptive-network-based fuzzy inference systems, and improved particle swarm optimization to predict the short-term air passenger traffic. Fuzzy theory was also combined with SVM in a study by Xu et al. (2016), who provided a method that extracts fuzzy rules from SVMs to forecast tourism demand. A case study of Hong Kong, China, showed better prediction accuracy for the developed model compared with traditional forecasting methods.
3.6.5 Genetic algorithm (GA)
GA is a metaheuristic that mimics the natural selection process as described by Darwin (Talbi, 2009). Chen and Wang (2007) forecasted the tourism demand of China using SVR with parameters optimized by GA. Shahrabi et al. (2013) forecasted tourism demand by developing a hybrid intelligence model. They combined different tools and methods, such as data preprocessing, K-means clustering, genetic learning algorithms and fuzzy systems, to propose a forecasting system. In a more recent study, Srisaeng, Baxter, Richardson and Wild (2015) used GA to predict the demand volume of Australia’s domestic airline. They used 74 training data sets and 13 testing data sets for the proposed GA models. The results of the study showed that the quadratic models outperform the linear model in terms of accuracy and reliability. Urraca et al. (2015) combined GA with SVR to improve hotel room demand forecasting. They evaluated their model by applying it to a database obtained from a hotel in northern Spain.
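The way a GA can tune a model parameter is sketched below in Python for a single real-valued parameter. Everything here is a hypothetical illustration: the objective is a stand-in for the validation error of a forecasting model (for example, SVR) as a function of one hyperparameter, and the selection, crossover and mutation operators are simple choices, not those of the cited studies.

```python
import random

def genetic_search(objective, bounds, pop_size=30, generations=60,
                   mutation_scale=0.5, seed=0):
    """Minimize `objective` over one real parameter with a simple GA."""
    rng = random.Random(seed)  # fixed seed makes the run reproducible
    low, high = bounds
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=objective)
        parents = ranked[:pop_size // 2]   # selection: keep the fitter half
        children = [ranked[0]]             # elitism: best candidate survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2.0                      # crossover: average
            child += rng.gauss(0.0, mutation_scale)    # mutation: small noise
            children.append(min(high, max(low, child)))  # stay within bounds
        population = children
    return min(population, key=objective)

def validation_error(param):
    """Hypothetical error surface with its minimum at param = 3."""
    return (param - 3.0) ** 2

best_param = genetic_search(validation_error, bounds=(-10.0, 10.0))
```

In practice the objective would train and validate the forecasting model for each candidate, so the number of evaluations (population size times generations) is the main cost to keep in mind.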
4. Comparative analysis and concluding remarks
This paper reviewed the tourism and passenger transportation demand forecasting literature from 2007 to 2017. Contributions of this paper are twofold. First, the state-of-the-art methodological applications and studies in tourism and passenger transportation demand forecasting are reviewed. Second, several examples where a different method has outperformed the others are provided, which shows that no single method can universally be preferred over the others. This study can be a useful source for researchers and practitioners such as government agencies and private businesses to review the forecasting methodological developments and their applications in tourism and passenger transportation demand forecasting.
Considering all the reviewed studies, a significant increase in the development of hybrid methods can be observed. This trend is more dominant in soft computing and artificial intelligence methods such as ANN and SVM. Combining forecasting methods can integrate the advantages of various methods and provide useful tools, especially when it is necessary to deal with non-linear patterns in data or with intermittent and lumpy demands (Sharif Azadeh et al., 2013). It is noteworthy that while soft computing methods provide a useful hybridization framework, fine-tuning their parameters remains an important question in the literature. One observable trait in the literature is the use of metaheuristics such as GA and particle swarm optimization for fine-tuning the soft computing methods (Xiao et al., 2014; Hong et al., 2011). Besides fine-tuning, metaheuristics have also been used for model building and feature selection (Urraca et al., 2015). When dealing with sophisticated forecasting models or data sets with numerous dimensions, metaheuristics seem to be an effective approach for finding the optimal or near-optimal set of input variables that best describe the behavior of the forecasted parameter, or the parameters of the models. For instance, GA can be very useful for finding a proper number of layers and neurons in an ANN.
Another important finding of the reviewed articles is that, from a long-term perspective, the working population is a more important input variable than the whole population of the destination region, although most of the classic studies consider the whole population as an input variable (Chen, Lai and Yeh, 2012). It has been shown that an increase in tourism supply variables, such as international flight capacity and the number of hotel rooms, positively affects tourism demand (Lin and Lee, 2013). From a short-term perspective, it has been shown that demand can have cyclic fluctuations during the weekdays (Mukai and Yoden, 2012).
Most of the reviewed articles studied the effect of quantitative variables such as temperature and rainy days on demand fluctuations, but the effect of qualitative variables has received little attention. Studying the effect of variables such as the political regime in the destination country, or the religious and cultural beliefs of the majority population of the destination country regarding hosting foreigners, can be an interesting topic for future studies. Moreover, using the concept of utility theory in tourism demand forecasting may add a valuable contribution to the field. Studying the tradeoff between the price and the quality of tourism services and its effect on tourism demand can benefit many companies in this area.
As can be seen in all the forecasting applications, seasonality and the occurrence of unexpected events are the main obstacles to achieving more accurate forecasts, and they deserve more attention in future works. The importance of seasonality has been noted and analyzed, but analyzing the effects of more than one seasonal pattern, such as having both weather cycles and economic cycles, is still an interesting direction for future studies (Anvari et al., 2016). Although many advancements have been made in tourism and passenger transportation demand forecasting, many possible methodologies remain untested in this area. In this regard, the use of new advancements in soft computing, such as deep learning, in conjunction with other methods in tourism and passenger transportation demand forecasting can be an interesting research area.
The importance of method accuracy is considered in most of the reviewed papers. However, there is a lack of study on the importance of appropriate data for forecasting purposes. The emerging concept of big data has given rise to many forecasting methods based on big data applications. Studying data cleansing methods and developing creative demand forecasting methods using big data can be one of the most interesting topics for future studies. Moreover, bringing the dynamics of the data into the forecasting methods can increase the accuracy of forecasts. A model that is accurate for the current system will not necessarily have the same accuracy for the same system a year later. Developing dynamic demand forecasting methods that can make decisions based on real-time data would help companies save a lot of money.
Number of studies in tourism area that have the words “tourism,” “transport,” “travel” and “passenger” in their title using Thomson Reuters’ Web of Science search engine between 1998 and 2017
The most important and highest-quality articles using time series for tourism and passenger demand forecasting
| Authors | Purpose | Demand type and period | Determinants | Modeling |
|---|---|---|---|---|
| Oh and Morzuch (2005) | Comparing some time series forecasting models | Monthly travelers’ arrivals | Historic data of arrivals | Naïve I/Naïve II/Linear regression/Winters’s model/Autoregressive Integrated Moving Average (ARIMA)/Sine-wave regression |
| Lin et al. (2011) | Comparing different methods to forecast the tourism demand in Taiwan | Monthly tourism demand | Historic data of monthly visitors to Taiwan | ARIMA/ANNs/Multivariate adaptive regression splines |
| Claveria and Torra (2014) | Comparing the tourism demand forecast models of time series and ANNs | Monthly demand of tourism | Monthly data of tourist arrivals and overnight stays | ARIMA/Self-exciting threshold autoregressions/ANNs |
| Tang et al. (2015) | Predicting the inbound tourism demand | Monthly tourist arrivals | Monthly data of international tourist arrivals | Time series/Likelihood-based belief function |
| Cankurt and Subaşı (2016) | Forecasting the multivariate tourism demand in Turkey | Monthly tourist arrivals | Financial and demographic information | Multiple linear regression/Artificial neural network/Support vector regression |
The most important and highest-quality articles using ARMA, ARIMA and SARIMA for tourism and passenger demand forecasting
| Authors | Purpose | Demand type and period | Determinants | Modeling |
|---|---|---|---|---|
| Andreoni and Postorino (2006) | Predicting the air transport demand | Annual airport passengers | Historic annual airport passenger demand | ARIMA/ARIMAX |
| Chu (2008) | Comparing ARMA-based methods to forecast the tourism demand | Monthly tourist arrivals | Historic data of international tourist arrivals | Naïve/Linear regression/Cubic polynomial regression/Sine wave nonlinear regression/ARIMA/ARFIMA/SARIMA |
| Chu (2009) | Forecasting the tourism demand | Monthly and quarterly tourist arrivals | Historic monthly tourist arrivals data | ARIMA/ARAR/ARFIMA |
| Loganathan and Ibrahim (2010) | Predicting the tourism demand | Seasonal international tourist arrivals | Historic seasonal data of international tourist arrivals | Box-Jenkins SARIMA |
| Nguyen et al. (2013) | Predicting the inbound tourism demand | Inbound arrivals per month | Historic monthly arrivals of international tourists | ARIMA/Grey forecasting/Fourier series |
| Anvari et al. (2016) | Predicting the urban rail passenger demand | Periodic number of passengers | Historic periodic passengers data | Box–Jenkins ARIMA |
| Petrevska (2017) | Predicting tourism demand in Macedonia | Annual tourist arrivals | Historic annual arrivals | Box–Jenkins ARIMA |
The most important and highest-quality articles using regression models for tourism and passenger demand forecasting
| Authors | Purpose | Demand type and period | Determinants | Modeling |
|---|---|---|---|---|
| Varagouli et al. (2005) | Forecasting the travel demand | Number of passengers traveling by car | GDP of both origin and destination zones/Population of the both origin and destination zones/Number of cars per thousand inhabitants of the origin zone/Trip time by car or trip length/Trip price by car | Multiple linear regression |
| Wu et al. (2012) | Forecasting the tourism demand based on the external effective factors | Monthly international tourist arrivals | Travel demand by each origin country/Income of origins/Prices in destination/Transportation costs/Foreign exchange rate/Population of the origin country/… | Sparse Gaussian process regression/ARIMA/v-SVM/g-SVM |
| Sivrikaya and Tunç (2013) | Predicting the domestic air transport demand | Number of passengers carried per city pair | Urban population/Bedding capacity/Distance/Transit/Price/Airline count/Travel match/Schedule consistency/Travel time | Semi-logarithmic regression model |
| Chu (2014) | Predicting the tourism demand | Monthly tourist arrivals | Historic monthly tourist arrival data | Logistic growth regression/SARIMA/Naïve 1 |
| Semeida (2014) | Predicting the taxi passenger demand | Number of trips per person per year | Distance/Population/Area/Income/Travel time/Travel cost/Trip frequency | Multiple linear regression/Generalized linear modeling |
| Chinnakum and Boonyasana (2017) | Modeling the tourism demand for Thailand | Annual tourist arrivals | Gross domestic product per capita/Relative price of tourism in Thailand/Exchange rate/Population | Panel data regression models |
The most important and highest-quality articles using support vector machines for tourism and passenger demand forecasting
| Authors | Purpose | Demand type and period | Determinants | Modeling |
|---|---|---|---|---|
| Pai and Hong (2005) | Predicting the tourism demand | Annual number of visitors | Service price/Foreign exchange rate/Population/Market expenses/Gross domestic expenditure/Average hotel rate | Back-propagation neural networks/Multifactor support vector machine model |
| Samsudin et al. (2010) | Forecasting the tourism demand using a hybrid algorithm | Number of visitors per month | Historic monthly tourist arrivals data | Group method of data handling/Least squares support vector machine |
| Hong et al. (2011) | Forecasting the tourism demand using a hybrid algorithm | Annual tourist arrivals | Historic annual tourist arrivals data | Support vector regression/Chaotic genetic algorithm |
| Lin and Lee (2013) | Forecasting the tourism demand using a hybrid algorithm | Monthly tourist arrivals | Average hotel price/Number of hotel rooms/Capacity of international flights/GDP/CPI/Foreign exchange rate | Multivariate adaptive regression splines/ANN/Support vector regression |
| Rafidah et al. (2017) | Forecasting the tourist arrivals to Malaysia from Singapore | Monthly tourist arrivals | Historic monthly tourist arrivals data | Support vector machine model |
The most important and highest-quality articles using ANNs for tourism and passenger demand forecasting
| Authors | Purpose | Demand type and period | Determinants | Modeling |
|---|---|---|---|---|
| Tsai et al. (2009) | Predicting short-term railway passenger demand | Daily and monthly passenger demand | Ticket sales data | Multiple temporal units neural network/Parallel ensemble neural network |
| Celebi et al. (2009) | Predicting light rail passenger demand | Passenger demand per 15 min | Historic daily passenger data | ANN/ARIMA |
| Chen, Lai and Yeh (2012) | Forecasting tourism demand by decomposing data into a finite set of intrinsic mode functions | Monthly tourism demand | Historic tourist arrivals series | ARIMA/Back-propagation neural network/Empirical mode decomposition |
| Chen, Kuo, Chang and Wang (2012) | Predicting the air passenger and cargo demand | Annual air passenger and cargo demand | Population/GDP/GNP/CPI/Economic growth rate/Hotel rate | Back-propagation neural network |
| Cuhadar et al. (2014) | Predicting the cruise tourism demand | Monthly passenger demand | Monthly foreign tourist arrivals by cruise | Radial basis function ANN/Multi-layer perceptron ANN/Generalized regression ANN |
| Claveria and Torra (2014) | Predicting the tourism demand | Monthly tourist arrivals from different countries | Monthly data of tourist arrivals | Multi-layer perceptron ANN/Radial basis function ANN/Elman recurrent neural networks |
| Noersasongko et al. (2016) | Forecasting tourist arrivals in Indonesia | Monthly foreign tourist arrivals | Historic tourist arrivals to three cities in central Java | Genetic algorithm based neural network |
© Iman Ghalehkhondabi, Ehsan Ardjmand, William A. Young and Gary R. Weckman. This work is published under https://creativecommons.org/licenses/by-nc/3.0/legalcode (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Purpose
The purpose of this paper is to review the current literature in the field of tourism demand forecasting.
Design/methodology/approach
Published papers in high-quality journals are studied and categorized based on the forecasting method used.
Findings
No single forecasting method produces the best forecasts for all problems. Combined forecasting methods provide better forecasts than traditional forecasting methods.
Originality/value
This paper reviews the available literature from 2007 to 2017. No such review is currently available in the literature.