This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
To enhance the logistics service experience of customers in the e-commerce industry chain, supply chain collaboration [1] requires that commodities be stocked in advance in local warehouses of markets around the world, which can effectively reduce logistics time. For cross-border e-commerce enterprises, however, the production and sales areas of e-commerce products are globalized, so preparations take longer, from commodity procurement and transportation to customs quality inspection. Therefore, big data analysis algorithms and technologies are widely applied to predict the sales of e-commerce commodities, providing the data basis for supply chain management and key technical support for the global supply chain schemes of cross-border e-commerce enterprises.
Besides the large quantity and diversity of transaction data [2], sales forecasts are affected by many other factors owing to the complexity of the cross-border e-commerce market [3, 4]. Therefore, improving the precision and efficiency of forecasting while taking these various factors into account remains a challenge for e-commerce enterprises.
Plenty of studies have been undertaken on sales forecasting. The methods adopted in these studies can roughly be divided into time series models (TSMs) and machine learning algorithms (MLAs) [5, 6].
TSMs range from exponential smoothing [7] to the ARIMA family [8] and have been used extensively to predict future trends by extrapolating from historical observation data. Although TSMs have proven useful for sales forecasting, their forecasting ability is limited by their assumption of linear behavior [9], and they do not take external factors such as price changes and promotions into account [10]. Therefore, univariate forecasting methods are usually adopted as benchmark models in many studies [11, 12].
Another important branch of forecasting has been MLAs. Existing MLAs draw heavily on state-of-the-art forecasting techniques, ranging from the artificial neural network (ANN), convolutional neural network (CNN), radial basis function (RBF) network, long short-term memory (LSTM) network, and extreme learning machine (ELM) to support vector regression (SVR) [13].
On the one hand, some existing forecasting studies have compared MLAs with TSMs [14]. Ansuj et al. showed the superiority of the ANN over the ARIMA method in sales forecasting [15]. Alon et al. compared ANNs with traditional methods, including Winters exponential smoothing, the Box–Jenkins ARIMA model, and multivariate regression, indicating that ANNs perform favorably relative to the more traditional statistical methods [16]. Di Pillo et al. assessed the application of SVM to sales forecasting under promotion impacts, comparing it with ARIMA, Holt–Winters, and exponential smoothing [17].
On the other hand, MLAs based on TSMs have also been applied to sales prediction. Wang et al. proved the advantages of an integrated model combining ARIMA with ANN in modeling the linear and nonlinear parts of a data set [18]. In [19], an ARIMA forecasting model was established, and the residuals of the ARIMA model were trained and fitted by a BP neural network. A novel LSTM ensemble forecasting algorithm was presented by Choi and Lee [20] that effectively combines multiple forecast results from a set of individual LSTM networks. To better handle irregular sales patterns and take various factors into account, attempts have been made to exploit more information in sales forecasting, as an increasing amount of data becomes available in e-commerce. Zhao and Wang [21] provided a novel approach to learning effective features automatically from structured data using CNN. Bandara et al. attempted to incorporate sales demand patterns and cross-series information in a unified model by training an LSTM model [22]. More importantly, the ELM has been widely applied in forecasting. Luo et al. [23] proposed a novel data-driven method to predict user behavior by using an ELM with distribution optimization. In [24], the ELM was enhanced under a deep learning framework to forecast wind speed.
Although there are various methods of forecasting, the choice of method is determined by the characteristics of different goods [25]. Kulkarni et al. [26] argued that product characteristics could have an impact on both searching and sales, because the characteristics inherent to products are the main attributes that potential consumers are interested in. Therefore, to better reflect the characteristics of goods in sales forecasting, clustering techniques have been introduced into forecasting [27]. For example, in [28, 29], both fuzzy neural networks and clustering methods were used to improve the results of neural networks. Lu and Wang [30] constructed an SVR model to deal with the demand forecasting problem with the aid of hierarchical self-organizing maps and independent component analysis. Lu and Kao [31] put forward a clustering-based sales forecasting method using the extreme learning machine and a combination linkage method. Dai et al. [32] built a clustering-based sales forecasting scheme based on SVR. A clustering-based forecasting model combining clustering and machine learning methods was developed by Chen and Lu [33] for computer retailing sales forecasting.
Based on the above literature review, a three-stage XGBoost-based forecasting model is constructed in this study to address the two aspects mentioned above: the sales features and the tendency of a data series.
Firstly, to capture the sales features, various influencing factors of sales are introduced in this study through the two-step clustering algorithm [34], an improved algorithm based on BIRCH [35]. Then, a clustering-based C-XGBoost model is presented that models each of the resulting clusters with the XGBoost algorithm, which has proven to be an efficient predictor in many data analysis contests such as Kaggle and in many recent studies [36, 37].
Secondly, to achieve higher prediction accuracy on the tendency of the data series, an A-XGBoost model is presented that integrates the strengths of ARIMA and XGBoost for the linear and nonlinear parts of the data series, respectively. A C-A-XGBoost model is then constructed as the final combination model by weighting the C-XGBoost and A-XGBoost models, which takes into account both the multiple factors affecting the sales of goods and the trend of the time series.
The rest of the paper is organized as follows. In Section 2, the key models and algorithms employed in the study are briefly described, including feature selection, the two-step clustering algorithm, a method of parameter determination for ARIMA, and XGBoost. In Section 3, a three-stage XGBoost-based model is proposed to forecast both the sales features and the tendency of the time series. In Section 4, numerical examples are used to illustrate the validity of the proposed forecasting model. In Section 5, the conclusions are summarized along with a note on future research directions.
2. Methodologies
2.1. Feature Selection
With the emergence of web technologies, there is an ever-increasing growth in the amount of big data in the e-commerce environment [38]. Variety is one of the critical attributes of big data, as they are generated from a wide variety of sources and formats, including text, web, tweet, audio, video, click-stream, and log files [39]. To remove most irrelevant and redundant information from such varied data, many techniques of feature selection (removing variables that are irrelevant) and feature extraction (applying transformations to the existing variables to obtain new ones) have been discussed to reduce the dimensionality of the data [40], including filter-based and wrapper feature selection. Wrapper feature selection wraps the actual learning algorithm, employing a statistical resampling technique (such as cross-validation) as a subroutine to estimate the accuracy of feature subsets [41], which makes it a better choice when different algorithms model different data series. Filter-based feature selection, instead, is suitable when different algorithms model the same data series [42].
In this study, wrapper feature selection is directly applied in the forecasting and clustering algorithms to remove unimportant attributes from the multidimensional data, based on the standard deviation (SD), the coefficient of variation (CV), the Pearson correlation coefficient (PCC), and feature importance scores (FIS), whose details are as follows.
SD reflects the degree of dispersion of a data set. For n observations x_1, ..., x_n with mean x̄, it is calculated as SD = sqrt((1/n) Σ (x_i − x̄)²).
CV is a statistic measuring the degree of variation of the observed values relative to their mean, calculated as CV = SD / x̄.
PCC is a statistic reflecting the degree of linear correlation between two variables x and y, calculated as PCC = Σ (x_i − x̄)(y_i − ȳ) / sqrt(Σ (x_i − x̄)² · Σ (y_i − ȳ)²).
FIS provides a score indicating how useful or valuable each feature is in the construction of the boosted decision trees within the model. The more an attribute is used to make key decisions within the decision trees, the higher its relative importance [43]. For a single decision tree, the importance is calculated from the amount by which each attribute's split points improve the performance measure, weighted by the number of observations the node is responsible for. The performance measure may be a purity measure such as the Gini index [44] used to select the split points, or another more specific error function. The feature importance is then averaged across all of the decision trees within the model [45].
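As a concrete illustration, the first three statistics can be computed with a short stdlib sketch (the function names sd, cv, and pcc are ours; FIS is not shown here because it is read off a trained boosted-tree model rather than computed directly from the raw data):

```python
import math

def sd(x):
    """Population standard deviation: degree of dispersion of the data."""
    m = sum(x) / len(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def cv(x):
    """Coefficient of variation: dispersion relative to the mean."""
    return sd(x) / (sum(x) / len(x))

def pcc(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (len(x) * sd(x) * sd(y))
```

For example, pcc([1, 2, 3, 4], [2, 4, 6, 8]) returns 1.0, reflecting a perfect linear relationship.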
2.2. Two-Step Clustering Algorithm
Clustering aims at partitioning samples into several disjoint subsets, making samples in the same subsets highly similar to each other [46]. The most widely applied clustering algorithms can broadly be categorized as the partition, hierarchical, density-based, grid-based, and model-based methods [47, 48].
The selection of clustering algorithms mainly depends on the scale and type of the collected data. Clustering can be conducted using traditional algorithms when dealing with purely numeric or purely categorical data [49, 50]. BIRCH, one of the hierarchical methods, introduced by Zhang et al. [35], is especially suitable for large data sets of continuous attributes [51]. For large data sets with mixed attribute types, however, the two-step clustering algorithm in SPSS Modeler is advised in this study. The two-step clustering algorithm is a modified method based on BIRCH that adopts the log-likelihood distance as its measure, which can handle distances between continuous data as well as between categorical data [34]. Similar to BIRCH, the two-step clustering algorithm first performs a preclustering step, scanning the entire data set and storing the dense regions of data records in terms of summary statistics. A hierarchical clustering algorithm is then applied to cluster the dense regions. Apart from the ability to handle mixed types of attributes, the two-step clustering algorithm differs from BIRCH in automatically determining the appropriate number of clusters and in a new strategy of assigning cluster membership to noisy data.
As one of the hierarchical algorithms, the two-step clustering algorithm is also more efficient than partition algorithms in handling noise and outliers. More importantly, it has a unique advantage over other algorithms in its automatic mechanism for determining the optimal number of clusters. Therefore, for the large and mixed transaction data sets of e-commerce, the two-step clustering algorithm is a reliable choice for clustering goods; its key technologies and processes are illustrated in Figure 1.
[figure omitted; refer to PDF]
2.2.1. Preclustering
The clustering feature (CF) tree growth in the BIRCH algorithm is used to read data records in data set one by one, in the process of which the handling of outliers is implemented. Then, subclusters
2.2.2. Clustering
Take the subclusters
2.2.3. Cluster Membership Assignment
The data records are assigned to the nearest clusters by calculating the log-likelihood distance between the data records and subclusters of the clusters
2.2.4. Validation of the Results
The performance of clustering results is measured by silhouette coefficient
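The SPSS Modeler implementation (log-likelihood distance, BIC-based selection of the cluster count, noise handling) is not reproduced here, but the two-phase structure of Sections 2.2.1 and 2.2.2 can be sketched for purely numeric data. In this toy illustration the function names are ours and Euclidean distance stands in for the log-likelihood measure:

```python
import math

def precluster(points, threshold):
    """Phase 1 (BIRCH-style pre-clustering): scan the data once, assigning
    each point to the nearest pre-cluster centroid if it lies within
    `threshold`, otherwise starting a new pre-cluster."""
    centroids, members = [], []
    for p in points:
        if centroids:
            j = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            if math.dist(p, centroids[j]) <= threshold:
                members[j].append(p)
                n = len(members[j])
                centroids[j] = tuple(sum(q[d] for q in members[j]) / n
                                     for d in range(len(p)))
                continue
        centroids.append(tuple(p))
        members.append([p])
    return centroids, members

def merge(centroids, members, k):
    """Phase 2: agglomerative merging of the pre-clusters until only
    k clusters remain (the closest pair of centroids is merged first)."""
    clusters = list(zip(centroids, members))
    while len(clusters) > k:
        i, j = min(((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
                   key=lambda ab: math.dist(clusters[ab[0]][0], clusters[ab[1]][0]))
        merged = clusters[i][1] + clusters[j][1]
        centroid = tuple(sum(q[d] for q in merged) / len(merged)
                         for d in range(len(merged[0])))
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append((centroid, merged))
    return clusters
```

With a large threshold the single scan already yields few pre-clusters and Phase 2 does little work; a smaller threshold shifts more of the merging into the hierarchical phase, which mirrors the division of labor in the real algorithm.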
2.3. Parameter Determination of ARIMA Model
ARIMA models are obtained from a combination of autoregressive and moving average models [53]. The Box–Jenkins methodology in time series theory is applied to establish an ARIMA (p, d, q) model, and its calculation steps can be found in [54]. ARIMA has limitations in determining parameters because they are usually chosen from plots of the ACF and PACF, which often leads to judgment deviations. However, the function auto.arima() in the R package "forecast" [55] can automatically generate an optimal ARIMA model for each time series based on the smallest Akaike information criterion (AIC) and Bayesian information criterion (BIC) [56], which makes up for this disadvantage of ARIMA parameter judgment.
Therefore, a combined method of parameter determination is proposed to improve the fitting performance of the ARIMA, which combines the results from the ACF and PACF plots with those of the auto.arima() function. The procedures are illustrated in Figure 2 and described as follows:
Step 1. Test stationarity and white noise using the augmented Dickey–Fuller (ADF) and Box–Pierce tests before modeling the ARIMA. If both tests are passed, the ARIMA is suitable for the time series.
Step 2. Determine one part of the candidate parameter combinations from the ACF and PACF plots, and determine the other part with the auto.arima() function in R.
Step 3. Model the ARIMA under different parameter combinations, and then calculate the values of AIC for different models.
Step 4. Determine the optimal parameter combination of the ARIMA as the one with the minimum AIC.
[figure omitted; refer to PDF]
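The minimum-AIC selection in Steps 3 and 4 can be illustrated with a simplified numpy sketch. A pure AR(p) model fitted by ordinary least squares on the differenced series stands in for a full ARIMA fit (in practice auto.arima() in R or an ARIMA routine would be used), and the AIC is computed up to an additive constant from the residual sum of squares; the function names are ours:

```python
import numpy as np

def ar_aic(x, p):
    """Fit AR(p) with intercept by least squares; return its AIC
    (Gaussian likelihood, up to an additive constant):
    n * ln(RSS / n) + 2 * (p + 1)."""
    x = np.asarray(x, dtype=float)
    n_obs = len(x) - p
    # lag-i column for i = 1..p, plus an intercept column
    X = np.column_stack([x[p - i - 1 : len(x) - i - 1] for i in range(p)]
                        + [np.ones(n_obs)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return n_obs * np.log(rss / n_obs) + 2 * (p + 1)

def best_order(x, d=1, max_p=4):
    """Mimic Steps 3-4: difference the series d times, fit candidate
    orders, and keep the one with the minimum AIC."""
    xd = np.diff(np.asarray(x, dtype=float), n=d) if d else np.asarray(x, dtype=float)
    return min(range(1, max_p + 1), key=lambda p: ar_aic(xd, p))
```

The same loop structure extends to (p, d, q) grids when a full ARIMA estimator is substituted for ar_aic.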
2.4. XGBoost Algorithm
XGBoost is short for "Extreme Gradient Boosting", a scalable tree boosting system built on the gradient boosting framework proposed by Friedman [57]. As the relevant basic theory of XGBoost has been covered in plenty of previous papers [58, 59], this study describes the procedures of the algorithm [60] rather than the basic theory.
2.4.1. Feature Selection
The specific steps of feature selection via the XGBoost are as follows: data cleaning, data feature extraction, and data feature selection based on the scores of feature importance.
2.4.2. Modeling Training
The model is trained based on the selected features with default parameters.
2.4.3. Parameter Optimization
Parameter optimization is aimed at minimizing the errors between predicted values and actual values. There are three types of parameters in the algorithm, of which the descriptions are listed in Table 1.
Table 1
The description of parameters in the XGBoost model.
| Type of parameters | Parameters | Description of parameters | Main purpose |
|---|---|---|---|
| Booster parameters | Max depth | Maximum depth of a tree | Increasing this value will make the model more complex and more likely to overfit |
| | Min_child_weight | Minimum sum of instance weights in a child | The larger the min_child_weight is, the more conservative the algorithm will be |
| | Max delta step | Maximum delta step | It can help make the update step more conservative |
| | Gamma | Minimum loss reduction | The larger the gamma is, the more conservative the algorithm will be |
| | Subsample | Subsample ratio of the training instances | It is used in the update to prevent overfitting |
| | Colsample_bytree | Subsample ratio of columns for each tree | It is used in the update to prevent overfitting |
| | Eta | Learning rate | Step size shrinkage used in the update to prevent overfitting |
| Regularization parameters | Alpha | L1 regularization term on weights | Increasing this value will make the model more conservative |
| | Lambda | L2 regularization term on weights | Increasing this value will make the model more conservative |
| Learning task parameters | Reg: linear | Learning objective | It is used to specify the learning task and the learning objective |
| Command line parameters | Number of estimators | Number of estimators | It is used to specify the number of iterative calculations |
The general steps of determining the hyperparameter of the XGBoost model are as follows:
Step 1. The number of estimators is firstly tuned to optimize the XGBoost when fixing the learning rate and other parameters
Step 2. Different combinations of max_depth and min_child_weight are tuned to optimize the XGBoost
Step 3. Max_delta_step and gamma are tuned to make the model more conservative, with the parameters determined in Steps 1 and 2 fixed
Step 4. Different combinations of subsample and colsample_bytree are tuned to prevent overfitting
Step 5. Regularization parameters are increased to make the model more conservative
Step 6. The learning rate is reduced to prevent overfitting
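Steps 1-6 amount to a greedy stage-wise grid search: each stage tunes one parameter group with everything else frozen, then passes the best values on. The sketch below shows that structure with a pluggable evaluate(params) callback, which in practice would return a cross-validated XGBoost error; the function name and the candidate grids are our illustrative assumptions, not values from the paper:

```python
import itertools

def staged_tuning(evaluate, base):
    """Greedy stage-wise hyperparameter search mirroring Steps 1-6.
    `evaluate(params) -> validation error` is supplied by the caller,
    e.g. cross-validated XGBoost training with those parameters."""
    stages = [
        {"n_estimators": [100, 300, 500]},                             # Step 1
        {"max_depth": range(6, 11), "min_child_weight": range(1, 7)},  # Step 2
        {"max_delta_step": [0, 1, 2], "gamma": [0, 0.1, 0.2]},         # Step 3
        {"subsample": [0.8, 1.0], "colsample_bytree": [0.8, 1.0]},     # Step 4
        {"reg_alpha": [0, 0.1, 1.0], "reg_lambda": [1.0, 2.0]},        # Step 5
        {"eta": [0.3, 0.1, 0.05]},                                     # Step 6
    ]
    params = dict(base)
    for grid in stages:
        keys = list(grid)
        best_err, best_vals = None, None
        # exhaustive search within the stage, everything else frozen
        for combo in itertools.product(*(grid[k] for k in keys)):
            trial = {**params, **dict(zip(keys, combo))}
            err = evaluate(trial)
            if best_err is None or err < best_err:
                best_err, best_vals = err, dict(zip(keys, combo))
        params.update(best_vals)
    return params
```

This stage-wise scheme trades global optimality for a drastic reduction in the number of trained models compared with one joint grid over all parameters.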
3. The Proposed Three-Stage Forecasting Model
In this research, a three-stage XGBoost-based forecasting model, named C-A-XGBoost model, is proposed in consideration of both the sales features and tendency of data series.
In Stage 1, a novel C-XGBoost model is put forward based on the clustering and XGBoost, which incorporates different clustering features into forecasting as influencing factors. The two-step clustering algorithm is first applied to partitioning commodities into different clusters based on features, and then each cluster in the resulting clusters is modeled via XGBoost.
In Stage 2, an A-XGBoost model is presented by combining ARIMA with XGBoost to predict the tendency of the time series, which joins the linear fitting ability of ARIMA with the strong nonlinear mapping ability of XGBoost. ARIMA is used to predict the linear part, and the rolling prediction method is employed with XGBoost to revise the nonlinear part of the data series, namely, the residuals of the ARIMA.
In Stage 3, a combination model named C-A-XGBoost is constructed from C-XGBoost and A-XGBoost. The C-A-XGBoost is aimed at minimizing the error sum of squares by assigning weights to the results of C-XGBoost and A-XGBoost, in which the weights reflect the reliability and credibility of the sales features and of the tendency of the data series.
The procedures of the proposed three-stage model are demonstrated in Figure 3, of which the details are given as follows.
[figure omitted; refer to PDF]
3.1. Stage 1. C-XGBoost Model
The two-step clustering algorithm is applied to clustering a data series into several disjoint clusters. Then, each cluster in the resulting clusters is set as the input and output sets to construct and optimize the corresponding C-XGBoost model. Finally, testing samples are partitioned into the corresponding cluster by the trained two-step clustering model, and then the prediction results are calculated based on the corresponding trained C-XGBoost model.
3.2. Stage 2. A-XGBoost Model
The optimal ARIMA based on the minimum of AIC after the data series pass the tests of stationarity and white noise is trained and determined, of which the processes are described in Section 2. Then, the residual vector
The final results of the test set are calculated by summing the predicted results of the linear part by the trained ARIMA and that of residuals with the established XGBoost.
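The rolling prediction of residuals can be sketched generically; fit and predict stand in for training and querying the XGBoost regressor, and the function name is ours. At each step the last `window` residuals form the feature vector, the predicted residual is appended to the history, and the window rolls forward:

```python
def rolling_residual_forecast(residuals, horizon, window, fit, predict):
    """Stage-2 rolling scheme: train on sliding windows of past residuals,
    then forecast `horizon` future residuals one step at a time, feeding
    each prediction back into the window."""
    history = list(residuals)
    # build (window -> next value) training pairs from the residual series
    features = [history[i:i + window] for i in range(len(history) - window)]
    targets = [history[i + window] for i in range(len(history) - window)]
    model = fit(features, targets)
    out = []
    for _ in range(horizon):
        nxt = predict(model, history[-window:])
        out.append(nxt)
        history.append(nxt)  # the window rolls forward over predictions
    return out
```

The final A-XGBoost forecast is then the ARIMA forecast of the linear part plus these predicted residuals, as described above.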
3.3. Stage 3. C-A-XGBoost Model
In this stage, a combination strategy is explored to minimize the error sum of squares
The least squares are employed in exploring the optimal weights (
In equation (8), the matrix
In equation (9), the matrix
In equation (10), the matrix
Equation (11) is obtained by transforming the equation (7) into the matrix form.
Equation (12) is calculated based on equation (11) left multiplying by the transpose of the matrix
According to equation (13), the optional weights (
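Assuming the standard least-squares solution sketched above, the weight fitting can be reproduced numerically with numpy (the function names are ours); np.linalg.lstsq solves the same normal equations (X^T X) w = X^T Y:

```python
import numpy as np

def combination_weights(pred_c, pred_a, actual):
    """Least-squares weights (w1, w2) minimizing the error sum of squares
    of the combined forecast w1*pred_c + w2*pred_a against actual values."""
    X = np.column_stack([pred_c, pred_a])
    w, *_ = np.linalg.lstsq(X, np.asarray(actual, dtype=float), rcond=None)
    return w

def combine(pred_c, pred_a, w):
    """Final C-A-XGBoost-style combination of the two forecasts."""
    return w[0] * np.asarray(pred_c, dtype=float) + w[1] * np.asarray(pred_a, dtype=float)
```

If the actual series is an exact weighted blend of the two forecasts, the recovered weights match that blend; otherwise they minimize the residual sum of squares over the fitting window.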
4. Numerical Experiments and Comparisons
4.1. Data Description
To illustrate the effectiveness of the developed C-A-XGBoost model, the following data series are used to verify the forecasting performance.
4.1.1. Source Data Series
As listed in Table 2, there are eight data series in source data series. The data series range from Mar. 1, 2017 to Mar. 16, 2018.
Table 2
The description of source data series.
| Data series | Fields |
|---|---|
| Customer behavior dataa | Data date; goods click; cart click; favorites click |
| Goods information datab | Goods id; SKUi id; level; season; brand id |
| Goods sales datac | Data date; SKU sales; goods price; original shop price |
| The relationship between goods id and SKU idd | Goods id; SKU id |
| Goods promote pricee | Data date; goods price; goods promotion price |
| Marketingf | Data date; marketing; plan |
| Holidaysg | Data date; holiday |
| Temperatureh | Data date; temperature mean |
a–fThe six data series are sourced from the historical data of the Saudi Arabian market in Jollychic cross-border e-commerce trading platform (https://www.jollychic.com/). gThe data of holidays are captured from the URL http://shijian.cc/114/jieri2017/. hThe data of temperature are captured from the URL https://www.wunderground.com/weather/eg/saudi-arabia. iSKU’s full name is stock keeping unit. Each product has a unique SKU number.
4.1.2. Clustering Series
There are 10 continuous attributes and 6 categorical attributes in clustering series, which are obtained by reconstructing the source data series. The attribute descriptions of the clustering series are illustrated in Table 3.
Table 3
The description of clustering series.
| Fields | Meaning of fields | Fields | Meaning of fields |
|---|---|---|---|
| Data date | Date | Favorites click | Number of clicks on favorites |
| Goods code | Goods code | Sales unique visitor | Number of unique visitors |
| SKU code | SKU code | Goods season | Seasonal attributes of goods |
| SKU sales | Sales of SKU | Marketing | Activity type code |
| Goods price | Selling price | Plan | Activity rhythm code |
| Original shop price | Tag price | Promotion | Promotion code |
| Goods click | Number of clicks on goods | Holiday | The holiday of the day |
| Cart click | Number of clicks on purchasing carts | Temperature mean | Mean of air temperatures (°F) |
4.2. Uniform Experimental Conditions
To verify the performance of the proposed model according to performance evaluation indexes, some uniform experimental conditions are established as follows.
4.2.1. Uniform Data Set
As shown in Table 4, the data series are partitioned into the training set, validation set, and test set so as to satisfy the requirements of different models. The data application is described as follows:
(1) The clustering series cover samples of 381 days.
(2) For the C-XGBoost model, training set 1, namely, the samples of the first 347 days in the clustering series, is utilized to establish the two-step clustering models. The resulting clusters are used to construct the XGBoost models. The test set, with the remaining samples of 34 days, is selected to validate the C-XGBoost model. In detail, the test set is first partitioned into the corresponding clusters by the established two-step clustering model, and then it is applied to checking the validity of the corresponding C-XGBoost models.
(3) For the A-XGBoost model, training set 2, with the samples of the 1st–277th days, is used to construct the ARIMA, and the validation set is used to calculate the residuals of the ARIMA forecast, which are used to train the A-XGBoost model. Then, the test set is employed to verify the performance of the model.
(4) The test set contains the final 34 data samples, which are employed to fit the optimal combination weights for the C-XGBoost and A-XGBoost models.
Table 4
The description of the training set, validation set, and test set.
| Data set | Samples | Number of weeks | Start date | End date | The first day | The last day |
|---|---|---|---|---|---|---|
| Training set 1 | Clustering series | 50 | Mar.1, 2017 (WED) | Dec.2, 2017 (SAT) | 1 | 347 |
| Training set 2 | SKU code = 94033 | 50 | Mar.1, 2017 (WED) | Dec.2, 2017 (SAT) | 1 | 277 |
| Validation set | SKU code = 94033 | 10 | Dec.3, 2017 (SUN) | Feb.10, 2018 (SAT) | 278 | 347 |
| Test set | SKU code = 94033 | 5 | Feb.11, 2018 (SUN) | Mar.16, 2018 (FRI) | 348 | 381 |
4.2.2. Uniform Evaluation Indexes
Several performance measures have previously been applied to verifying the viability and effectiveness of forecasting models. As illustrated in Table 5, the common evaluation measurements are chosen to distinguish the optimal forecasting model. The smaller they are, the more accurate the model is.
Table 5
The description of evaluation indexes.
| Evaluation indexes | Expression | Description |
|---|---|---|
| ME | (1/n) Σ (y_i − ŷ_i) | The mean error |
| MSE | (1/n) Σ (y_i − ŷ_i)² | The mean squared error |
| RMSE | sqrt((1/n) Σ (y_i − ŷ_i)²) | The root mean squared error |
| MAE | (1/n) Σ \|y_i − ŷ_i\| | The mean absolute error |

Here y_i denotes the actual value and ŷ_i the predicted value of the i-th of n samples.
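The four indexes can be computed with a few lines of stdlib Python (the function name is ours):

```python
def evaluation_indexes(actual, predicted):
    """ME, MSE, RMSE, and MAE of a forecast against actual values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "ME": sum(errors) / n,
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": sum(abs(e) for e in errors) / n,
    }
```

Note that ME can be near zero even for a poor forecast when over- and under-predictions cancel, which is why the squared and absolute measures are reported alongside it.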
4.2.3. Uniform Parameters of the XGBoost Model
The first priority for optimization is to tune max_depth and min_child_weight with the other parameters fixed, which is the most effective way of optimizing the XGBoost. The ranges of max_depth and min_child_weight are 6–10 and 1–6, respectively. Default values of the parameters are listed in Table 6.
Table 6
Default parameters values of XGBoost.
| Parameter | Default value |
|---|---|
| Number of estimators | 100 |
| Max depth | 6 |
| Min_child_weight | 1 |
| Max delta step | 0 |
| Objective | Reg: linear |
| Subsample | 1 |
| Eta | 0.3 |
| Gamma | 0.1 |
| Colsample_bytree | 1 |
| Colsample_bylevel | 1 |
| Alpha | 0 |
| Lambda | 1 |
| Scale_pos_weight | 1 |
4.3. Experiments of C-A-XGBoost Model
4.3.1. C-XGBoost Model
(1) Step 1. Commodity clustering: The two-step clustering algorithm is first applied to training set 1. Standardization applies to the continuous attributes; the noise percent of outliers handling is 25%; log-likelihood distance is the basis of distance measurement; BIC is set as the clustering criterion.
As shown in Figure 4, the clustering series are partitioned into 12 homogeneous clusters based on 11 features, denoted as
[figures omitted; refer to PDF]
As illustrated in Figure 5, the ratio of sizes is 2.64 and the percentage is not too large or too small for each cluster. Therefore, cluster quality is acceptable.
[figure omitted; refer to PDF]
(2) Step 2. Construct the C-XGBoost models: Features are first selected from each cluster
Take the cluster
For
[figure omitted; refer to PDF]
Setting the 11 features of the cluster
(3) Step 3. Parameter optimization: XGBoost is a supervised learning algorithm, so the key to optimization is to determine the appropriate input and output variables. In contrast, parameter optimization has less impact on the accuracy of the algorithm. Therefore, in this paper, only the primary parameters, including max_depth and min_child_weight, are tuned to optimize the XGBoost [61]. The model can reach a balanced point because increasing max_depth makes the model more complex and more likely to overfit, while increasing min_child_weight makes the model more conservative.
The prebuilt
Figure 7 shows the changes of ME and MAE based on XGBoost as depths and min_child_weight change. It can be seen that both the ME and MAE are the smallest when depth is 9 and min_child_weight is 2. That is, the model is optimal.
[figures omitted; refer to PDF]
(4) Step 4. Results on the test set: The test set is partitioned into the corresponding clusters by the two-step clustering model trained in Step 1. After that, Steps 2 and 3 are repeated for the test set.
As shown in Table 7, the test set is partitioned into the clusters
Table 7
The results of C-XGBoost for the test set.
| Test set | Days | C12_j | C12_j_XGBoost model | Depth and min_child_weight | Training set 1 ME | Training set 1 MAE | Test set ME | Test set MAE |
|---|---|---|---|---|---|---|---|---|
| 348th–372nd | 25 | 3 | C12_3_XGBoost model | (9, 2) | 0.351 | 0.636 | 4.385 | 4.400 |
| 373rd–381st | 9 | 4 | C12_4_XGBoost model | (10, 2) | 0.339 | 0.591 | 1.778 | 2.000 |
| 348th–381st | 34 | — | — | — | — | — | 3.647 | 3.765 |
As illustrated in Figure 8, ME and MAE for
[figures omitted; refer to PDF]
4.3.2. A-XGBoost Model
(1) Step 1. Test stationarity and white noise of training set 2: For training set 2, the
(2) Step 2. Train the ARIMA model: According to Section 2.3, parameter combinations are first determined from the ACF and PACF plots and by the auto.arima() function in the R package "forecast".
As shown in Figure 9(a), SKU sales have a significant fluctuation in the first 50 days compared with the sales after 50 days; in Figure 9(b), the plot of ACF has a high trailing characteristic; in Figure 9(c), the plot of PACF has a decreasing and oscillating phenomenon. Therefore, the first-order difference should be calculated.
[figures omitted; refer to PDF]
As illustrated in Figure 10(a), SKU sales fluctuate around zero after the first-order difference. Figures 10(b) and 10(c) graphically present plots of ACF and PACF after the first-order difference, both of which have a decreasing and oscillating phenomenon. It indicates that the training set 2 conforms to the ARMA.
[figures omitted; refer to PDF]
As a result, the possible optimal models are ARIMA (2, 1, 2), ARIMA (2, 1, 3), and ARIMA (2, 1, 4) according to the plots of ACF and PACF in Figure 10.
Table 8 shows the AIC values of the ARIMA under different parameters, which are generated by the auto.arima() function. It can be concluded that ARIMA (0, 1, 1) is the best of these models because it has the smallest AIC.
Table 8
AIC values of the resulting ARIMA models by the auto.arima() function.
| ARIMA (p, d, q) | AIC | ARIMA (p, d, q) | AIC |
|---|---|---|---|
| ARIMA (2, 1, 2) with drift | 2854.317 | ARIMA (0, 1, 2) with drift | 2852.403 |
| ARIMA (0, 1, 0) with drift | 2983.036 | ARIMA (1, 1, 2) with drift | 2852.172 |
| ARIMA (1, 1, 0) with drift | 2927.344 | ARIMA (0, 1, 1) | 2850.212 |
| ARIMA (0, 1, 1) with drift | 2851.296 | ARIMA (1, 1, 1) | 2851.586 |
| ARIMA (0, 1, 0) | 2981.024 | ARIMA (0, 1, 2) | 2851.460 |
| ARIMA (1, 1, 1) with drift | 2852.543 | ARIMA (1, 1, 2) | 2851.120 |
To further determine the optimal model, the AIC and RMSE of the ARIMA models under different parameters are summarized in Table 9. The candidates include the 3 possible optimal ARIMA models judged from Figure 10 and the best ARIMA generated by the auto.arima() function. According to the minimum principles, ARIMA (2, 1, 4) is optimal because it performs best on both AIC and RMSE.
Table 9
AIC values and RMSE of ARIMA models under different parameters.
| ARIMA model | ARIMA (p, d, q) | AIC | RMSE |
|---|---|---|---|
| 1 | ARIMA (0, 1, 1) | 2850.170 | 41.814 |
| 2 | ARIMA (2, 1, 2) | 2852.980 | 41.572 |
| 3 | ARIMA (2, 1, 3) | 2854.940 | 41.567 |
| 4 | ARIMA (2, 1, 4) | 2848.850 | 40.893 |
(3) Step 3. Calculate residuals of the optimal ARIMA: The prediction results from the 278th to the 381st day are obtained by using the trained ARIMA (2, 1, 4), denoted as
(4) Step 4. Train A-XGBoost by setting
(5) Step 5. Calculate predicted residuals of the test set using the trained A-XGBoost in Step 4, denoted as
(6) Step 6. Calculate the final prediction results: For the test set, calculate the final prediction results by summing over the corresponding values of
Table 10
The performance evaluation of A-XGBoost.
| A-XGBoost | Validation set | Test set |
|---|---|---|
| Minimum error | −0.003 | −8.151 |
| Maximum error | 0.002 | 23.482 |
| Mean error | 0.000 | 1.213 |
| Mean absolute error | 0.001 | 4.566 |
| Standard deviation | 0.001 | 6.262 |
| Linear correlation | 1 | −0.154 |
| Occurrences | 70 | 34 |
4.3.3. C-A-XGBoost Model
The optimal combination weights are determined by minimizing the MSE in equation (6).
For the test set, the weights
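With the two weights constrained to sum to one, minimizing the MSE of the combined forecast in equation (6) has a closed-form least-squares solution, sketched below. The forecast arrays are hypothetical stand-ins, not the paper's data.

```python
# MSE-minimizing combination weight for C-A-XGBoost (equation (6) sketch).
# Minimize mean((y - (w*c_xgb + (1-w)*a_xgb))**2) over the scalar w.
import numpy as np

y     = np.array([100., 105.,  98., 110., 107.])   # actual sales (illustrative)
c_xgb = np.array([102., 104.,  99., 108., 109.])   # C-XGBoost forecasts
a_xgb = np.array([ 96., 101.,  95., 104., 103.])   # A-XGBoost forecasts

d = c_xgb - a_xgb
w = np.sum((y - a_xgb) * d) / np.sum(d ** 2)       # closed-form optimum
combined = w * c_xgb + (1 - w) * a_xgb

mse = lambda f: np.mean((y - f) ** 2)
```

Because w = 1 recovers C-XGBoost and w = 0 recovers A-XGBoost, the optimally weighted combination can never have a higher MSE than either component on the fitting data.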
4.4. Models for Comparison
In this section, the following models are chosen for the comparison between the proposed models and other classical models:
ARIMA. As one of the most common time series models, it is used to predict the sales time series; the process is the same as that of the ARIMA in Section 4.3.2.
XGBoost. The XGBoost model is constructed and optimized by setting the selected features and the corresponding
C-XGBoost. Taking the sales features of commodities into account, the XGBoost is used to forecast sales based on the clusters produced by the two-step clustering model. The procedures are the same as those in Section 4.3.1.
A-XGBoost. The A-XGBoost is applied to revise the residuals of the ARIMA. Namely, the ARIMA first models the linear part of the time series, and the XGBoost then models the nonlinear part. The relevant processes are described in Section 4.3.2.
C-A-XGBoost. The model combines the advantages of C-XGBoost and A-XGBoost, of which the procedures are displayed in Section 4.3.3.
4.5. Results of Different Models
In this section, the test set is used to verify the superiority of the proposed C-A-XGBoost.
Figure 11 shows the curve of actual values
[figure omitted; refer to PDF]
It can be seen that C-A-XGBoost achieves the best fit to the original series, as its fitting curve is the closest of the five to the curve of actual values
To further illustrate the superiority of the proposed C-A-XGBoost, the evaluation indexes mentioned in Section 4.2.2 are applied to identify the best sales forecasting model. Table 11 provides a comparative summary of the indexes for the five models in Section 4.4.
Table 11
The performance evaluation of ARIMA, XGBoost, A-XGBoost, C-XGBoost, and C-A-XGBoost.
| Evaluation indexes | ARIMA | XGBoost | A-XGBoost | C-XGBoost | C-A-XGBoost |
|---|---|---|---|---|---|
| ME | −21.346 | −3.588 | 1.213 | 3.647 | 0.288 |
| MSE | 480.980 | 36.588 | 39.532 | 23.353 | 10.769 |
| RMSE | 21.931 | 6.049 | 6.287 | 4.832 | 3.282 |
| MAE | 21.346 | 5.059 | 4.566 | 3.765 | 2.515 |
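The four indexes in Table 11 are standard error statistics and can be computed directly from the error series, as in this sketch (the actual/forecast values are illustrative, not the paper's data).

```python
# ME, MSE, RMSE, and MAE as used in Table 11, on an illustrative pair.
import numpy as np

def evaluate(actual, forecast):
    e = actual - forecast
    return {
        "ME":   np.mean(e),                  # mean error (signed bias)
        "MSE":  np.mean(e ** 2),             # mean squared error
        "RMSE": np.sqrt(np.mean(e ** 2)),    # root mean squared error
        "MAE":  np.mean(np.abs(e)),          # mean absolute error
    }

actual   = np.array([120., 130., 125., 140.])
forecast = np.array([118., 133., 124., 138.])
metrics = evaluate(actual, forecast)
```

A near-zero ME with a large MAE (as for XGBoost in Table 11) signals errors that cancel in sign but remain large in magnitude, which is why all four indexes are reported together.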
According to Table 11, the proposed C-A-XGBoost is clearly superior to the other models, as all of its evaluation indexes are the smallest.
C-XGBoost is inferior to C-A-XGBoost but outperforms the other three models, underlining that C-XGBoost is superior to the single XGBoost.
A-XGBoost has a superior performance relative to ARIMA, proving that XGBoost is effective for residual modification of ARIMA.
According to the analysis above, the proposed C-A-XGBoost has the best forecasting performance for sales of commodities in the cross-border e-commerce enterprise.
5. Conclusions and Future Directions
In this research, a new XGBoost-based forecasting model named C-A-XGBoost is proposed, which takes the sales features and tendency of data series into account.
The C-XGBoost is presented first, combining clustering and XGBoost with the aim of reflecting the sales features of commodities in the forecast. The two-step clustering algorithm partitions the data series into clusters based on selected features, which serve as the influencing factors for forecasting. After that, a separate C-XGBoost model is established for each cluster using the XGBoost.
The proposed A-XGBoost takes advantage of the ARIMA in predicting the tendency of the data series and overcomes its disadvantages by applying the XGBoost to the nonlinear part of the series. The optimal ARIMA is selected by comparing AICs under different parameters, and the trained ARIMA is then used to predict the linear part of the data series. For the nonlinear part, rolling prediction is conducted by the trained XGBoost, whose inputs and outputs are the residuals produced by the ARIMA. The final A-XGBoost results are calculated by adding the residuals predicted by the XGBoost to the corresponding forecast values of the ARIMA.
In conclusion, the C-A-XGBoost is developed by assigning appropriate weights to the forecasting results of the C-XGBoost and A-XGBoost so as to exploit their respective strengths. Consequently, a linear combination of the two models' forecasts is taken as the final predictive values.
To verify the effectiveness of the proposed C-A-XGBoost, the ARIMA, XGBoost, C-XGBoost, and A-XGBoost are employed for comparison. Meanwhile, four common evaluation indexes, including ME, MSE, RMSE, and MAE, are utilized to check the forecasting performance of C-A-XGBoost. The experiment demonstrates that the C-A-XGBoost outperforms other models, indicating that C-A-XGBoost has provided theoretical support for sales forecast of the e-commerce company and can serve as a reference for selecting forecasting models. It is advisable for the e-commerce company to choose different forecasting models for different commodities instead of utilizing a single model.
Two potential extensions are put forward for future research. On the one hand, there may be no single model in which all evaluation indicators are minimal, which makes it difficult to choose the optimal model; a comprehensive evaluation index of forecasting performance will therefore be constructed to overcome this difficulty. On the other hand, sales forecasting ultimately serves inventory optimization, so relevant factors should be considered, including inventory cost, order lead time, delivery time, and transportation time.
Acknowledgments
This research was supported by the National Key R & D Program of China through the China Development Research Foundation (CDRF) funded by the Ministry of Science and Technology (CDRF-SQ2017YFGH002106).
[1] Y. Jin, Data Science in Supply Chain Management: Data-Related Influences on Demand Planning, 2013.
[2] S. Akter, S. F. Wamba, "Big data analytics in e-commerce: a systematic review and agenda for future research," Electronic Markets, vol. 26 no. 2, pp. 173-194, DOI: 10.1007/s12525-016-0219-0, 2016.
[3] J. L. Castle, M. P. Clements, D. F. Hendry, "Forecasting by factors, by variables, by both or neither?," Journal of Econometrics, vol. 177 no. 2, pp. 305-319, DOI: 10.1016/j.jeconom.2013.04.015, 2013.
[4] A. Kawa, "Supply chains of cross-border e-commerce," .
[5] L. Song, T. Lv, X. Chen, J. Gao, "Architecture of demand forecast for online retailers in China based on big data," .
[6] G. Iman, A. Ehsan, R. W. Gary, A. Y. William, "An overview of energy demand forecasting methods published in 2005–2015," Energy Systems, vol. 8 no. 2, pp. 411-447, 2016.
[7] R. J. Hyndman, A. B. Koehler, J. K. Ord, R. D. Snyder, Forecasting with Exponential Smoothing: The State Space Approach, 2008.
[8] G. E. P. Box, G. M. Jenkins, Time Series Analysis: Forecasting and Control, 2010.
[9] G. P. Zhang, "Time series forecasting using a hybrid ARIMA and neural network model," Neurocomputing, vol. 50, pp. 159-175, DOI: 10.1016/s0925-2312(01)00702-0, 2003.
[10] S. Ma, R. Fildes, T. Huang, "Demand forecasting with high dimensional data: the case of SKU retail sales forecasting with intra- and inter-category promotional information," European Journal of Operational Research, vol. 249 no. 1, pp. 245-257, DOI: 10.1016/j.ejor.2015.08.029, 2015.
[11] Ö. Gür Ali, S. SayIn, T. van Woensel, J. Fransoo, "SKU demand forecasting in the presence of promotions," Expert Systems with Applications, vol. 36 no. 10, pp. 12340-12348, DOI: 10.1016/j.eswa.2009.04.052, 2009.
[12] T. Huang, R. Fildes, D. Soopramanien, "The value of competitive information in forecasting FMCG retail product sales and the variable selection problem," European Journal of Operational Research, vol. 237 no. 2, pp. 738-748, DOI: 10.1016/j.ejor.2014.02.022, 2014.
[13] F. Cady, "Machine learning overview," The Data Science Handbook, 2017.
[14] N. K. Ahmed, A. F. Atiya, N. E. Gayar, H. El-Shishiny, "An empirical comparison of machine learning models for time series forecasting," Econometric Reviews, vol. 29 no. 5-6, pp. 594-621, DOI: 10.1080/07474938.2010.481556, 2010.
[15] A. P. Ansuj, M. E. Camargo, R. Radharamanan, D. G. Petry, "Sales forecasting using time series and neural networks," Computers & Industrial Engineering, vol. 31 no. 1-2, pp. 421-424, DOI: 10.1016/0360-8352(96)00166-0, 1996.
[16] I. Alon, M. Qi, R. J. Sadowski, "Forecasting aggregate retail sales: a comparison of artificial neural networks and traditional methods," Journal of Retailing and Consumer Services, vol. 8 no. 3, pp. 147-156, DOI: 10.1016/s0969-6989(00)00011-4, 2001.
[17] G. Di Pillo, V. Latorre, S. Lucidi, E. Procacci, "An application of support vector machines to sales forecasting under promotions," 4OR, vol. 14 no. 3, pp. 309-325, DOI: 10.1007/s10288-016-0316-0, 2016.
[18] L. Wang, H. Zou, J. Su, L. Li, S. Chaudhry, "An ARIMA-ANN hybrid model for time series forecasting," Systems Research and Behavioral Science, vol. 30 no. 3, pp. 244-259, DOI: 10.1002/sres.2179, 2013.
[19] S. Ji, H. Yu, Y. Guo, Z. Zhang, "Research on sales forecasting based on ARIMA and BP neural network combined model," .
[20] J. Y. Choi, B. Lee, "Combining LSTM network ensemble via adaptive weighting for improved time series forecasting," Mathematical Problems in Engineering, vol. 2018,DOI: 10.1155/2018/2470171, 2018.
[21] K. Zhao, C. Wang, "Sales forecast in e-commerce using the convolutional neural network," 2017. https://arxiv.org/abs/1708.07946
[22] K. Bandara, P. Shi, C. Bergmeir, H. Hewamalage, Q. Tran, B. Seaman, "Sales demand forecast in e-commerce using a long short-term memory neural network methodology," 2019. https://arxiv.org/abs/1901.04028
[23] X. Luo, C. Jiang, W. Wang, Y. Xu, J.-H. Wang, W. Zhao, "User behavior prediction in social networks using weighted extreme learning machine with distribution optimization," Future Generation Computer Systems, vol. 93, pp. 1023-1035, DOI: 10.1016/j.future.2018.04.085, 2018.
[24] L. Xiong, S. Jiankun, W. Long, "Short-term wind speed forecasting via stacked extreme learning machine with generalized correntropy," IEEE Transactions on Industrial Informatics, vol. 14 no. 11, pp. 4963-4971, DOI: 10.1109/tii.2018.2854549, 2018.
[25] J. L. Zhao, H. Zhu, S. Zheng, "What is the value of an online retailer sharing demand forecast information?," Soft Computing, vol. 22 no. 16, pp. 5419-5428, DOI: 10.1007/s00500-018-3091-3, 2018.
[26] G. Kulkarni, P. K. Kannan, W. Moe, "Using online search data to forecast new product sales," Decision Support Systems, vol. 52 no. 3, pp. 604-611, DOI: 10.1016/j.dss.2011.10.017, 2012.
[27] A. Roy, "A novel multivariate fuzzy time series based forecasting algorithm incorporating the effect of clustering on prediction," Soft Computing, vol. 20 no. 5, pp. 1991-2019, DOI: 10.1007/s00500-015-1619-3, 2016.
[28] R. J. Kuo, K. C. Xue, "A decision support system for sales forecasting through fuzzy neural networks with asymmetric fuzzy weights," Decision Support Systems, vol. 24 no. 2, pp. 105-126, DOI: 10.1016/s0167-9236(98)00067-0, 1998.
[29] P.-C. Chang, C.-H. Liu, C.-Y. Fan, "Data clustering and fuzzy neural network for sales forecasting: a case study in printed circuit board industry," Knowledge-Based Systems, vol. 22 no. 5, pp. 344-355, DOI: 10.1016/j.knosys.2009.02.005, 2009.
[30] C.-J. Lu, Y.-W. Wang, "Combining independent component analysis and growing hierarchical self-organizing maps with support vector regression in product demand forecasting," International Journal of Production Economics, vol. 128 no. 2, pp. 603-613, DOI: 10.1016/j.ijpe.2010.07.004, 2010.
[31] C.-J. Lu, L.-J. Kao, "A clustering-based sales forecasting scheme by using extreme learning machine and ensembling linkage methods with applications to computer server," Engineering Applications of Artificial Intelligence, vol. 55, pp. 231-238, DOI: 10.1016/j.engappai.2016.06.015, 2016.
[32] W. Dai, Y.-Y. Chuang, C.-J. Lu, "A clustering-based sales forecasting scheme using support vector regression for computer server," Procedia Manufacturing, vol. 2, pp. 82-86, DOI: 10.1016/j.promfg.2015.07.014, 2015.
[33] I. F. Chen, C. J. Lu, "Sales forecasting by combining clustering and machine-learning techniques for computer retailing," Neural Computing and Applications, vol. 28 no. 9, pp. 2633-2647, DOI: 10.1007/s00521-016-2215-x, 2016.
[34] T. Chiu, D. P. Fang, J. Chen, Y. Wang, C. Jeris, "A robust and scalable clustering algorithm for mixed type attributes in a large database environment," ,DOI: 10.1145/502512.502549, .
[35] T. Zhang, R. Ramakrishnan, M. Livny, "Birch: a new data clustering algorithm and its applications," Data Mining and Knowledge Discovery, vol. 1 no. 2, pp. 141-182, DOI: 10.1023/a:1009783824328, 1997.
[36] L. Li, R. Situ, J. Gao, Z. Yang, W. Liu, "A hybrid model combining convolutional neural network with XGBoost for predicting social media popularity," .
[37] J. Ke, H. Zheng, H. Yang, X. Chen, "Short-term forecasting of passenger demand under on-demand ride services: a spatio-temporal deep learning approach," Transportation Research Part C: Emerging Technologies, vol. 85, pp. 591-608, DOI: 10.1016/j.trc.2017.10.016, 2017.
[38] K. Shimada, "Customer value creation in the information explosion era," .
[39] H. A. Abdelhafez, "Big data analytics: trends and case studies," Encyclopedia of Business Analytics & Optimization, 2014.
[40] K. Kira, L. A. Rendell, "A practical approach to feature selection," Machine Learning Proceedings, vol. 48 no. 1, pp. 249-256, DOI: 10.1016/b978-1-55860-247-2.50037-1, 1992.
[41] T. M. Khoshgoftaar, K. Gao, L. A. Bullard, "A comparative study of filter-based and wrapper-based feature ranking techniques for software quality modeling," International Journal of Reliability, Quality and Safety Engineering, vol. 18 no. 4, pp. 341-364, DOI: 10.1142/s0218539311004287, 2011.
[42] M. A. Hall, L. A. Smith, "Feature selection for machine learning: comparing a correlation-based filter approach to the wrapper," Proceedings of the Twelfth International Florida Artificial Intelligence Research Society Conference. DBLP, .
[43] V. A. Huynh-Thu, Y. Saeys, L. Wehenkel, P. Geurts, "Statistical interpretation of machine learning-based feature importance scores for biomarker discovery," Bioinformatics, vol. 28 no. 13, pp. 1766-1774, DOI: 10.1093/bioinformatics/bts238, 2012.
[44] M. Sandri, P. Zuccolotto, "A bias correction algorithm for the Gini variable importance measure in classification trees," Journal of Computational and Graphical Statistics, vol. 17 no. 3, pp. 611-628, DOI: 10.1198/106186008x344522, 2008.
[45] J. Brownlee, "Feature importance and feature selection with xgboost in python," 2016. https://machinelearningmastery.com
[46] N. V. Chawla, S. Eschrich, L. O. Hall, "Creating ensembles of classifiers," .
[47] A. K. Jain, M. N. Murty, P. J. Flynn, "Data clustering: a review," ACM Computing Surveys, vol. 31 no. 3, pp. 264-323, DOI: 10.1145/331499.331504, 1999.
[48] A. Nagpal, A. Jatain, D. Gaur, "Review based on data clustering algorithms," Proceedings of the IEEE Conference on Information & Communication Technologies, .
[49] Y. Wang, X. Ma, Y. Lao, Y. Wang, "A fuzzy-based customer clustering approach with hierarchical structure for logistics network optimization," Expert Systems with Applications, vol. 41 no. 2, pp. 521-534, DOI: 10.1016/j.eswa.2013.07.078, 2014.
[50] B. Wang, Y. Miao, H. Zhao, J. Jin, Y. Chen, "A biclustering-based method for market segmentation using customer pain points," Engineering Applications of Artificial Intelligence, vol. 47, pp. 101-109, DOI: 10.1016/j.engappai.2015.06.005, 2015.
[51] M. Halkidi, Y. Batistakis, M. Vazirgiannis, "On clustering validation techniques," Journal of Intelligent Information Systems: Integrating Artificial, Intelligence and Database Technologies, vol. 17 no. 2-3, pp. 107-145, DOI: 10.1023/a:1012801612483, 2001.
[52] R. W. Sembiring, J. M. Zain, A. Embong, "A comparative agglomerative hierarchical clustering method to cluster implemented course," Journal of Computing, vol. 2 no. 12, 2010.
[53] M. Valipour, M. E. Banihabib, S. M. R. Behbahani, "Comparison of the ARIMA and the auto-regressive artificial neural network models in forecasting the monthly inflow of the dam reservoir," Journal of Hydrology, vol. 476,DOI: 10.1016/j.jhydrol.2012.11.017, 2013.
[54] E. Erdem, J. Shi, "Arma based approaches for forecasting the tuple of wind speed and direction," Applied Energy, vol. 88 no. 4, pp. 1405-1414, DOI: 10.1016/j.apenergy.2010.10.031, 2011.
[55] R. J. Hyndman, "Forecasting functions for time series and linear models," . 2019, http://mirror.costar.sfu.ca/mirror/CRAN/web/packages/forecast/index.html
[56] S. Aishwarya, "Build high-performance time series models using auto ARIMA in Python and R," 2018. https://www.analyticsvidhya.com/blog/2018/08/auto-arima-time-series-modeling-python-r/
[57] J. H. Friedman, "Greedy function approximation: a gradient boosting machine," The Annals of Statistics, vol. 29 no. 5, pp. 1189-1232, DOI: 10.1214/aos/1013203451, 2001.
[58] T. Chen, C. Guestrin, "Xgboost: a scalable tree boosting system," 2016. https://arxiv.org/abs/1603.02754
[59] A. Gómez-Ríos, J. Luengo, F. Herrera, "A study on the noise label influence in boosting algorithms: AdaBoost, Gbm, and XGBoost," Proceedings of the International Conference on Hybrid Artificial Intelligence Systems, .
[60] J. Wang, C. Lou, R. Yu, J. Gao, H. Di, "Research on hot micro-blog forecast based on XGBOOST and random forest," Proceedings of the 11th International Conference on Knowledge Science, Engineering and Management KSEM 2018, pp. 350-360, .
[61] C. Li, X. Zheng, Z. Yang, L. Kuang, "Predicting short-term electricity demand by combining the advantages of ARMA and XGBoost in fog computing environment," Wireless Communications and Mobile Computing, vol. 2018,DOI: 10.1155/2018/5018053, 2018.
[62] A. M. Jain, "Complete guide to parameter tuning in XGBoost with codes in Python," , 2016. https://www.analyticsvidhya.com/blog/2016/03/complete-guide-parameter-tuning-xgboost-with-codes-python/
Copyright © 2019 Shouwen Ji et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. http://creativecommons.org/licenses/by/4.0/
Abstract
Sales forecasting is even more vital for supply chain management in e-commerce, where a huge amount of transaction data is generated every minute. In order to enhance the logistics service experience of customers and optimize inventory management, e-commerce enterprises focus on improving the accuracy of sales prediction with machine learning algorithms. In this study, a C-A-XGBoost forecasting model is proposed that takes both the sales features of commodities and the tendency of the data series into account, based on the XGBoost model. A C-XGBoost model is first established to forecast for each of the clusters produced by the two-step clustering algorithm, incorporating sales features into the model as influencing factors of forecasting. Secondly, an A-XGBoost model is used to forecast the tendency, with the ARIMA model handling the linear part and the XGBoost model the nonlinear part. The final results are obtained by assigning weights to the forecasting results of the C-XGBoost and A-XGBoost models and summing them. By comparison with the ARIMA, XGBoost, C-XGBoost, and A-XGBoost models on data from the Jollychic cross-border e-commerce platform, the C-A-XGBoost is shown to outperform the other four models.
Details
Ji, Shouwen 1; Zhao, Wenpeng 2; Guo, Dong 3
1 School of Traffic and Transportation, Beijing Jiaotong University, Haidian District, Beijing 100044, China
2 Beijing Capital International Airport Co. Ltd., Beijing 100621, China
3 School of Mechanical-Electronic and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102600, China