Forest fires are a major disturbance in Canada; they are costly to suppress and put people and infrastructure at risk (Wang et al., 2017). There has been considerable effort in Canada to quantify wildfire risk, including developing models of ignition, spread, and impacts (Johnston et al., 2020). Weather plays a significant role in fire risk models. This role is formalized in the Canadian Forest Fire Weather Index (FWI) System (Stocks et al., 1989). It is an essential element of the Canadian Fire Prediction System, which is used to help fire managers decide what actions to take on any given day to prepare for a potential fire in their management area (Wotton, 2009). Six components comprise the Canadian FWI System (see Table 1). The first three are moisture codes: the Drought Code (DC; calculated from temperature, precipitation, and the previous day's value), the Duff Moisture Code (DMC; calculated from temperature, precipitation, relative humidity, and the previous day's value), and the Fine Fuel Moisture Code (FFMC; calculated from temperature, precipitation, relative humidity, wind speed, and the previous day's value) (Van Wagner, 1987; Wotton, 2009). The remaining three components are fire behavior indices calculated from the moisture codes: the Build-Up Index (BUI; calculated from the DMC and the DC), the Initial Spread Index (ISI; calculated from the FFMC and wind speed), and the Fire Weather Index (FWI; calculated from the ISI and BUI) (Van Wagner, 1987; Wotton, 2009).
Table 1. Overview of Components of the Canadian Fire Weather Index System and How They Relate to Weather Inputs

Component | Type | Inputs
Fine Fuel Moisture Code (FFMC) | Moisture code | Temperature, precipitation, relative humidity, wind speed, previous day's value
Duff Moisture Code (DMC) | Moisture code | Temperature, precipitation, relative humidity, previous day's value
Drought Code (DC) | Moisture code | Temperature, precipitation, previous day's value
Initial Spread Index (ISI) | Fire behavior index | FFMC, wind speed
Build-Up Index (BUI) | Fire behavior index | DMC, DC
Fire Weather Index (FWI) | Fire behavior index | ISI, BUI

Note. Information adapted from Wotton (2009).
Fire weather indices are also used by researchers for fire modeling (Jain et al., 2022; James et al., 2017; Walker et al., 2020), and their accuracy is particularly important given changing climate and fire regimes (Hessl, 2011). Fire forecasting and modeling require reliable fine-resolution, spatially explicit fire weather data at the daily level over long time periods (McElhinny et al., 2020). These historical daily fire weather data can be generated from the Environment and Climate Change Canada weather station network, which has stronger temporal coverage than data sources such as the Ontario FWI station network. We can use spatial interpolation models to create continuous, gridded estimates of the Canadian FWI System inputs (temperature, relative humidity, wind speed, and precipitation). FWI values can then be calculated for each grid cell using these weather surfaces. Despite considerable research into spatial interpolation models for weather variables, less attention has been paid to methods for evaluating them.
Spatial interpolation models perform differently under different conditions (Ly et al., 2013) and for different weather variables (Perry & Hollis, 2005). For example, methods differ in performance depending on topography (Mair & Fares, 2011), the spatial distribution and density of weather stations (Dirks et al., 1998), and the degree of interpolation versus extrapolation. A key challenge is identifying the most suitable interpolation method for a weather variable given the extent and topography of a study area, weather station distribution and density, and spatial autocorrelation between stations. Cross-validation methods are used to select the most suitable spatial interpolation model by producing an error estimate for an interpolated weather surface. Researchers can use these error estimates to select between spatial interpolation models. Leave-one-out cross-validation (LOOCV) is commonly used for producing error estimates for weather surfaces (Dirks et al., 1998; Luo et al., 2008; Mair & Fares, 2011) and is the most common method in FWI research (Cai et al., 2019; Flannigan & Wotton, 1989; Jain & Flannigan, 2017). LOOCV involves withholding each weather station in turn, interpolating from the remaining stations, and comparing the observed and predicted values at the withheld station. It has been criticized because it is sensitive to the spatial distribution and density of weather stations (Little et al., 2017).
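To make the procedure concrete, the following is a minimal sketch of LOOCV paired with a simple inverse distance weighting (IDW) interpolator. The function names and the bare-bones IDW implementation are our illustrative assumptions, not the authors' code:

```python
import numpy as np

def idw_predict(train_xy, train_vals, query_xy, beta=2.0):
    """Inverse distance weighting: estimate values at query points
    as distance-weighted averages of station observations."""
    d = np.linalg.norm(train_xy[:, None, :] - query_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against zero distance at a station location
    w = 1.0 / d**beta
    return (w * train_vals[:, None]).sum(axis=0) / w.sum(axis=0)

def loocv_mae(xy, vals, beta=2.0):
    """LOOCV: withhold each station in turn and predict it from the rest."""
    errs = []
    for i in range(len(vals)):
        mask = np.arange(len(vals)) != i
        pred = idw_predict(xy[mask], vals[mask], xy[i : i + 1], beta)
        errs.append(abs(pred[0] - vals[i]))
    return float(np.mean(errs))
```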
Other cross-validation approaches can account for weather station characteristics. Shuffle-split cross-validation randomly withholds a group of stations when testing model performance, rather than one at a time. Although this method can reduce bias, some bias will persist because the procedure still samples more stations from areas of higher station density (Hutchinson et al., 2009). Alternatively, a researcher can hand-select stations to withhold, avoiding ones near each other or at similar elevations (supervised stratification) (Hutchinson et al., 2009). This approach works well when the researcher knows the weather station network well and the network is stable over time; it may not be practical when the density and distribution of the network change over time. Blocked, stratified, and buffered cross-validation approaches can be used to reduce bias in error estimates in such networks (Roberts et al., 2017; Valavi et al., 2019).
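Continuing the sketch above (and reusing the hypothetical idw_predict), shuffle-split can be written as repeated random group withholding; test_frac and n_splits are assumed parameters, not values from the paper:

```python
import numpy as np

def shuffle_split_mae(xy, vals, test_frac=0.3, n_splits=10, beta=2.0, seed=0):
    """Shuffle-split CV: repeatedly withhold a random group of stations."""
    rng = np.random.default_rng(seed)
    n = len(vals)
    n_test = max(1, round(test_frac * n))
    errs = []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        pred = idw_predict(xy[train], vals[train], xy[test], beta)
        errs.append(np.mean(np.abs(pred - vals[test])))
    return float(np.mean(errs))
```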
Different cross-validation methods are suitable for different tasks. To accurately assess the error of a weather surface, a cross-validation method must account for the density and distribution of weather stations. Cross-validation methods for spatial interpolation model selection do not necessarily need to focus on reducing biases in error estimates arising from the weather station network; for this task, the difference between the errors of different spatial interpolation models (relative error) is more important. However, if these relative error estimates do not account for spatial bias, they only allow for the selection of the best model, not the evaluation of weather surface accuracy. Additional considerations such as computational efficiency can also be relevant. A significant gap in current research is whether different cross-validation methods identify the same spatial interpolation model as having the lowest error. If they do not, researchers must think carefully about which cross-validation method to use for spatial interpolation model selection to properly account for their weather station network characteristics. If they do, researchers can simply use the most computationally efficient method.
The objective of our study was to determine whether different cross-validation methods identify the same spatial interpolation model as having the lowest error. We compared six cross-validation methods across the boreal region of Ontario and Québec (Figure 1) for the period of 1956–2018. Our goal was to identify an optimal cross-validation method for use in an automatic selection procedure for spatial interpolation models for the generation of historical daily FWI maps.
Figure 1. Location of the study area (Agriculture & Agri-Food, 2013; Flannigan et al., 2005).
We obtained weather data from Environment and Climate Change Canada for the period of 1956–2018 for the provinces of Ontario and Québec (Environment and Climate Change Canada, 2020). We used all available stations for the provinces, including those outside the study area limits, to ensure accuracy at the edges of the region. We used daily weather stations for precipitation and hourly weather stations for temperature, relative humidity, and wind speed data. The spatial distribution of weather stations varies year-to-year (and in some cases day-to-day, when a station failed to collect data). There are generally more precipitation stations than hourly stations in early years and more hourly stations in later years (Figure 2). For this reason, we evaluated the cross-validation methods for spatial interpolation model selection over a time period of 63 years. Figure 2 shows the density and distribution of stations at different points in the study period.
Figure 2. Distribution and density of the weather stations at different points in the study period (1 July 1956; 1 July 1987; 1 July 2018) for (a) temperature, (b) relative humidity, (c) wind speed, and (d) precipitation.
We ensured data quality by identifying potentially erroneous values in the historical weather files provided by Environment and Climate Change Canada (see the get_data module in our custom software; Risk & James, 2022).
We compared five spatial interpolation models and six cross-validation methods (Table 2). Each combination was evaluated using the mean absolute error (MAE) generated by cross-validation at a single test date/time (July 1 at 13:00 DST) for each year in the study period. We selected this date/time because (a) it is clearly within the fire season and (b) the FWI System components are calculated using hourly temperature, wind speed, and relative humidity recorded at 13:00 DST. The exception is precipitation, for which the Canadian FWI System uses 24-hr totals.
Table 2 Overview of Spatial Interpolation Models and Cross-Validation Methods
MAE was calculated as

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - \hat{x}_i\right|$$

where $n$ is the number of stations, $x_i$ is the observed value, and $\hat{x}_i$ is the predicted value at station $i$. MAE values closer to zero indicate better performance (Luo et al., 2008). Here, the “best” or “optimal” model refers to the spatial interpolation model with the lowest MAE.
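The selection rule itself is simple. A sketch follows, with illustrative helper names of our own; the example input uses a few LOOCV temperature averages from Table 3:

```python
import numpy as np

def mae(observed, predicted):
    """Mean absolute error: the selection criterion used throughout."""
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    return float(np.mean(np.abs(observed - predicted)))

def select_best_model(cv_errors):
    """Pick the spatial interpolation model with the lowest cross-validated MAE.

    `cv_errors` maps model names to MAE values, e.g. one column of Table 3.
    """
    return min(cv_errors, key=cv_errors.get)

# Example with LOOCV temperature averages from Table 3:
best = select_best_model({"IDW b=2": 2.26, "TPSS": 2.08, "GPR": 1.88, "RF": 2.20})
# best == "GPR"
```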
We compared the following spatial interpolation models for the weather variables: Inverse distance weighting (IDW), inverse distance elevation weighting (IDEW), thin plate smoothing splines (TPSS), Gaussian process regression (GPR; equivalent to regression kriging), and random forests (RF). We evaluated these spatial interpolation models using six cross-validation methods: LOOCV, shuffle-split, stratified shuffle-split, spatial k-fold with two blocking types (2d blocks and 3d clusters), and a modified buffered leave-one-out (LOO) procedure. We summarize these methods in Table 2.
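Of the distance-based models, IDEW extends IDW by incorporating elevation. The paper does not spell out its IDEW formulation here; treating elevation as a scaled third coordinate is one common choice and an assumption of this sketch:

```python
import numpy as np

def idew_predict(train_xyz, train_vals, query_xyz, beta=2.0, elev_scale=1.0):
    """Inverse distance *elevation* weighting: distances computed in (x, y, scaled z).

    Treating elevation as a scaled third coordinate is an assumption of this
    sketch; the weighting of elevation relative to horizontal distance is
    controlled by elev_scale.
    """
    t = np.asarray(train_xyz, dtype=float).copy()
    q = np.asarray(query_xyz, dtype=float).copy()
    t[:, 2] *= elev_scale
    q[:, 2] *= elev_scale
    d = np.linalg.norm(t[:, None, :] - q[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero at station locations
    w = 1.0 / d**beta
    return (w * train_vals[:, None]).sum(axis=0) / w.sum(axis=0)
```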
Results

Spatial Interpolation Model Performance

GPR and TPSS performed well for temperature (Table 3 and Figure 3). They both produced low average MAE compared to the other methods for all cross-validation types. GPR also performed well for precipitation, and RF performed well across the board (Table 3, Figure 3). On the other hand, according to all cross-validation methods, IDW β = 1 and IDEW β = 1 never resulted in the lowest average error estimates for any of the weather variables. IDW β = 2 and RF performed strongly for relative humidity and wind speed. GPR was consistently selected by all cross-validation methods for precipitation. The relative performance of most spatial interpolation models remained stable over time. Figure 4 shows their performance over the study period.
Table 3. Average Mean Absolute Error (MAE) Values for 1 July of Each Year, 1956–2018, for All Cross-Validation Methods, With the Lowest Average MAE for Each Weather Variable Shown in Bold
Model | LOOCV | Modified buffered LOO | Shuffle-split | Stratified shuffle-split | Spatial k-fold (9 clusters) | Spatial k-fold (16 clusters) | Spatial k-fold (25 clusters) | Spatial k-fold (9 blocks) | Spatial k-fold (16 blocks) | Spatial k-fold (25 blocks)
Temperature (°C) | | | | | | | | | |
IDW β = 1 | 3.03 | 4.36 | 3.14 | 3.86 | 7.05 | 5.34 | 5.55 | 4.77 | 6.18 | 6.63 |
IDW β = 2 | 2.26 | 3.59 | 2.51 | 3.05 | 6.28 | 4.45 | 4.70 | 3.75 | 5.26 | 5.49 |
IDEW β = 1 | 3.16 | 4.36 | 3.26 | 4.05 | 7.13 | 5.44 | 5.71 | 5.02 | 6.25 | 6.78 |
IDEW β = 2 | 2.45 | 3.76 | 2.67 | 3.23 | 6.47 | 4.67 | 4.97 | 4.22 | 5.43 | 5.88 |
TPSS | 2.08 | 3.22 | 2.41 | 2.48 | 4.70 | 3.02 | 3.56 | 3.37 | 3.51 | 3.75 |
GPR | 1.88 | 2.68 | 2.22 | 2.54 | 5.04 | 2.80 | 3.43 | 2.84 | 3.51 | 3.73 |
RF | 2.20 | 3.14 | 2.52 | 2.80 | 4.82 | 3.47 | 3.80 | 3.22 | 4.01 | 4.10 |
Relative Humidity (%) | ||||||||||
IDW β = 1 | 11.47 | 14.90 | 11.79 | 12.95 | 19.32 | 14.78 | 17.82 | 16.69 | 17.17 | 19.79 |
IDW β = 2 | 9.54 | 13.40 | 10.25 | 11.08 | 18.15 | 13.13 | 15.80 | 14.13 | 16.15 | 18.48 |
IDEW β = 1 | 11.78 | 14.98 | 12.12 | 13.17 | 19.31 | 15.13 | 17.91 | 17.30 | 17.21 | 19.76 |
IDEW β = 2 | 10.01 | 13.54 | 10.75 | 11.52 | 18.23 | 13.69 | 16.18 | 15.34 | 16.17 | 18.53 |
TPSS | 9.90 | 14.16 | 11.37 | 11.21 | 17.69 | 12.82 | 13.87 | 13.47 | 15.26 | 14.13 |
GPR | 9.97 | 14.20 | 11.31 | 11.71 | 19.58 | 14.68 | 17.94 | 16.70 | 17.23 | 20.04 |
RF | 9.80 | 12.98 | 10.74 | 11.17 | 16.43 | 12.35 | 13.10 | 13.11 | 15.72 | 16.04 |
Wind Speed (km/hr) | ||||||||||
IDW β = 1 | 5.82 | 6.70 | 5.89 | 6.14 | 7.22 | 6.62 | 8.27 | 7.01 | 7.89 | 7.68 |
IDW β = 2 | 5.54 | 6.32 | 5.81 | 5.95 | 6.94 | 6.28 | 7.85 | 6.64 | 6.76 | 7.16 |
IDEW β = 1 | 5.90 | 6.68 | 6.04 | 6.24 | 7.24 | 6.73 | 8.35 | 7.09 | 7.69 | 7.69 |
IDEW β = 2 | 5.62 | 6.59 | 5.85 | 5.97 | 7.04 | 6.52 | 8.02 | 6.82 | 6.84 | 7.29 |
TPSS | 6.77 | 8.92 | 7.32 | 7.42 | 10.22 | 7.93 | 9.05 | 9.86 | 9.13 | 9.11 |
GPR | 5.90 | 6.84 | 6.14 | 6.30 | 7.36 | 7.48 | 8.35 | 7.14 | 7.49 | 7.45 |
RF | 5.60 | 6.43 | 5.92 | 5.97 | 6.81 | 6.52 | 7.56 | 6.87 | 7.25 | 6.77 |
Precipitation (mm) | ||||||||||
IDW β = 1 | 2.71 | 3.93 | 2.75 | 3.08 | 3.57 | 4.28 | 3.69 | 3.94 | 3.98 | 3.52 |
IDW β = 2 | 1.99 | 3.64 | 2.11 | 2.35 | 3.40 | 4.22 | 3.56 | 3.86 | 3.91 | 3.42 |
IDEW β = 1 | 2.82 | 3.89 | 2.85 | 2.85 | 3.60 | 4.27 | 3.71 | 3.97 | 3.97 | 3.50 |
IDEW β = 2 | 2.18 | 3.83 | 2.29 | 2.50 | 3.46 | 4.19 | 3.62 | 3.90 | 3.95 | 3.47 |
TPSS | 2.06 | 4.91 | 2.31 | 2.39 | 5.69 | 6.07 | 5.26 | 5.96 | 3.94 | 3.93 |
GPR | 1.86 | 2.95 | 1.91 | 2.00 | 2.73 | 3.38 | 2.69 | 3.03 | 2.92 | 2.65 |
RF | 2.00 | 3.27 | 2.18 | 2.26 | 3.27 | 4.17 | 3.46 | 4.04 | 3.72 | 3.11 |
Figure 3. Relative performance of different spatial interpolation models according to different cross-validation methods for (a) temperature, (b) relative humidity, (c) wind speed, and (d) precipitation.
Figure 4. Mean absolute error calculated using (a) leave-one-out cross-validation, (b) stratified shuffle-split, (c) shuffle-split, (d) spatial k-fold (16 clusters), and (e) modified buffered leave-one-out for the spatial interpolation models on July 1 13:00 DST for every year in the study period for temperature.
LOOCV resulted in the lowest error estimates for all weather variables. Shuffle-split produced similar but slightly higher error. Spatial k-fold generally resulted in the highest errors. The error did not always decrease with higher cluster/block numbers, as we might expect; for example, spatial k-fold with 9 clusters resulted in lower error for precipitation than the procedure using 16 clusters. Stratified shuffle-split and the modified buffered LOO procedure resulted in error estimates that were generally higher than LOOCV and shuffle-split but lower than spatial k-fold (Table 3, Figure 3). Overall, the difference between error estimates from different cross-validation methods was generally larger than the difference between error estimates for different spatial interpolation models.
Differences in Spatial Interpolation Model Selection by Cross-Validation Methods

The cross-validation methods consistently identified certain spatial interpolation models as having lower error. They generally identified GPR as the best method for temperature, although stratified shuffle-split and some spatial k-fold iterations identified TPSS (Table 3). LOOCV, shuffle-split, and stratified shuffle-split identified IDW β = 2 as the best performing method for relative humidity, whereas the modified buffered LOO procedure and most of the spatial k-fold iterations identified RF. Similarly, for wind speed, most cross-validation methods identified IDW β = 2 as the best performing method, with some spatial k-fold iterations identifying RF. All cross-validation methods identified GPR as the best performing method for precipitation.
Visual Comparison of Methods

Maps produced by the spatial interpolation models for temperature for 1 July 1956, 1987, and 2018 show that the models produce very different surfaces (Figures 5–7). Some methods (GPR and TPSS) result in smoother surfaces. IDW β = 1 and β = 2 displayed the characteristic “bulls-eye” pattern that sometimes results from IDW methods. There is banding on the maps produced by RF, which is an artifact of using latitude and longitude as predictors (Meyer et al., 2019). Unfortunately, excluding these predictors is not an option, since RF does not consider spatial autocorrelation if given no spatial information (Georganos et al., 2019). For example, if the model is only given elevation information, it cannot predict colder temperatures at higher latitudes, because it has only learned an association between elevation and temperature.
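A brief sketch of this trade-off, using synthetic station data (the feature choices mirror the discussion above; all names and values are our illustrative assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
# Synthetic stations: longitude, latitude, elevation (m), and temperature (°C).
lon, lat = rng.uniform(-95, -65, 200), rng.uniform(45, 55, 200)
elev = rng.uniform(0, 600, 200)
temp = 30 - 0.9 * (lat - 45) - 0.006 * elev + rng.normal(0, 0.5, 200)

# With lon/lat as predictors, RF can learn the north-south gradient (at the
# cost of axis-aligned "banding" in the interpolated surface).
rf_full = RandomForestRegressor(n_estimators=300, random_state=0)
rf_full.fit(np.column_stack([lon, lat, elev]), temp)

# With elevation only, RF cannot place colder temperatures at higher latitudes.
rf_elev = RandomForestRegressor(n_estimators=300, random_state=0)
rf_elev.fit(elev.reshape(-1, 1), temp)
```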
Figure 5. Comparison of surfaces produced by (a) inverse distance weighting (IDW) β = 1, (b) IDW β = 2, (c) inverse distance elevation weighting (IDEW) β = 1, (d) IDEW β = 2, (e) thin plate smoothing splines, (f) random forests, and (g) Gaussian process regression for temperature for 1 July 2018 13:00 DST and (h) the distribution of weather stations on that date and time.
Figure 6. Comparison of surfaces produced by (a) inverse distance weighting (IDW) β = 1, (b) IDW β = 2, (c) inverse distance elevation weighting (IDEW) β = 1, (d) IDEW β = 2, (e) thin plate smoothing splines, (f) random forests, and (g) Gaussian process regression for temperature for 1 July 1987 13:00 DST and (h) the distribution of weather stations on that date and time.
Figure 7. Comparison of surfaces produced by (a) inverse distance weighting (IDW) β = 1, (b) IDW β = 2, (c) inverse distance elevation weighting (IDEW) β = 1, (d) IDEW β = 2, (e) thin plate smoothing splines, (f) random forests, and (g) Gaussian process regression for temperature for 1 July 1956 13:00 DST and (h) the distribution of weather stations on that date and time.
LOOCV, shuffle-split, stratified shuffle-split, and the modified buffered LOO procedure generally identify the same optimal spatial interpolation model, despite generating different error estimates (Table 3). Automatic daily selection of spatial interpolation models for producing FWI maps is a computationally intensive task, requiring cross-validation over large spatiotemporal scales for the four weather variables. The computationally efficient shuffle-split or modified buffered LOO procedures can thus be used for automatic selection because they generally select the same spatial interpolation model as the other cross-validation methods.
Figure 8 illustrates the improvements that are possible using shuffle-split cross-validation to select between spatial interpolation models at the daily scale for calculating the FWI codes. We compared maps of the Drought Code (DC) computed using the automatic selection procedure (Figure 8a) and a single model (IDW β = 2) (Figure 8b). The two approaches generate different results, with the “bulls-eye” pattern clearly visible on the map generated with IDW β = 2 (Figure 8b). FWI calculations were based on methods from the cffdrs R package (Wang et al., 2017) that we translated to Python for better integration with the cross-validation methods. Fire season start and end dates were estimated using the criteria of three days of maximum daily temperatures over 12°C and under 5°C, respectively (Wotton & Flannigan, 1993). The start and end dates were estimated using daily weather stations with unbroken records during the spring and fall seasons. These dates were then interpolated across the study area. Overwintering DC was not calculated because our study area receives large amounts of overwinter precipitation (Wang et al., 2017).
Figure 8. Comparison of drought code (DC) maps produced by (a) a daily automatic selection procedure (thin plate smoothing splines, Gaussian process regression (GPR) for temperature and inverse distance weighting (IDW) β = 4, GPR for precipitation) for the spatial interpolation model used for each weather variable and (b) solely using IDW β = 2 for the Drought code component of the Canadian Forest Fire Weather Index System for 30 June 2018.
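The fire season criteria described above can be expressed compactly. The loop below is our sketch of the three-day temperature rule (thresholds from Wotton & Flannigan, 1993), not the authors' implementation:

```python
import numpy as np

def fire_season_bounds(tmax, start_thresh=12.0, end_thresh=5.0, run=3):
    """Estimate fire season start/end indices from one year of daily max temps.

    Start: first day ending a run of `run` consecutive days with tmax > 12 °C;
    end: first subsequent day ending a run of `run` days with tmax < 5 °C.
    """
    tmax = np.asarray(tmax, dtype=float)
    start = end = None
    for i in range(run - 1, len(tmax)):
        window = tmax[i - run + 1 : i + 1]
        if start is None and np.all(window > start_thresh):
            start = i
        elif start is not None and np.all(window < end_thresh):
            end = i
            break
    return start, end
```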
The objective of this study was to determine whether different cross-validation methods identified the same spatial interpolation model as having the lowest error. The final application of this research is an automatic selection procedure for spatial interpolation models for generating daily historical FWI maps. Overall, the results suggest three important findings. (a) Shuffle-split and a modified buffered LOO approach select the same optimal spatial interpolation model as the more computationally intensive LOOCV and stratified shuffle-split methods. (b) There are significant differences in the error estimates generated from different cross-validation approaches. This means that researchers must think critically about the structure of their weather station network and their objectives before selecting a cross-validation method to estimate the true error of a weather surface. (c) The overall optimal spatial interpolation model differed by weather variable (Table 3, Figure 3) and was consistent over time (Figure 4). This suggests that we do not have to compare all options for automatic selection of spatial interpolation models when producing historical FWI maps, but only those that are consistently identified as having low relative error (such as TPSS and GPR for temperature).
Spatial Interpolation Model Performance

We found that the spatial interpolation models selected by the six methods were dependent on the level of spatial autocorrelation between weather stations for different weather variables (Figure 9). Figure 9 shows the results of an analysis of spatial autocorrelation between weather stations using Moran's I for the four weather variables. It suggests that the spatial autocorrelation between stations is much stronger for temperature than for relative humidity, wind speed, and precipitation.
Figure 9. Result of Moran's I for each testing date between 2000 and 2019 for different distance classes (maximum distances between stations used in the analysis). Red indicates that positive spatial autocorrelation was detected (p < 0.05, Moran's I value greater than the calculated reference value of −1/(n−1), where n is the number of weather station pairs within the distance class); blue indicates that it was not detected.
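A sketch of this distance-class analysis using the PySAL stack cited in the Acknowledgments (the modern libpysal/esda module layout is assumed, as are the thresholds and permutation count; the significance rule mirrors the caption above):

```python
import numpy as np
from libpysal.weights import DistanceBand
from esda.moran import Moran

def moran_by_distance_class(coords, vals, thresholds):
    """Moran's I for a station network at several distance classes.

    `coords` is an (n, 2) array of projected station coordinates, in the same
    units as `thresholds`. Positive spatial autocorrelation is flagged when
    I exceeds the expected value E[I] with a permutation p-value < 0.05.
    """
    out = {}
    for t in thresholds:
        # Binary weights: station pairs within distance t are neighbors.
        w = DistanceBand.from_array(coords, threshold=t, binary=True,
                                    silence_warnings=True)
        mi = Moran(vals, w, permutations=999)
        out[t] = {"I": mi.I, "E[I]": mi.EI, "p": mi.p_sim,
                  "positive": (mi.I > mi.EI) and (mi.p_sim < 0.05)}
    return out
```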
TPSS performed strongly for temperature but poorly for wind speed and precipitation (Figure 3). TPSS likely performs well for temperature interpolation because it produces a smooth surface (Price et al., 2000). The smooth surface models a high level of spatial autocorrelation across the study area, which can be more accurate for variables like temperature that vary gradually over space. However, other weather variables, such as precipitation and wind speed, vary over shorter distances (Luo et al., 2008; Perry & Hollis, 2005) (Figure 9). Unlike TPSS, GPR performed well for both temperature and precipitation. It likely produced low error because it models the zone of influence around a station by fitting the length-scale parameter of the covariance function (Rasmussen & Williams, 2006). TPSS, although also based on Gaussian processes (Duvenaud, 2014), does not model the level of spatial autocorrelation in the weather network based on known data and is thus less flexible than GPR and less suited to modeling diverse weather variables. Additionally, interpolation using thin plate splines can be unpredictable and produce extreme values outside of the area covered by the weather station network (Fornberg et al., 2002). We also found that RF, a machine learning method, performed better for extrapolation (as measured by spatial k-fold).
Although the spatial distribution and density of weather stations changed over time (Figure 2), we found that some spatial interpolation models consistently produced lower error (Figure 4). The consistency in optimal spatial interpolation models is likely related to characteristics of the study area that do not change over time, such as topography and dominant air masses. This consistency has implications for assuming temporal stationarity in the level of modeled spatial autocorrelation in the weather station network. For example, instead of fitting a covariance function for GPR for each individual day, we can increase computational speed by fitting it on a day with good spatial coverage and then applying it over the entire study period.
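The reuse idea can be expressed with scikit-learn's Gaussian process tools (the library cited in the Acknowledgments); the synthetic data, kernel choice, and parameter values below are our illustrative assumptions:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# Synthetic stand-ins for station coordinates (km) and temperatures (°C).
coords_ref = rng.uniform(0, 1000, size=(80, 2))   # a day with good coverage
temps_ref = 25 - 0.01 * coords_ref[:, 1] + rng.normal(0, 0.5, 80)
coords_day = rng.uniform(0, 1000, size=(40, 2))   # a sparser day
temps_day = 25 - 0.01 * coords_day[:, 1] + rng.normal(0, 0.5, 40)

# Fit the covariance function (length-scale) once on the well-covered day.
kernel = 1.0 * RBF(length_scale=100.0) + WhiteKernel(noise_level=0.25)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(coords_ref, temps_ref)

# Reuse the optimized hyperparameters on other days: optimizer=None skips refitting.
daily = GaussianProcessRegressor(kernel=gpr.kernel_, optimizer=None, normalize_y=True)
daily.fit(coords_day, temps_day)
grid = np.column_stack([g.ravel() for g in np.meshgrid(np.linspace(0, 1000, 50),
                                                       np.linspace(0, 1000, 50))])
surface = daily.predict(grid).reshape(50, 50)   # gridded weather surface
```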
Differences Among Cross-Validation Methods

LOOCV and shuffle-split produced the lowest error estimates, stratified shuffle-split and the modified buffered LOO approach produced intermediate error estimates, and spatial k-fold generally produced the highest error estimates. LOOCV might underestimate error because it does not adequately account for spatial dependence in the weather network. Specifically, it does not correct for patterns in error relating to denser station coverage in parts of the network. Consequently, areas with more stations will have more accurate error estimates and will be overrepresented in the final error calculation, biasing it lower. This problem is particularly acute in our study area, where there are more stations in the south than the north (Figure 2). Past research showed that cross-validation methods that ignore spatial dependence underestimate model error (Roberts et al., 2017). Shuffle-split also does not account for spatial dependence.
Stratified shuffle-split and the modified buffered LOO procedure generated error estimates that were higher than LOOCV and shuffle-split. Both procedures deliberately oversample or weight remote stations compared to clustered ones. For stratified shuffle-split, sampling one station from each cluster means that the procedure rarely involves a high degree of extrapolation because it ensures that weather stations are always present across the study area. Similarly, the modified buffered LOO procedure ensures that training stations (those not withheld) are not extremely close to the testing data (the withheld station) (Valavi et al., 2019). However, the buffer size used in this study (200 km) was small enough that it did not induce a high degree of extrapolation.
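A sketch of a buffered LOO error estimate follows, under stated assumptions: coordinates in kilometers, the hypothetical idw_predict from earlier, and a simple exclusion rule (the authors' "modified" variant may differ in detail):

```python
import numpy as np

def buffered_loo_mae(xy, vals, buffer_km=200.0, beta=2.0):
    """Buffered leave-one-out: when a station is withheld for testing,
    training stations inside the buffer around it are also excluded.

    `xy` must be in km for buffer_km to be meaningful (assumption); the
    200 km default mirrors the buffer size used in the text.
    """
    errs = []
    for i in range(len(vals)):
        d = np.linalg.norm(xy - xy[i], axis=1)
        train = d > buffer_km          # drops the test station and its neighbors
        if not train.any():
            continue                   # no training data left; skip this station
        pred = idw_predict(xy[train], vals[train], xy[i : i + 1], beta)
        errs.append(abs(pred[0] - vals[i]))
    return float(np.mean(errs))
```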
The error estimates generated by spatial k-fold were generally the highest. This method places more emphasis on extrapolation error (Roberts et al., 2017). The error estimates varied by block and cluster size, likely reflecting the weather station network and study area characteristics. The lower block number (9) did not always result in the highest error, likely due to the resulting grouping of stations and how this grouping relates to the underlying spatial autocorrelation of the station network (Roberts et al., 2017). The size of the clusters or blocks should reflect the underlying error structure so that spatial autocorrelation is maximized within blocks and minimized between them (Roberts et al., 2017). For example, we should see remote stations with higher error in one block and groups of stations with low error in another. However, this objective does not necessarily translate to a greater number of blocks; as the number of blocks increases, the error estimate approaches the LOOCV estimate. Spatial blocking by overlaying a grid on a map of stations is not as robust as spatial clustering for estimating the underlying spatial autocorrelation structure of the station network (Roberts et al., 2017), because a grid may arbitrarily assign nearby, highly spatially autocorrelated stations to different folds.
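For contrast, a clustered spatial k-fold can be sketched by grouping stations with k-means and holding out one cluster at a time (again reusing the hypothetical idw_predict; the default cluster count follows Table 3):

```python
import numpy as np
from sklearn.cluster import KMeans

def spatial_kfold_mae(xy, vals, n_clusters=16, beta=2.0, seed=0):
    """Spatial k-fold: cluster stations by location, then withhold one
    cluster at a time, so testing emphasizes extrapolation."""
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(xy)
    errs = []
    for k in range(n_clusters):
        test, train = labels == k, labels != k
        pred = idw_predict(xy[train], vals[train], xy[test], beta)
        errs.extend(np.abs(pred - vals[test]))
    return float(np.mean(errs))
```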
Differences in Spatial Interpolation Model Selection by Cross-Validation Methods

LOOCV, shuffle-split, stratified shuffle-split, and the modified buffered LOO procedure generally agree on the best method although they produce different error estimates. However, spatial k-fold identifies machine learning methods (GPR and RF) as performing strongly. RF can learn patterns in the data set (for example, “temperature is lower as elevation rises”), which it can then use to estimate values at new locations. GPR also performs well for extrapolation (particularly for precipitation), although without automatic covariance function selection it may be a less accurate extrapolator than RF (Duvenaud, 2014). IDEW also performed better than IDW for extrapolation, suggesting that elevation is important for extrapolation purposes.
The difference in the results of the cross-validation methods is related to the extent to which they assess extrapolation versus interpolation. Spatial k-fold favors methods adept at extrapolation (RF and GPR). LOOCV and shuffle-split place more emphasis on interpolation by placing a higher weight on the error in the internal part of the weather station network. Thus, these cross-validation types favor methods such as TPSS and IDW. Stratified shuffle-split and the modified buffered LOO procedure place a heavier weight on extrapolation error compared to LOOCV and shuffle-split but less than spatial k-fold. However, they both generally select the same methods as LOOCV and shuffle-split.
For relative humidity, the model identified as having the lowest error by the modified buffered LOO procedure (RF) differed from that identified by LOOCV, shuffle-split, and stratified shuffle-split (IDW β = 2). This might be related to the choice of buffer size (200 km). It is possible that the modified buffered LOO procedure placed more emphasis on extrapolation error for relative humidity at this buffer size. For temperature, the model identified as having the lowest error by stratified shuffle-split (TPSS) differed from LOOCV, shuffle-split, and the modified buffered LOO procedure (GPR). This is likely related to the fact that GPR and TPSS produce very similar maps for temperature (Figures 5–7) and thus very close error estimates.
Study Limitations and Remaining Challenges

A major limitation of this study is that we assumed that cross-validation is a robust method for determining which spatial interpolation models are best for creating the input surfaces for the Canadian FWI System. Whether the surfaces selected by cross-validation actually result in FWI maps that correspond more strongly to actual wildfire activity remains untested. Next steps for this research topic involve testing and refining the automatic selection procedure and relating the FWI maps it produces to historical fire activity. This study did not evaluate the accuracy of error estimates, only how they varied between cross-validation types. Future work could examine the accuracy of the error estimated by different cross-validation methods for weather variable interpolation by calculating target (ideal) error values that the cross-validation methods should approximate (Roberts et al., 2017). A final limitation of this study is that it did not test whether the cross-validation methods selected the same spatial interpolation model when ensemble or multi-output spatial interpolation models (such as multi-output Gaussian processes/co-kriging) were included. Given that these are growing in popularity for interpolation, this should be tested in the future.
Conclusion

In summary, this study showed that cross-validation methods that focus on interpolation (LOOCV, shuffle-split) or a mixture of interpolation and extrapolation (stratified shuffle-split, modified buffered LOO) tend to identify the same spatial interpolation model as having the lowest error. Cross-validation methods placing a greater emphasis on extrapolation (spatial k-fold) tended to select a consistent but different method depending on the weather variable. These results suggest that computationally efficient methods (shuffle-split, modified buffered LOO) can be used for automatic daily selection of spatial interpolation models for generating improved historical FWI maps. Accuracy and efficiency in creating FWI maps are important for fire modeling and forecasting. These data are necessary for modeling the complex spatial and temporal relationships among climate change, fire weather, and fire activity, and for forecasting future impacts and risk.
Acknowledgments

The authors declare no financial conflicts of interest (nor any related to their affiliations). Funding for this work was provided by a Queen Elizabeth II Graduate Scholarship in Science and Technology from the Province of Ontario and the University of Toronto awarded to C. Risk, and by the Canada Wildfire/NSERC Strategic Network to PMAJ. Plots and analysis were completed with Matplotlib (Hunter, 2007), scikit-learn (Buitinck et al., 2013; Pedregosa et al., 2011), SciPy (Virtanen et al., 2020), SciPy's limited-memory BFGS-B algorithm (Byrd et al., 1995; Morales & Nocedal, 2011; Zhu et al., 1997), NumPy (Harris et al., 2020), PySAL (Rey & Anselin, 2007), and Pandas (McKinney, 2010). Components of custom software (Risk & James, 2022) for calculating the Drought Code (DC) (Figure 8) were adapted from the cffdrs R package (Wang et al., 2017). The shape files for Ontario and Québec were obtained from Natural Earth. We would like to thank Dr. Mike Wotton for advice on spatial interpolation methods and resources to confirm the accuracy of the DC calculations, Dr. Marie-Josée Fortin, for whose course parts of this manuscript were produced, and the two reviewers of this manuscript for their extensive and incredibly helpful comments.
Data Availability Statement

The data supporting the conclusions and the Python code for implementing the analysis are available online here:
© 2022. This work is published under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abstract
The Canadian Forest Fire Weather Index (FWI) System requires spatially continuous, gridded weather data for temperature, relative humidity, wind speed, and precipitation. Reliable estimates of the Canadian FWI System components are needed to ensure the safety of communities, resources, and ecosystems. The quality of the interpolated input weather variables is typically evaluated using error estimates from cross-validation. These error estimates are used for selecting between spatial interpolation methods for generating the continuous weather surfaces. Leave-one-out cross-validation (LOOCV) is the most commonly used method, but it is biased in spatially clustered weather station networks. Accurate error estimation is important for selecting the optimal interpolation method and evaluating how well an interpolated surface represents true patterns in a weather variable. Other cross-validation methods may better account for bias relating to clustered weather station networks. We present a comparison of cross-validation methods for evaluating spatial interpolation models of weather variables for generating the inputs to the Canadian FWI System, with the objective of determining whether they identify the same spatial interpolation model as having the lowest error. We found that LOOCV, shuffle-split, stratified shuffle-split, and a modified buffered leave-one-out procedure generally identified the same spatial interpolation models as having the lowest error. Spatial k-fold favored spatial interpolation models with extrapolation ability. Our findings indicate that the most computationally efficient cross-validation approach can be used for automatically selecting spatial interpolation models for weather surface generation, which will improve the quality of historical daily FWI maps.