1. Introduction
Forests are essential parts of terrestrial ecosystems, rich in both biomass and biodiversity [1,2]. In northeastern China, a major timber-producing region, pine caterpillar infestations threaten forest resources [3,4,5], ecosystem stability, and economic sustainability [6,7,8]. Accurately assessing the risk of pine caterpillar infestation and understanding how habitat factors drive this risk are crucial.
Understanding the habitat factors influencing pine caterpillar infestation mainly relies on sample plot surveys and meteorological station data, which have limitations [9,10,11]. Meteorological stations are unevenly distributed, with few in remote or mountainous areas, creating gaps in capturing local climate variations in complex terrains [12,13,14]. Furthermore, meteorological data focus on single factors like temperature and precipitation, often overlooking key non-meteorological factors such as soil, topography, and vegetation [15]. Some stations also lack long-term continuous data, which limits their usability [16]. To address these issues, recent research combines remote sensing and ground survey data with multi-scale modeling approaches [17]. Advances in remote sensing technology now provide large-scale, long-term habitat data with spatiotemporal continuity and diverse dimensions, enabling the monitoring of critical factors such as host area, growth status, field environment, and agricultural landscape patterns. Multi-source remote sensing, including optical, microwave, and thermal infrared data, improves habitat suitability assessment [18,19,20]. However, some studies still rely solely on outdated datasets, such as ‘WorldClim’ (1970–2000) [21,22], or short-term local data, limiting the ability to identify long-term trends and analyze the driving effects of habitat factors on infestation risk [23,24]. Additionally, incomplete or unevenly distributed data on climate, vegetation, and infestations reduce the general applicability of the findings. This highlights the need for a new approach to improve pest infestation risk assessments.
When assessing the level of infestation risk, multiple factors (e.g., climate, topography, soil, and land use) are often weighted for a comprehensive analysis [25,26,27]. However, these methods have limitations, including inaccurate weight assignments, overlooked factor interactions, lack of scientific basis for importance rankings, and insufficient uncertainty analysis, which reduce applicability across regions and scenarios. Pest habitat studies often use models like species distribution [28,29,30], statistical [31,32,33], and niche models [34,35,36]. However, these methods often lack consistency, limiting their integration into a unified, dynamic framework. MaxEnt is widely used to analyze species distributions, but its results typically focus on areas near the sample data [37,38,39], with limited comparability due to differences in model assumptions and parameters. Many models are region- or condition-specific, lacking scalability for larger areas [40]. Machine learning models often predict infestation risk zones but rarely analyze geographical or spatiotemporal distributions in depth [41,42,43], focusing mostly on short-term infestations and neglecting long-term trends [44]. Pine caterpillar infestations are influenced by both intrinsic (e.g., eggs) and extrinsic (e.g., climate, topography, soil, forest composition, human activities) factors, creating a complex habitat system that single models struggle to analyze [11,45,46]. Current studies often overlook multi-factor interactions [10], and real-world habitat factors display non-uniform distributions, further limiting analysis accuracy [47,48,49].
Existing studies mainly examine single-factor impacts on pine caterpillar infestation risk. Sparse meteorological data and insufficient data fusion have hindered the development of a systematic risk assessment framework, and the absence of region-specific factor selection complicates accurate risk assessment and the analysis of long-term habitat influences. To address these gaps, this study incorporates snow cover and soil factors specific to Northeast China and proposes an APCIRD framework combined with an MLP-random forest (MRF) model to understand and quantify the driving role of habitat factors in pine caterpillar infestation risk. A multi-layer perceptron (MLP) can capture complex relationships in non-normally distributed data, and random forests are widely used for model building [50,51,52]. The study assesses the temporal and spatial variations in infestation risk from 2019 to 2024 by combining MRF with SHAP, fitting functions, frequency analysis, and GeoDetector. It identifies key areas and habitat factors requiring attention, models the functional relationship between key habitat factors and infestation risk, quantifies the optimal threshold ranges of the factors that contribute to high infestation risk, and shows that interactions between factors explain infestation risk better than individual factors.
The main contributions of this study are as follows: (1) Taking actual regional conditions into account, including snow accumulation and soil, this study integrates the APCIRD framework with MRF to produce county-scale infestation risk maps for 2019 to 2024 and to highlight key areas of concern. (2) It assesses the impact of snowpack and soil on infestation risk, quantifies the relationship between key habitat factors and risk, and analyzes how risk varies with these factors. (3) It analyzes the characteristics of habitat factors contributing to high risk and quantifies their optimal threshold ranges. (4) It demonstrates that interactions between factors provide stronger explanatory power for infestation risk than individual factors.
2. Study Area and Data
2.1. Study Area
We focused on northeastern China, covering Heilongjiang, Jilin, Liaoning, and eastern Inner Mongolia (Figure 1). The geographical extent spans from 38°40′N to 53°30′N and from 115°05′E to 135°00′E, covering approximately 1.24 million km². The terrain is mainly composed of plains, hills, and mountains. Winters are marked by low temperatures, high wind speeds, and prolonged snow cover, affecting forest water cycles and energy flow. Key pine species include Pinus sylvestris L., Pinus koraiensis Siebold & Zucc., Pinus tabuliformis Carrière., and Pinus pumila (Pall.) Regel (
2.2. Data Source
2.2.1. Distribution Points of Pine Caterpillar Infestation and Non-Infestation
The distribution point data of pine caterpillar infestation mainly come from the Global Biodiversity Information Network Database (
Serious pine caterpillar outbreaks occurred in Changbai Mountain in 2018 and 2019. Two field surveys of these outbreaks were conducted in 2019 and 2020, recording 34 and 65 coordinate points, respectively (Figure 2). After comprehensive analysis, 528 pine caterpillar infestation distribution points covering 1981 to 2018 were identified. Because some of the 528 points recorded infestations in more than one year, they were organized year by year, giving a total of 1903 records. These records were then screened, and those with missing information were removed to obtain the usable infestation distribution point data.
To avoid spatial overlap between samples, a minimum buffer distance of 10 km from known infestation points was set, and non-infestation points were drawn at distances of 10–50 km from those points. The non-infestation points were selected randomly, stratified by time and space, so that they were representative, evenly distributed, and matched the infestation points as closely as possible in spatial distribution and sample size. In this study, such points do not represent “absolutely no insect pests” but serve as an approximate substitute for areas of “unrecorded infestation”. Because historical records are incomplete, non-infestation points may include areas that were not monitored or were missed in the records. Nevertheless, in the absence of continuous population density monitoring data, restricting the spatial range and screening by time layer can still support the spatial modeling of infestation risk to a certain extent.
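The distance rule described above can be illustrated with a minimal sketch. It is not the sampling code used in this study: the function names, the bounding box, and the coordinates are hypothetical, candidate points are drawn uniformly in latitude and longitude for simplicity, and the time stratification is omitted.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def sample_non_infestation(infested, n_points, bbox, min_km=10.0, max_km=50.0,
                           seed=0, max_tries=100_000):
    """Draw random points whose nearest infestation point is min_km..max_km away.

    Assumption: candidates are uniform in (lat, lon), which is not exactly
    uniform in area; a real workflow would also stratify by year.
    """
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    picked = []
    for _ in range(max_tries):
        if len(picked) == n_points:
            break
        lat = rng.uniform(lat_min, lat_max)
        lon = rng.uniform(lon_min, lon_max)
        nearest = min(haversine_km(lat, lon, la, lo) for la, lo in infested)
        if min_km <= nearest <= max_km:
            picked.append((lat, lon))
    return picked

# Hypothetical infestation coordinates and bounding box (lat_min, lat_max, lon_min, lon_max).
infested = [(42.0, 128.0), (42.3, 127.6)]
bbox = (41.5, 42.8, 127.0, 128.6)
pseudo_absences = sample_non_infestation(infested, n_points=20, bbox=bbox)
```

Using the distance to the nearest infestation point enforces the 10 km buffer against every known record, while the 50 km upper bound keeps non-infestation points in broadly comparable habitat.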
2.2.2. Habitat Factor Data
The habitat factor data (Table 1; “5-year average” in Table 1 refers to the average over the 5 years before the year of pine caterpillar infestation) primarily come from the ERA5-Land reanalysis dataset (
Because the studied pine forests are mainly distributed in mountainous areas, the annual average habitat factor dataset for 1979 to 2018 was reconstructed at a spatial resolution of 0.01° × 0.01° by bilinear interpolation using the CDO software (version 1.9.9rc1). DEM, slope, and population density data were resampled to the same 0.01° resolution. Infestation data were matched with habitat factors using time and location information, with a 5 km sampling window centered on each infestation coordinate. Given the typical 3–8-year infestation cycle, the five-year average of each habitat factor before the infestation year was used as an independent variable for model training; the same method was applied to non-infestation points. After data cleaning, the dataset comprised 3900 samples.
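The five-year averaging step can be sketched as follows. This is an illustration under stated assumptions, not the processing code of this study: the function name and the temperature values are hypothetical, and the infestation year itself is assumed to be excluded from the window.

```python
def preceding_mean(annual_values, event_year, window=5):
    """Mean of the `window` years immediately before `event_year` (event year excluded)."""
    years = range(event_year - window, event_year)
    missing = [y for y in years if y not in annual_values]
    if missing:
        raise KeyError(f"no annual value for years {missing}")
    return sum(annual_values[y] for y in years) / window

# Hypothetical annual mean 2 m temperature (K) at one sampling location.
t2m_annual = {2013: 277.0, 2014: 278.0, 2015: 279.0,
              2016: 278.0, 2017: 278.0, 2018: 280.0}

# Feature for a record dated 2018: the mean over 2013-2017.
feature = preceding_mean(t2m_annual, event_year=2018)
```

Raising an error when a year is missing, rather than averaging fewer years, keeps every sample on the same five-year basis.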
2.3. Shapiro–Wilk Test
The normality of habitat factor data is crucial for selecting the appropriate model construction method. The Shapiro–Wilk test reports a test statistic and a p-value, from which a normality decision is made. The statistic measures the degree of deviation from a normal distribution, with values ranging from 0 to 1. Values closer to 1 indicate a better fit to a normal distribution [53]. The p-value is used to assess whether the null hypothesis (the data follow a normal distribution) can be rejected. When the p-value > 0.05, the null hypothesis cannot be rejected and the data may follow a normal distribution; when the p-value ≤ 0.05, the data deviate significantly from normality.
The results of the Shapiro–Wilk test (Figure 3) show that the p-values for all factors are less than 0.05, indicating that none of the factors passes the normality test at the 0.05 significance level. Some factors (e.g., u10 and v10) have statistics close to 1, all greater than 0.98, and the statistics for t2m_04 and stl1_04 are 0.9854 and 0.9800, respectively, indicating that these factors are close to a normal distribution. Although their p-values are less than 0.05, the deviation is relatively small, but potential bias should be noted. In contrast, the statistics for sro_04, sd_04, and tvh are far below 1 (0.4529, 0.3390, and 0.5076, respectively), suggesting that these factors deviate markedly from normality, possibly due to skewness or kurtosis. Other factors, such as d2m, e, and stl1, have statistics around 0.90, indicating a clear deviation from normality.
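The decision rule above can be reproduced with `scipy.stats.shapiro`, as in the following sketch. The two synthetic samples and their names are assumptions made for illustration; they are not habitat factor data from this study.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic stand-ins for two habitat factors (illustrative only).
factors = {
    "roughly_normal": rng.normal(loc=278.0, scale=3.0, size=500),
    "right_skewed": rng.exponential(scale=1.0, size=500),
}

ALPHA = 0.05
results = {}
for name, values in factors.items():
    # shapiro returns the W statistic and the p-value of the test.
    statistic, p_value = stats.shapiro(values)
    results[name] = {
        "W": float(statistic),
        "p": float(p_value),
        # Normality is retained only when the null hypothesis is not rejected.
        "normal": bool(p_value > ALPHA),
    }
```

A strongly skewed sample yields a W statistic well below 1 and a very small p-value, which mirrors the pattern reported for sro_04, sd_04, and tvh.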
2.4. Risk Assessment and Habitat Factor Analysis Methods
2.4.1. Conceptual Framework
This study proposes the idea of combining the APCIRD framework with machine learning (Figure 4), which integrates pine caterpillar infestation distribution data, habitat factors, SHAP, frequency analysis, fitting functions, and GeoDetector to assess the risk and spatiotemporal changes of the infestation from 2019 to 2024, while also comprehensively analyzing the driving effects of habitat factors on infestation risk.
2.4.2. The Risk Assessment Model
As described in Section 2.3, the Shapiro–Wilk test on the 3900 samples showed that all factors failed the normality test, a non-normality that may better reflect real conditions. The MLP feature extractor, a neural network for capturing nonlinear features, maps high-dimensional data to a lower-dimensional space through fully connected layers. Therefore, we used an MLP to extract factor features and passed them to a random forest to improve the accuracy of infestation risk assessment, forming the MRF model.
For comparison, we applied three machine learning models: random forest (RF) [54], Extreme Gradient Boosting (XGBoost) [55], and Light Gradient Boosting Machine (LightGBM) [56]. SHapley Additive exPlanations (SHAP), frequency analysis, and GeoDetector were then used to analyze factor contributions and interactions in infestation risk. RF, an ensemble method, enhances accuracy by aggregating predictions from multiple decision trees, handles high-dimensional data well, and provides feature importance insights. XGBoost, a high-performance gradient boosting variant, iteratively trains decision trees for better efficiency but requires complex parameter tuning. LightGBM, developed by Microsoft, is a fast, memory-efficient gradient boosting framework suited to large-scale and complex data. The dataset had a roughly 1:1 positive-to-negative sample ratio and was split into training and test sets at a 4:1 ratio. Model performance was assessed using precision, recall, F1-score, and validation accuracy.
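One possible way to chain an MLP feature extractor with a random forest is sketched below using scikit-learn. This is an assumption-laden illustration rather than the MRF implementation of this study: the layer sizes, the ReLU activation, the synthetic data from `make_classification`, the 4:1 split, and the helper `hidden_features` are all hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for habitat factors (X) and infestation labels (y).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)  # 4:1 split

scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

# Step 1: train an MLP whose last hidden layer (8 units) is narrower than the input (20).
mlp = MLPClassifier(hidden_layer_sizes=(32, 8), activation="relu",
                    max_iter=1000, random_state=0)
mlp.fit(X_train_s, y_train)

def hidden_features(model, data):
    """Forward pass through the hidden layers only (ReLU), skipping the output layer."""
    activations = data
    for weights, bias in zip(model.coefs_[:-1], model.intercepts_[:-1]):
        activations = np.maximum(activations @ weights + bias, 0.0)
    return activations

Z_train = hidden_features(mlp, X_train_s)
Z_test = hidden_features(mlp, X_test_s)

# Step 2: train a random forest on the extracted features.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(Z_train, y_train)

risk_probability = forest.predict_proba(Z_test)[:, 1]  # probability of the positive class
accuracy = forest.score(Z_test, y_test)
```

In this arrangement the MLP supplies a learned nonlinear, lower-dimensional representation, and the random forest provides the final probability estimate used as the risk score.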
2.4.3. SHAP and Fitting Function
SHAP enhances the interpretability of machine learning models. By combining SHAP values with normalized variables, it helps to fit the functional relationship between key habitat factors and the risk of pine caterpillar infestation, illustrating how the risk changes as these factors vary [57]. This approach improves our understanding of the driving relationship between pine caterpillar infestation risk and habitat factors.
2.4.4. Frequency Analysis
Frequency analysis, which is commonly used in ecology and geography to examine the distribution of biological or environmental phenomena, was introduced to analyze the characteristics of habitat factors contributing to high infestation risk [58]. This method clarifies the optimal threshold range of each habitat factor associated with high infestation risk levels and is calculated using the following formula:
$$F = \frac{S_i}{S} \qquad (1)$$
In Equation (1), $S_i$ represents the number of pixels in factor $i$ with Higher and Highest risk levels, $S$ represents the total number of pixels with Higher and Highest risk levels, and $F \in [0,1]$.
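Equation (1) can be illustrated with a short sketch. It assumes that the frequency is evaluated over value intervals of a single habitat factor, so that the count for each interval plays the role of the numerator; the function name, the elevation values, the risk labels, and the interval edges are hypothetical.

```python
def frequency_by_bin(values, risk_levels, bin_edges, high_levels=("Higher", "Highest")):
    """Return F = S_i / S for each interval [edge_k, edge_{k+1}) of one habitat factor."""
    # Keep only the pixels classified as Higher or Highest risk.
    high_values = [v for v, r in zip(values, risk_levels) if r in high_levels]
    total = len(high_values)  # S in Equation (1)
    if total == 0:
        raise ValueError("no Higher/Highest pixels; F is undefined")
    freqs = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        s_i = sum(1 for v in high_values if lo <= v < hi)  # S_i for this interval
        freqs.append(s_i / total)
    return freqs

# Hypothetical elevation values and risk levels for eight pixels.
dem = [150, 420, 610, 790, 950, 1300, 300, 700]
risk = ["Higher", "Highest", "Higher", "Medium", "Lower", "Lowest", "Highest", "Higher"]
edges = [0, 400, 800, 1200, 1600]

F = frequency_by_bin(dem, risk, edges)
```

Intervals with the largest F values indicate the factor range in which high-risk pixels concentrate, which is how an optimal threshold range can be read off.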
2.4.5. GeoDetector
GeoDetector is a tool used to analyze spatial distribution and its influencing factors, commonly applied in fields like geography and environmental science [59]. It reveals driving effects by detecting spatial heterogeneity and interactions between factors, without assuming homogeneity of variance or normality. This study employed GeoDetector to analyze the relationship between pine caterpillar infestation risk levels and habitat factors. The analysis included (1) evaluating the explanatory power of individual factors on infestation risk and (2) exploring the explanatory power of habitat factor interactions on infestation risk. Each factor was discretized into five levels using the natural breaks method, and GeoDetector was used to calculate each factor’s explanatory power (q-value) and significance (p-value).
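The q-value of the factor detector is commonly defined as one minus the ratio of the within-stratum variance to the total variance. The sketch below computes it under that definition; it assumes the factor has already been discretized into strata, it does not compute the p-value, and the function name and sample values are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance

def geodetector_q(strata, risk):
    """q = 1 - sum_h(N_h * var_h) / (N * var), using population variances."""
    groups = defaultdict(list)
    for stratum, value in zip(strata, risk):
        groups[stratum].append(value)
    total_var = pvariance(risk)
    if total_var == 0:
        raise ValueError("risk has zero variance; q is undefined")
    # Sum of within-stratum variances weighted by stratum size.
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within / (len(risk) * total_var)

# Strata that separate the risk values perfectly give q = 1.
q_perfect = geodetector_q([1, 1, 1, 2, 2, 2], [0.1, 0.1, 0.1, 0.9, 0.9, 0.9])

# Strata unrelated to the risk values give q = 0.
q_none = geodetector_q([1, 2, 1, 2], [0.2, 0.2, 0.8, 0.8])
```

A q-value near 1 means that the discretized factor accounts for almost all of the spatial variance in risk, while a value near 0 means it accounts for almost none.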
3. Results and Analysis
3.1. Comparison of Accuracy of Different Models
This study estimated pine caterpillar infestation risk probabilities using machine learning models, defining probabilities above 0.5 as “Occur” and below 0.5 as “No Occur”. The models differed clearly in performance (Table 2). MRF performed best. For the “No Occur” class, it achieved a precision of 0.97, a recall of 0.99, and an F1-score of 0.98, indicating that few non-infested samples were misclassified as infested. For the “Occur” class, it had a precision of 0.97, a recall of 0.90, and an F1-score of 0.95, with a relatively low false negative rate. Its validation accuracy (val_accuracy) was 0.9748. RF also performed well, with a precision of 0.95, a recall of 0.99, and an F1-score of 0.97 for the “No Occur” class, and a precision of 0.96, a recall of 0.86, and an F1-score of 0.91 for the “Occur” class, giving a val_accuracy of 0.9505. XGBoost performed slightly below RF, with an F1-score of 0.96 for the “No Occur” class (precision: 0.93, recall: 0.98) and, for the “Occur” class, a precision of 0.95, a lower recall of 0.83, and an F1-score of 0.89, resulting in a val_accuracy of 0.9385. LightGBM fell between RF and XGBoost: for the “No Occur” class it had a precision of 0.94, a recall of 0.98, and an F1-score of 0.96, and for the “Occur” class a precision of 0.95, a recall of 0.85, and an F1-score of 0.90, with a val_accuracy of 0.9417. Overall, MRF had the best performance, followed by RF.
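The thresholding rule and the per-class metrics can be sketched as follows. The probabilities and labels are hypothetical, and assigning a probability of exactly 0.5 to “No Occur” is an assumption, since only values above and below 0.5 are defined in the text.

```python
def classify(probabilities, threshold=0.5):
    """Label a probability as 'Occur' when it exceeds the threshold (ties go to 'No Occur')."""
    return ["Occur" if p > threshold else "No Occur" for p in probabilities]

def class_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1-score for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical predicted probabilities and reference labels for eight samples.
probabilities = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1, 0.7, 0.3]
truth = ["Occur", "Occur", "Occur", "No Occur", "No Occur", "No Occur", "Occur", "No Occur"]

predicted = classify(probabilities)
occur_metrics = class_metrics(truth, predicted, positive="Occur")
accuracy = sum(t == p for t, p in zip(truth, predicted)) / len(truth)
```

Reporting the metrics separately for each class, as in Table 2, shows whether errors are concentrated in missed infestations (low recall for “Occur”) or in false alarms (low precision for “Occur”).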
3.2. Spatiotemporal Changes of Infestation Risk Levels of Pine Caterpillar
Model validation results show that the MRF method is the most effective for assessing pine caterpillar infestation risk. We applied this model to evaluate infestation risk in the study area for 2019 and classified the risk levels (Table 3 and Figure 5). Comparison with the 2019 warning data from the Pest Control Center of the State Forestry and Grassland Administration of China showed that the high-risk infestation areas aligned with the official warnings (
From 2019 to 2024, the pine caterpillar infestation risk levels changed markedly (Figure 5). In general, low-risk areas (Lowest and Lower) accounted for a large share, while the share of high-risk areas (Higher and Highest) declined overall, reflecting a gradual reduction in infestation risk. In 2019, the low-risk areas accounted for 39.09% (Lowest) and 37.48% (Lower), the medium-risk areas for 12.83%, and the high-risk areas for 7.46% (Higher) and 3.14% (Highest). By 2020, the low-risk areas had increased further, with Lowest at 40.49% and Lower rising sharply to 51.34%, while the high-risk areas decreased significantly, with Highest falling to 1.37%. In 2021, low-risk areas dominated, with Lowest and Lower accounting for 47.93% and 48.90%, respectively, and high-risk areas almost disappeared (Higher 0.01%, Highest 0%). In 2022, the proportion of Lowest-risk areas increased further to 59.37%, and the proportion of Highest-risk areas was almost zero. In 2023 and 2024, the proportions of the low-risk classes fluctuated; in 2024, the Lower class rose sharply to 74.74% and the Highest class disappeared completely. Overall, the risk of pine caterpillar infestation decreased substantially over the period, with low-risk areas expanding and high-risk areas shrinking.
Figure 6 shows that the number of pixels in each risk level fluctuated considerably between 2019 and 2024. The low-risk classes (Lowest and Lower) occupied the most pixels in most years: the Lowest class reached 8655 pixels in 2022 and 3517 pixels in 2024, while the Lower class reached 10,896 pixels in 2024, indicating a spatial expansion of low-risk areas. The Medium class decreased from 1870 pixels in 2019 to 168 pixels in 2024, with particularly sharp drops in 2021 and 2022, showing a gradual weakening of infestation risk. The high-risk classes (Higher and Highest) showed a clear downward trend: in 2019 they covered 1087 and 458 pixels, respectively, whereas by 2024 they had almost disappeared, with only 126 Higher pixels remaining. Overall, the share of low-risk areas increased and the share of high-risk areas decreased between 2019 and 2024, reflecting a general downward trend in the risk of pine caterpillar infestation.
After verifying the accuracy of the model, we assessed the risk of pine caterpillar infestation from 2019 to 2024 in detail and mapped the spatial distribution of risk levels (Figure 7). The maps show how the risk changed over time and how it was distributed in space. In 2019, high-risk areas were mainly distributed in the southwest and northeast of Liaoning Province, the east and southeast of Jilin Province, and the south, east, and northeast of Heilongjiang Province. In Liaoning Province, the risk was relatively concentrated, especially in Lingyuan, Jianping, Yuanbaoshan, Ningcheng, and other places in the southwest, while in the northeast it mainly covered Tonghua, Xinbin, Qingyuan, and other areas. In Jilin Province, Hunchun, Dunhua, Longjing, Antu, Helong, and Changbai Mountain in the east, as well as Liuhe, Huinan, Panshi, Jingyu, and other places in the southeast, were high-risk areas. In Heilongjiang Province, Acheng and Binxian in the south, Yangming, Aimin, Muling, Linkou, Jitong, and other areas in the east, and Huanan, Huachuan, Tangyuan, Youyi, Baoshan, and other places in the northeast had a high risk. By 2020, the high-risk areas had shifted and were mainly concentrated in the southwest of Liaoning Province, in Lingyuan, Chaoyang, Beipiao, Jianping, Jianchang, and other places. High-risk areas also appeared in the Oroqen area of the Inner Mongolia Autonomous Region. Compared with 2019, the risk areas in Jilin Province decreased.
In 2023, the higher-risk areas were still concentrated in the southwest of Liaoning Province, including Lingyuan, Chaoyang, Beipiao, Jianping, Jianchang, and other places, but they had moved northward and extended to Fuxin, Changtu, and other places. The high-risk areas in Jilin Province were mainly distributed on the northern slope of Tianchi Lake in Changbai Mountain, indicating that the ecological environment in this area still has an important influence on the growth and spread of pine caterpillars. In 2021, 2022, and 2024, the infestation risk was generally at the Medium level or below, the risk area gradually decreased, and the distribution was more dispersed. This pattern suggests that the infestation risk has been brought under control, especially in 2024, when the overall risk level was low. Taken together, the spatial distributions for the different years show that the infestation risk contracted spatially over time; the decline was most pronounced in the years when prevention and control measures were gradually strengthened.
3.3. Frequency Analysis of Infestation Risk Levels of Pine Caterpillar
To quantify the optimal threshold range of each habitat factor that leads to Higher and Highest risk levels of pine caterpillar infestation, frequency analysis was applied to the model assessment results. Since Higher and Highest infestation risk levels were absent or negligible in 2021, 2022, and 2024, only the risk assessment data for 2019, 2020, and 2023 were considered (Figures 8–10). The frequency analysis revealed general characteristics of areas with higher pine caterpillar infestation risks, including low to medium altitude, medium to high net surface solar radiation, moderate to high temperatures, gentle slopes (<30°), low to medium evaporation, low snow depth, medium snow temperature, low to moderate soil moisture, moderate to high soil temperature, low to moderate rainfall, low to moderate wind speed, low to moderate leaf area index, high vegetation type, low to moderate vegetation cover, low population density, and low surface runoff. Altitude affects pine caterpillars through temperature, humidity, and vegetation. At higher altitudes, cooler temperatures and shorter growing seasons limit their life cycle and reproduction. Lower altitudes, with warmer climates and longer seasons, support caterpillar growth, aided by abundant pine vegetation [60]. Slope also influences distribution; steeper slopes typically have lower soil moisture, which limits pine growth and food sources, while wet, low-slope areas favor caterpillar growth and reproduction [61]. Soil moisture is critical for plant water supply, with wet conditions promoting pine growth and providing more food. Soil temperature affects root growth and plant resistance. Extreme temperatures, however, can negatively impact the caterpillar life cycle [62]. The interaction between soil temperature and moisture influences activity and habitat selection, with moist, warm springs providing ideal conditions for growth and reproduction.
The quantified optimal threshold ranges of each habitat factor for Higher and Highest levels of pine caterpillar infestation risk are as follows: cvh: optimal threshold is <0.7, with risk increasing in the range of 0.2–0.6; d2m: optimal range is 268–274, especially 271–274, where infestation risk is higher; DEM: optimal range is <800, with higher occurrence probability in this range; e: optimal range is −0.0017 to −0.0009, with strong reactions in the range of −0.0017 to −0.0013; lai_hv: optimal range is 1.01–3.50, with risk increasing from 1.88 to 3.20; slope: optimal range is <26°, with significant impact below 17.32°; ssr: optimal range is >11.10 M (million), with significant impact from 11.09 M to 15.19 M; stl1: optimal range is 276.73–286.92, with a notable response above 280, consistent with stl1_04; swvl1: optimal range is 0.2–0.3, with notable dependence, consistent with swvl1_04; t2m: strong response above 278, consistent with t2m_04; tp: optimal range is <0.0023, with high sensitivity, consistent with tp_04; tsn: optimal range is 266.73–275; tvh: optimal range is 13.9–19; u10: optimal range is 0.6–1.7; and v10: optimal range is −0.76 to 0.30. Quantifying the optimal threshold ranges between habitat factors and high pine caterpillar infestation risk contributes to a better understanding of the driving effects of these factors.
3.4. Identification of Key Habitat Factors
To identify the key habitat factors influencing the risk of pine caterpillar infestation, feature importance was combined with single-factor explanatory power analysis. Figure 11 presents the importance ranking of each habitat factor, with higher values indicating a greater impact on the model. The top 10 most important factors are stl1, swvl1, sd, ssr, t2m, tsn, d2m, stl1_04, swvl1_04, and u10. Figure 12 and Table 4 display the explanatory power of each factor on infestation risk, with higher q-values indicating stronger explanatory power.
The factors marked in red in Table 4 represent the top 12 factors based on their explanatory power rankings each year, having a strong influence on infestation risk. By analyzing both the ranking of feature importance and the frequency of the top twelve factors each year, eight factors—stl1, swvl1, ssr, sd, t2m, tsn, d2m, and lai_hv—emerged as key habitat factors for infestation risk. The snow factors (sd, tsn) and soil factors (stl1, swvl1) are particularly important. This underscores the significance of considering the area’s long snow cover duration and the effect of snow on soil factors, highlighting the need to include both snow and soil factors in the analysis. This approach provides a fresh perspective compared to previous studies, which primarily focused on topography, temperature, and precipitation.
3.5. SHAP and Fitting Function Analysis
Figure 13 and Figure 14 illustrate the driving effect of habitat factors on the risk of pine caterpillar infestation, based on the mean absolute SHAP value. A longer bar on the horizontal axis indicates a greater driving effect of the factor. Key habitat factors such as ssr, sd, swvl1, stl1, t2m, d2m, tsn, and lai_hv show higher SHAP values, consistent with the factors identified through feature importance and explanatory power analysis. Some medium- and low-importance factors exhibit higher SHAP values than key drivers. For ssr, tsn, d2m, t2m, stl1, and lai_hv, SHAP values are positively correlated with factor values: low factor values are associated with SHAP values below 0 and high factor values with SHAP values above 0. In contrast, swvl1 shows a negative correlation. Factors like cvh, slope, and t2m_04 have narrower SHAP value distributions, indicating minimal impact on infestation risk.
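The ranking by mean absolute SHAP value can be sketched as below. The SHAP matrix is hypothetical and is written out by hand; in practice it would come from a SHAP explainer applied to the trained model, and the function name is an assumption.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean of |SHAP| over all samples, largest first."""
    n_samples = len(shap_rows)
    means = {
        name: sum(abs(row[j]) for row in shap_rows) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)

# Hypothetical SHAP values: one row per sample, one column per habitat factor.
features = ["ssr", "sd", "slope"]
shap_rows = [
    [0.25, -0.125, 0.0],
    [-0.5, 0.25, 0.125],
    [0.75, -0.125, -0.125],
    [-0.5, 0.0, 0.0],
]

ranking = rank_by_mean_abs_shap(shap_rows, features)
```

Taking the absolute value before averaging prevents positive and negative contributions from cancelling, so a factor that pushes risk up in some pixels and down in others still ranks as influential.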
Figure 13 and Figure 14 illustrate the overall influence of habitat factors on pine caterpillar infestation risk but do not provide a detailed analysis of their specific relationships. To address this, key habitat factors were normalized, and SHAP value scatter plots were generated (Figure 15). The driving relationships of ssr, lai_hv, d2m, stl1, swvl1, sd, and tsn were modeled using quartic polynomials. For ssr, the trend increased gently at first, then sharply around 0.4. The lai_hv showed steady growth, transitioning from rapid to gradual around 0.4. The d2m exhibited a complex pattern of decrease, increase, and then another decrease, with trend shifts around 0.1 and 0.6. A similar pattern was observed for sd, with changes around 0.15. The tsn increased steadily, shifting from gradual to rapid growth around 0.3. The stl1 showed a steady rise, accelerating around 0.4. The swvl1 decreased, with a trend shift at approximately 0.7. The relationship for t2m was modeled using a cubic polynomial, showing an initial rapid increase, a slower rise, and another sharp increase, with trend changes at about 0.2 and 0.7.
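The normalization and polynomial fitting can be sketched with NumPy. The sketch assumes min-max normalization to [0, 1] and an ordinary least-squares fit via `numpy.polyfit`; the raw values, the generating curve, and the function names are hypothetical and serve only to check that a quartic is recovered.

```python
import numpy as np

def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        raise ValueError("a constant factor cannot be min-max normalized")
    return (values - values.min()) / span

def fit_polynomial(x_norm, shap_values, degree=4):
    """Least-squares polynomial fit of SHAP values against a normalized factor."""
    coeffs = np.polyfit(x_norm, shap_values, deg=degree)
    return np.poly1d(coeffs)

# Hypothetical raw factor values and SHAP values generated from a known quartic.
raw = np.linspace(5.0e6, 1.5e7, 50)
x = min_max(raw)
true_curve = np.poly1d([2.0, -1.0, 0.5, 0.1, -0.2])
shap_vals = true_curve(x)

fitted = fit_polynomial(x, shap_vals, degree=4)
```

Passing `degree=3` gives the cubic form used for t2m; the fitted `poly1d` object can then be evaluated on a grid to locate where the trend changes.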
3.6. Interaction Detector Results
Figure 16 demonstrates that interactions between habitat factors have greater explanatory power for the risk of pine caterpillar infestation than individual factors alone, underscoring the complexity of the driving effects. Light blue indicates double-factor enhancement, while orange represents nonlinear enhancement. The p-values for the single-factor explanatory power of tvh, sro_04, slope, and popu_densi are close to 1, suggesting limited reliability in explaining infestation risk, both individually and through interactions; therefore, only the interactions of the remaining factors are discussed. Nonlinear enhancement is primarily observed between sd and tp, as well as between tp and DEM.
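The interaction types are usually assigned by comparing the q-value of the combined factors with the two single-factor q-values. The sketch below follows that common convention; the function name, the tolerance, the handling of boundary cases, and the example q-values are assumptions.

```python
def interaction_type(q1, q2, q12, tol=1e-9):
    """Classify an interaction from single-factor q-values and the combined q-value."""
    low, high, total = min(q1, q2), max(q1, q2), q1 + q2
    if abs(q12 - total) <= tol:
        return "independent"
    if q12 > total:
        return "nonlinear enhancement"
    if q12 > high:
        return "double-factor enhancement"
    if q12 > low:
        # Boundary case q12 == high is placed here by choice.
        return "univariate nonlinear weakening"
    return "nonlinear weakening"

# Hypothetical q-values for two factors and their interaction.
example = interaction_type(0.2, 0.3, 0.6)
```

Both enhancement types imply that the pair of factors explains more of the risk pattern than either factor alone; nonlinear enhancement additionally exceeds the sum of the two single-factor q-values.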
4. Discussion
Previous studies have primarily focused on the impact of individual factors on the risk of pine caterpillar infestations, particularly climate variables (such as temperature, precipitation, and drought) and stand structure [63,64]. Moreover, much of the research has concentrated on short-term infestation factors, with a lack of long-term studies on habitat variables and insufficient integration of diverse data sources [65,66]. Previous studies suggest that pure forests, particularly coniferous forests, are more vulnerable to pine caterpillar infestations than mixed forests [67,68]. In addition, site conditions such as topography and soil characteristics also influence the risk of infestation [69,70,71]. These studies highlight the importance of considering habitat factors comprehensively in assessing the risk of pine caterpillar infestation. However, there are few studies that further explore the driving effects or relationships.
Instead of relying on traditional methods such as species distribution and niche modeling, this study introduced the APCIRD framework and MRF to assess the risk of pine caterpillar infestation. Key habitat factors identified include ssr, sd, swvl1, stl1, t2m, d2m, tsn, and lai_hv, with particular emphasis on snow (sd, tsn) and soil factors (stl1, swvl1), underscoring their significance in this study’s design. The driving relationships of these factors were modeled using quartic and cubic polynomials, revealing complex, nonlinear interactions. For example, ssr (importance value 0.0942) exhibits a rapid increase in risk beyond a certain threshold, while lai_hv (importance value 0.0334) has a strong impact until a certain limit, after which its effect diminishes. sd (importance value 0.0814) shows a U-shaped relationship with risk, with moderate levels increasing it. The swvl1 (importance value 0.0587) has a negative impact at lower levels but exacerbates risk beyond a certain threshold. This study also identifies optimal threshold ranges for habitat factors contributing to high infestation risks and demonstrates that factor interactions offer stronger explanatory power for infestation risk than individual factors. These findings align with previous research on the spatiotemporal driving effects of habitat factors on pine caterpillar infestation risk [16,72].
Unlike previous studies, this research indicates that, in this study area, climate, soil, snow, and vegetation are the primary factors influencing the risk of pine caterpillar infestation, while topography and human factors play a lesser role. Frequency analysis results show that the area is characterized by low to medium altitude and gentle slopes (slope < 30°). High temperatures at lower altitudes favor pine caterpillar development, while low temperatures at higher altitudes hinder it. Moderate to high surface net solar radiation increases the risk of infestation. Overall, climate, snow, and soil factors have significant driving effects on infestation risk.
Dynamic feedback is essential to understanding insect disturbances, particularly in the context of climate change [73,74]. Global climate change is increasing the likelihood of range expansions for certain insects, thereby amplifying the risk of infestations. Climate change may also weaken the limiting effects of some habitat factors on insect distribution in specific regions [75,76]. For example, an interaction between soil moisture (swvl1) and soil temperature (stl1) was observed: moist soils warm more slowly than dry soils, suggesting that under moist conditions, swvl1 and stl1 may be negatively correlated. During the day, increased soil moisture suppresses rapid temperature rises, while at night, it slows down cooling. Consequently, habitats with high solar radiation and low soil moisture are associated with a higher risk of pine caterpillar infestations.
This study argues that the risk of pine caterpillar infestation results from the complex interaction of multiple habitat factors and is closely tied to the specific conditions of the study area. It highlights the importance of snow and soil factors, which have rarely been addressed in previous research on infestation risk, and this offers a new direction for future studies. However, factors such as slope aspect, forest structure, and landscape pattern were not included, which may affect the accuracy of the risk assessment and the driving effect analysis. The data resolution may also limit the ability to capture important details, so higher-resolution data should be used in future research. The APCIRD framework, combined with MRF, serves as an effective tool for assessing future large-scale insect infestation risks and their driving factors.
5. Conclusions
To reflect actual conditions in the study area, this study incorporated snow and soil factors and combined the APCIRD risk assessment framework with MRF to evaluate the risk of pine caterpillar infestation and its spatiotemporal variation from 2019 to 2024. It identified key areas and habitat factors for infestation risk, modeled the functional relationships between these factors and infestation risk, analyzed the characteristics and optimal threshold ranges of high-risk factors, and found that factor interactions provide stronger explanatory power for infestation risk than individual factors. (1) From 2019 to 2024, areas with high pine caterpillar infestation risk and the highest risk levels gradually decreased, with risk levels changing from high to low and the spatial distribution changing from concentrated to scattered. Eastern Heilongjiang and Southwest Liaoning remain key areas of focus. (2) Snow cover and soil factors play a key role in pine caterpillar infestation. ssr, sd, swvl1, stl1, t2m, d2m, tsn, and lai_hv are the key habitat factors that significantly affect infestation risk. (3) Key habitat factors exhibit quartic and cubic polynomial relationships with infestation risk, with t2m following a cubic polynomial function and swvl1 showing a negative correlation, indicating nonlinear driving effects. Forestry management and protection should therefore consider the specific relationships between habitat factors and pine caterpillar infestation risk when developing policies. (4) The characteristics and threshold ranges of factors triggering high infestation risks are mainly at low to medium levels.
Areas with high pine caterpillar infestation risk generally exhibit the following characteristics: low to moderate altitude (<800 m), moderate to high surface net solar radiation, moderate to high temperature, gentle slopes (<30°), low to moderate evaporation, low snow depth (<0.02 m of water equivalent), moderate snow temperature (266.73–275 K), low to moderate soil moisture (0.2–0.3 m³/m³), moderate to high soil temperature (276.73–286.92 K), low to moderate rainfall, low to moderate wind speed, low to moderate leaf area index, high vegetation type, low to moderate vegetation cover, low population density, and low surface runoff. The integration of the APCIRD framework and MRF effectively assesses infestation risk and analyzes the driving role of habitat factors.
J.Z.: conceptualization, methodology, investigation, visualization, formal analysis, original draft. M.W. (Mingchang Wang): funding acquisition, review and editing, project administration. D.C.: data curation, investigation. L.W.: data curation. X.J.: review and editing. Q.D.: review and editing. F.W.: review and editing. M.W. (Minshui Wang): resources. All authors have read and agreed to the published version of the manuscript.
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Figure 1 Study area location, pine forest distribution, and historical pine caterpillar infestation areas.
Figure 2 Distribution of field survey sites in 2019 and 2020.
Figure 3 Analysis of Shapiro–Wilk normality test.
Figure 4 Framework for pine caterpillar infestation risk assessment and habitat factor analysis.
Figure 5 Changes in pine caterpillar infestation risk levels from 2019 to 2024.
Figure 6 Pine caterpillar infestation risk levels pixels from 2019 to 2024.
Figure 7 Spatial distribution of pine caterpillar infestation risk from 2019 to 2024.
Figure 8 Frequency analysis results of 2019 (the colored and white bars represent areas with higher and highest risk, respectively).
Figure 9 Frequency analysis results of 2020 (the colored and white bars represent areas with higher and highest risk, respectively).
Figure 10 Frequency analysis results of 2023 (the colored and white bars represent areas with higher and highest risk, respectively).
Figure 11 Importance ranking of habitat factors.
Figure 12 Explanatory power of single factors on infestation risk from 2019 to 2024.
Figure 13 SHAP interpretability analysis.
Figure 14 Beeswarm plot of SHAP values.
Figure 15 Driving relationships between key habitat factors and infestation risk.
Figure 16 Interaction detector analysis results.
Habitat factor data.
Type | Factor description | Abbreviation (unit) |
---|---|---|
Climate | 5-year average surface net solar radiation | ssr (J/m²) |
Climate | 5-year average 2 m temperature | t2m (K) |
Climate | 5-year average 2 m dewpoint temperature | d2m (K) |
Climate | 5-year average 10 m u-component of wind | u10 (m/s) |
Climate | 5-year average 10 m v-component of wind | v10 (m/s) |
Climate | 5-year average evaporation | e (m of weq) |
Climate | 5-year average total precipitation | tp (m) |
Climate | average total precipitation in April | tp_04 (m) |
Climate | average 2 m temperature in April | t2m_04 (K) |
Climate | average surface runoff in April | sro_04 (m) |
Vegetation | 5-year average high vegetation coverage | cvh (0–1) |
Vegetation | 5-year average leaf area index of high vegetation | lai_hv (m²/m²) |
Vegetation | 5-year average high vegetation types | tvh |
Soil | 5-year average soil moisture at 0–7 cm depth | swvl1 (m³/m³) |
Soil | 5-year average temperature at 0–7 cm depth | stl1 (K) |
Soil | monthly mean soil moisture in April of the year of occurrence | swvl1_04 (m³/m³) |
Soil | monthly mean temperature in April of the year of occurrence | stl1_04 (K) |
Snow | 5-year average temperature of snow layer | tsn (K) |
Snow | 5-year average snow depth | sd (m of weq) |
Snow | average snow depth in April | sd_04 (m of weq) |
Topography | digital elevation model | DEM (m) |
Topography | slope | slope |
Human | 5-year average population density | popu_densi |
Note: “m of weq” stands for “meters of water equivalent”.
Accuracy of pine caterpillar infestation risk assessment models using different methods.
Method | Class | Precision | Recall | F1-Score | Val_Accuracy |
---|---|---|---|---|---|
MRF | No Occur | 0.97 | 0.99 | 0.98 | 0.9748 |
MRF | Occur | 0.97 | 0.90 | 0.95 | 0.9748 |
RF | No Occur | 0.95 | 0.99 | 0.97 | 0.9505 |
RF | Occur | 0.96 | 0.86 | 0.91 | 0.9505 |
XGBoost | No Occur | 0.93 | 0.98 | 0.96 | 0.9385 |
XGBoost | Occur | 0.95 | 0.83 | 0.89 | 0.9385 |
LGBM | No Occur | 0.94 | 0.98 | 0.96 | 0.9417 |
LGBM | Occur | 0.95 | 0.85 | 0.90 | 0.9417 |
Pine caterpillar infestation risk levels and their corresponding ranges.
Risk level | Lowest | Lower | Medium | Higher | Highest |
---|---|---|---|---|---|
Probability value | 0~0.25 | 0.25~0.50 | 0.50~0.70 | 0.70~0.85 | 0.85~1.00 |
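The mapping from predicted probability to risk level in this table can be sketched as a simple lookup. This is an illustrative sketch, not the authors' code: the function name `risk_level` is hypothetical, and because the table does not say which level a boundary value such as 0.25 belongs to, the sketch assumes that each boundary falls in the lower class.

```python
from bisect import bisect_left

# Level names and interior breakpoints taken from the table above.
RISK_LEVELS = ["Lowest", "Lower", "Medium", "Higher", "Highest"]
BREAKPOINTS = [0.25, 0.50, 0.70, 0.85]


def risk_level(probability):
    """Return the risk level for a predicted infestation probability.

    Assumption (not stated in the source): a value exactly on a
    breakpoint is assigned to the lower of the two adjacent levels.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    # bisect_left counts the breakpoints strictly below the probability.
    return RISK_LEVELS[bisect_left(BREAKPOINTS, probability)]


for p in (0.10, 0.25, 0.60, 0.80, 0.95):
    print(f"{p:.2f} -> {risk_level(p)}")
```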
Changes in the explanatory power of single factors on infestation risk from 2019 to 2024.
Factor | 2019 | 2020 | 2021 | 2022 | 2023 | 2024 |
---|---|---|---|---|---|---|
d2m | 0.9865 | 0.9211 | 0.9437 | 0.9193 | 0.9437 | 0.9553 |
e | 0.9642 | 0.8111 | 0.8355 | 0.8089 | 0.8355 | 0.9513 |
cvh | 0.9739 | 0.9463 | 0.9598 | 0.9322 | 0.9598 | 0.9383 |
lai_hv | 0.9846 | 0.9529 | 0.9614 | 0.9393 | 0.9614 | 0.9562 |
sd | 0.7452 | 0.1751 | 0.3443 | 0.2967 | 0.3443 | 0.9615 |
ssr | 0.9815 | 0.9279 | 0.9223 | 0.8983 | 0.9223 | 0.9595 |
stl1 | 0.9860 | 0.9473 | 0.9509 | 0.9277 | 0.9509 | 0.9524 |
swvl1 | 0.9868 | 0.9616 | 0.9565 | 0.9392 | 0.9565 | 0.9602 |
tvh | 0.2477 | 0.1472 | 0.2414 | 0.2893 | 0.2414 | 0.2049 |
t2m | 0.9870 | 0.9454 | 0.9526 | 0.9263 | 0.9526 | 0.9560 |
tp | 0.9249 | 0.7601 | 0.7484 | 0.7441 | 0.7484 | 0.9524 |
tsn | 0.9855 | 0.9312 | 0.9395 | 0.9138 | 0.9395 | 0.9477 |
u10 | 0.9818 | 0.8495 | 0.8671 | 0.8475 | 0.8671 | 0.9616 |
v10 | 0.9863 | 0.8983 | 0.9196 | 0.9119 | 0.9196 | 0.9616 |
sro_04 | 0.0901 | 0.0779 | 0.1760 | 0.2003 | 0.1760 | 0.3464 |
sd_04 | 0.1627 | 0.1819 | 0.1979 | 0.2719 | 0.1979 | 0.8530 |
tp_04 | 0.7016 | 0.7572 | 0.8022 | 0.7637 | 0.8022 | 0.2371 |
t2m_04 | 0.9288 | 0.9275 | 0.9437 | 0.9272 | 0.9437 | 0.9567 |
swvl1_04 | 0.9671 | 0.9577 | 0.9692 | 0.9474 | 0.9692 | 0.9611 |
stl1_04 | 0.9407 | 0.9411 | 0.9491 | 0.9284 | 0.9491 | 0.9572 |
popu_densi | 0.1389 | 0.2135 | 0.2626 | 0.2088 | 0.2626 | 0.1719 |
DEM | 0.2193 | 0.2018 | 0.2398 | 0.2555 | 0.2398 | 0.3126 |
slope | 0.1395 | 0.1291 | 0.1472 | 0.1853 | 0.1472 | 0.1746 |
Note: The p-values of tvh, sro_04, popu_densi, and slope are close to 1, indicating unreliable explanatory power, while the other factors have p-values below 0.05, showing reliable explanatory power.
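For readers unfamiliar with how a single-factor explanatory power can be computed, the sketch below shows the GeoDetector factor-detector q-statistic, q = 1 − Σ_h N_h σ_h² / (N σ²), where h indexes the strata of a discretized factor. It rests on an assumption: the abstract names GeoDetector, but this section does not state that the table values are q-statistics. The function name `q_statistic` and the sample numbers are hypothetical.

```python
import numpy as np


def q_statistic(values, strata):
    """GeoDetector factor-detector q: share of variance explained by strata.

    values: response per sample (e.g., infestation risk).
    strata: class label per sample (a discretized habitat factor).
    """
    values = np.asarray(values, dtype=float)
    strata = np.asarray(strata)
    total_ss = values.size * values.var()      # N * sigma^2 (population variance)
    if total_ss == 0.0:
        raise ValueError("response has zero variance; q is undefined")
    within_ss = 0.0
    for label in np.unique(strata):
        group = values[strata == label]
        within_ss += group.size * group.var()  # N_h * sigma_h^2
    return 1.0 - within_ss / total_ss


# Hypothetical example: risk differs sharply between two factor classes.
risk = [0.10, 0.12, 0.11, 0.80, 0.82, 0.79]
factor_class = [0, 0, 0, 1, 1, 1]
print(f"q = {q_statistic(risk, factor_class):.4f}")
```

A q close to 1 means the strata account for almost all of the variance in risk, while a q near 0 means the factor's classes explain little. The significance of q is tested separately, which is what the p-values in the note above refer to.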
1. Grêt-Regamey, A.; Weibel, B. Global Assessment of Mountain Ecosystem Services Using Earth Observation Data. Ecosyst. Serv.; 2020; 46, 101213. [DOI: https://dx.doi.org/10.1016/j.ecoser.2020.101213]
2. Xie, Y.; Cheng, C.; Zhang, T.; Wu, X.; Wang, P. Donor-Side Valuation of Forest Ecosystem Services in China during 1990–2020. Energy Ecol. Environ.; 2023; 8, pp. 503-521. [DOI: https://dx.doi.org/10.1007/s40974-023-00294-5]
3. Cortini, F.; Comeau, P.G. Pests, Climate and Competition Effects on Survival and Growth of Trembling Aspen in Western Canada. New For.; 2020; 51, pp. 175-190. [DOI: https://dx.doi.org/10.1007/s11056-019-09726-9]
4. Han, D.; Wang, S.; Zhang, J.; Cui, R.; Wang, Q. Evaluating Dendrolimus superans (Lepidoptera: Lasiocampidae) Occurrence and Density Modeling with Habitat Conditions. Forests; 2024; 15, 388. [DOI: https://dx.doi.org/10.3390/f15020388]
5. Schroeder, M.; Cocoş, D. Performance of the Tree-Killing Bark Beetles Ips typographus and Pityogenes chalcographus in Non-Indigenous Lodgepole Pine and Their Historical Host Norway Spruce. Agric. For. Entomol.; 2018; 20, pp. 347-357. [DOI: https://dx.doi.org/10.1111/afe.12267]
6. Chen, H.; Hu, Y.; Chang, Y.; Bu, R.; Li, Y.; Liu, M. Simulating Impact of Larch Caterpillar (Dendrolimus superans) on Fire Regime and Forest Landscape in Da Hinggan Mountains, Northeast China. Chin. Geogr. Sci.; 2011; 21, pp. 575-586. [DOI: https://dx.doi.org/10.1007/s11769-011-0494-9]
7. Cheng, X.; Qian, G.; Song, X.; Zhang, S.; Zhou, X.; Zou, Y.; Zhang, G.; Fang, G.; Song, Y.; Bi, S. The Catastrophe Prediction Models of Dendrolimus punctatus Based on Disaster Index. Int. J. Pest Manag.; 2021; 70, pp. 616-625. [DOI: https://dx.doi.org/10.1080/09670874.2021.2018066]
8. Wu, S.J.; Zhu, T.H.; Qiao, T.M.; Li, S.J.; Shan, H. Prediction of the Potential Distribution of Dendrolimus Houi Lajonquiere in Sichuan of China Based on the Species Distribution Model. Appl. Ecol. Environ. Res.; 2021; 19, pp. 2227-2240. [DOI: https://dx.doi.org/10.15666/aeer/1903_22272240]
9. Bao, Y.; Han, A.; Zhang, J.; Liu, X.; Tong, Z.; Bao, Y. Contribution of the Synergistic Interaction between Topography and Climate Variables to Pine Caterpillar (Dendrolimus spp.) Outbreaks in Shandong Province, China. Agric. For. Meteorol.; 2022; 322, 109023. [DOI: https://dx.doi.org/10.1016/j.agrformet.2022.109023]
10. Bao, Y.; Na, L.; Han, A.; Guna, A.; Wang, F.; Liu, X.; Zhang, J.; Wang, C.; Tong, S.; Bao, Y. Drought Drives the Pine Caterpillars (Dendrolimus spp.) Outbreaks and Their Prediction under Different RCPs Scenarios: A Case Study of Shandong Province, China. For. Ecol. Manag.; 2020; 475, 118446. [DOI: https://dx.doi.org/10.1016/j.foreco.2020.118446]
11. You, W.; You, H.; Wu, L.; Ji, Z.; He, D. Landscape-Level Spatiotemporal Patterns of Dendrolimus punctatus Walker and Its Driving Forces: Evidence from a Pinus massoniana Forest. Trees-Struct. Funct.; 2020; 34, pp. 553-562. [DOI: https://dx.doi.org/10.1007/s00468-019-01936-0]
12. Liu, N.; Zhao, X.; Zhang, X.; Zhao, J.; Wang, H.; Wu, D. Remotely Sensed Evidence of the Divergent Climate Impacts of Wind Farms on Croplands and Grasslands. Sci. Total Environ.; 2023; 905, 167203. [DOI: https://dx.doi.org/10.1016/j.scitotenv.2023.167203] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37730031]
13. Xiang, Y.; Tang, Y.; Wang, Z.; Peng, C.; Huang, C.; Dian, Y.; Teng, M.; Zhou, Z. Seasonal Variations of the Relationship between Spectral Indexes and Land Surface Temperature Based on Local Climate Zones: A Study in Three Yangtze River Megacities. Remote Sens.; 2023; 15, 870. [DOI: https://dx.doi.org/10.3390/rs15040870]
14. Xiang, Y.; Yuan, C.; Cen, Q.; Huang, C.; Wu, C.; Teng, M.; Zhou, Z. Heat Risk Assessment and Response to Green Infrastructure Based on Local Climate Zones. Build. Environ.; 2024; 248, 111040. [DOI: https://dx.doi.org/10.1016/j.buildenv.2023.111040]
15. Han, R.D.; Parajulee, M.; Zhong, H.; Feng, G. Effects of Environmental Humidity on the Survival and Development of Pine Caterpillars, Dendrolimus tabulaeformis (Lepidoptera: Lasiocampidae). Insect Sci.; 2008; 15, pp. 147-152. [DOI: https://dx.doi.org/10.1111/j.1744-7917.2008.00195.x]
16. Fang, L.; Yu, Y.; Fang, G.; Zhang, X.; Yu, Z.; Zhang, X.; Crocker, E.; Yang, J. Effects of Meteorological Factors on the Defoliation Dynamics of the Larch Caterpillar (Dendrolimus superans Butler) in the Great Xing’an Boreal Forests. J. For. Res.; 2021; 32, pp. 2683-2697. [DOI: https://dx.doi.org/10.1007/s11676-020-01277-6]
17. Hua, H.; Wu, C.; Jassal, R.S.; Huang, J.; Liu, R.; Wang, Y. Pine Caterpillar Occurrence Modeling Using Satellite Spring Phenology and Meteorological Variables. Environ. Res. Lett.; 2022; 17, 104046. [DOI: https://dx.doi.org/10.1088/1748-9326/ac9636]
18. Gao, H.; Wang, C.; Wang, G.; Zhu, J.; Tang, Y.; Shen, P.; Zhu, Z. A Crop Classification Method Integrating GF-3 PolSAR and Sentinel-2A Optical Data in the Dongting Lake Basin. Sensors; 2018; 18, 3139. [DOI: https://dx.doi.org/10.3390/s18093139]
19. Mercier, A.; Betbeder, J.; Denize, J.; Roger, J.L.; Spicher, F.; Lacoux, J.; Roger, D.; Baudry, J.; Hubert-Moy, L. Estimating Crop Parameters Using Sentinel-1 and 2 Datasets and Geospatial Field Data. Data Brief; 2021; 38, 107408. [DOI: https://dx.doi.org/10.1016/j.dib.2021.107408]
20. Zheng, Q.; Huang, W.; Cui, X.; Shi, Y.; Liu, L. New Spectral Index for Detecting Wheat Yellow Rust Using Sentinel-2 Multispectral Imagery. Sensors; 2018; 18, 868. [DOI: https://dx.doi.org/10.3390/s18030868]
21. Booth, T.H. Checking Bioclimatic Variables That Combine Temperature and Precipitation Data before Their Use in Species Distribution Models. Austral Ecol.; 2022; 47, pp. 1506-1514. [DOI: https://dx.doi.org/10.1111/aec.13234]
22. Soria-Auza, R.W.; Kessler, M.; Bach, K.; Barajas-Barbosa, P.M.; Lehnert, M.; Herzog, S.K.; Böhner, J. Impact of the Quality of Climate Models for Modelling Species Occurrences in Countries with Poor Climatic Documentation: A Case Study from Bolivia. Ecol. Model.; 2010; 221, pp. 1221-1229. [DOI: https://dx.doi.org/10.1016/j.ecolmodel.2010.01.004]
23. Archaux, F.; Bergès, L. Optimising Vegetation Monitoring. A Case Study in A French Lowland Forest. Environ. Monit. Assess.; 2008; 141, pp. 19-25. [DOI: https://dx.doi.org/10.1007/s10661-007-9874-0]
24. Zhang, X.; Zhang, Z.; Wang, W.; Fang, W.T.; Chiang, Y.T.; Liu, X.; Ju, H. Vegetation Successions of Coastal Wetlands in Southern Laizhou Bay, Bohai Sea, Northern China, Influenced by the Changes in Relative Surface Elevation and Soil Salinity. J. Environ. Manag.; 2021; 293, 112964. [DOI: https://dx.doi.org/10.1016/j.jenvman.2021.112964]
25. Christiansen, B. The Shortcomings of Nonlinear Principal Component Analysis in Identifying Circulation Regimes. J. Clim.; 2005; 18, pp. 4814-4823. [DOI: https://dx.doi.org/10.1175/JCLI3569.1]
26. Liu, Z.; Wang, M.; Liu, X.; Wang, F.; Li, X.; Wang, J.; Hou, G.; Zhao, S. Ecological Security Assessment and Warning of Cultivated Land Quality in the Black Soil Region of Northeast China. Land; 2023; 12, 1005. [DOI: https://dx.doi.org/10.3390/land12051005]
27. Yang, X.; Hao, Z.; Liu, K.; Tao, Z.; Shi, G. An Improved Unascertained Measure-Set Pair Analysis Model Based on Fuzzy AHP and Entropy for Landslide Susceptibility Zonation Mapping. Sustainability; 2023; 15, 6205. [DOI: https://dx.doi.org/10.3390/su15076205]
28. Campos, J.C.; Garcia, N.; Alírio, J.; Arenas-Castro, S.; Teodoro, A.C.; Sillero, N. Ecological Niche Models Using MaxEnt in Google Earth Engine: Evaluation, Guidelines and Recommendations. Ecol. Inform.; 2023; 76, 102147. [DOI: https://dx.doi.org/10.1016/j.ecoinf.2023.102147]
29. Gagula, A.; Campana, M.B.D.; Narit, M.G.; Guerrero, P.D.; Parac, E.P. Using Maxent in Quantifying the Impacts of Climate Change in Land Suitability of Abaca (Musa Testilis) in Caraga Region, Philippines. Proceedings of the 8th Geoinformation Science Symposium 2023: Geoinformation Science for Sustainable Planet; Yogyakarta, Indonesia, 28–30 August 2023; [DOI: https://dx.doi.org/10.1117/12.3009676]
30. Yalcin, M.; Sari, F.; Yildiz, A. Exploration of Potential Geothermal Fields Using MAXENT and AHP: A Case Study of the Büyük Menderes Graben. Geothermics; 2023; 114, 102792. [DOI: https://dx.doi.org/10.1016/j.geothermics.2023.102792]
31. Bera, D.; Das Chatterjee, N.; Bera, S. Comparative Performance of Linear Regression, Polynomial Regression and Generalized Additive Model for Canopy Cover Estimation in the Dry Deciduous Forest of West Bengal. Remote Sens. Appl. Soc. Environ.; 2021; 22, 100502. [DOI: https://dx.doi.org/10.1016/j.rsase.2021.100502]
32. Park, S.Y.; Yoon, D.K.; Park, S.H.; Jeon, J.I.; Lee, J.M.; Yang, W.H.; Cho, Y.S.; Kwon, J.; Lee, C.M. Proposal of a Methodology for Prediction of Indoor PM2.5 Concentration Using Sensor-Based Residential Environments Monitoring Data and Time-Divided Multiple Linear Regression Model. Toxics; 2023; 11, 526. [DOI: https://dx.doi.org/10.3390/toxics11060526] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37368626]
33. Yılmaz, M. A Comparative Assessment of the Statistical Methods Based on Urban Population Density Estimation. Geocarto Int.; 2023; 38, 2152494. [DOI: https://dx.doi.org/10.1080/10106049.2022.2152494]
34. Early, R.; Rwomushana, I.; Chipabika, G.; Day, R. Comparing, Evaluating and Combining Statistical Species Distribution Models and CLIMEX to Forecast the Distributions of Emerging Crop Pests. Pest Manag. Sci.; 2022; 78, pp. 671-683. [DOI: https://dx.doi.org/10.1002/ps.6677] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/34647405]
35. Fitzgibbon, A.; Pisut, D.; Fleisher, D. Evaluation of Maximum Entropy (Maxent) Machine Learning Model to Assess Relationships between Climate and Corn Suitability. Land; 2022; 11, 1382. [DOI: https://dx.doi.org/10.3390/land11091382]
36. Zhao, Z.; Xiao, N.; Shen, M.; Li, J. Comparison between Optimized MaxEnt and Random Forest Modeling in Predicting Potential Distribution: A Case Study with Quasipaa boulengeri in China. Sci. Total Environ.; 2022; 842, 156867. [DOI: https://dx.doi.org/10.1016/j.scitotenv.2022.156867]
37. Dai, X.; Wu, W.; Ji, L.; Tian, S.; Yang, B.; Guan, B.; Wu, D. MaxEnt Model-Based Prediction of Potential Distributions of Parnassia Wightiana (Celastraceae) in China. Biodivers. Data J.; 2022; 10, e81073. [DOI: https://dx.doi.org/10.3897/BDJ.10.e81073]
38. Huercha,; Song, R.; Ma, Y.; Hu, Z.; Li, Y.; Li, M.; Wu, L.; Li, C.; Dao, E.; Fan, X.
39. Zhao, J.; Ma, L.; Song, C.; Xue, Z.; Zheng, R.; Yan, X.; Hao, C. Modelling Potential Distribution of Tuta Absoluta in China under Climate Change Using CLIMEX and MaxEnt. J. Appl. Entomol.; 2023; 147, pp. 895-907. [DOI: https://dx.doi.org/10.1111/jen.13181]
40. Song, J.W.; Jung, J.M.; Nam, Y.; Jung, J.K.; Jung, S.; Lee, W.H. Spatial Ensemble Modeling for Predicting the Potential Distribution of Lymantria dispar asiatica (Lepidoptera: Erebidae: Lymantriinae) in South Korea. Environ. Monit. Assess.; 2022; 194, 889. [DOI: https://dx.doi.org/10.1007/s10661-022-10609-4]
41. Bellin, N.; Tesi, G.; Marchesani, N.; Rossi, V. Species Distribution Modeling and Machine Learning in Assessing the Potential Distribution of Freshwater Zooplankton in Northern Italy. Ecol. Inform.; 2022; 69, 101682. [DOI: https://dx.doi.org/10.1016/j.ecoinf.2022.101682]
42. Chen, S.; Ding, Y. Machine Learning and Its Applications in Studying the Geographical Distribution of Ants. Diversity; 2022; 14, 706. [DOI: https://dx.doi.org/10.3390/d14090706]
43. El Alaoui, O.; Idri, A. Predicting the Potential Distribution of Wheatear Birds Using Stacked Generalization-Based Ensembles. Ecol. Inform.; 2023; 75, 102084. [DOI: https://dx.doi.org/10.1016/j.ecoinf.2023.102084]
44. Zhao, Y.; Liu, H.; Qu, W.; Luan, P.; Sun, J. Research on Geological Safety Evaluation Index Systems and Methods for Assessing Underground Space in Coastal Bedrock Cities Based on a Back-Propagation Neural Network Comprehensive Evaluation–Analytic Hierarchy Process (BPCE-AHP). Sustainability; 2023; 15, 8055. [DOI: https://dx.doi.org/10.3390/su15108055]
45. Azcárate, F.M.; Seoane, J.; Silvestre, M. Factors Affecting Pine Processionary Moth (Thaumetopoea pityocampa) Incidence in Mediterranean Pine Stands: A Multiscale Approach. For. Ecol. Manag.; 2023; 529, 120728. [DOI: https://dx.doi.org/10.1016/j.foreco.2022.120728]
46. Chen, L.; Huang, J.G.; Dawson, A.; Zhai, L.; Stadt, K.J.; Comeau, P.G.; Whitehouse, C. Contributions of Insects and Droughts to Growth Decline of Trembling Aspen Mixed Boreal Forest of Western Canada. Glob. Change Biol.; 2018; 24, pp. 655-667. [DOI: https://dx.doi.org/10.1111/gcb.13855]
47. Arabameri, A.; Nalivan, O.A.; Saha, S.; Roy, J.; Pradhan, B.; Tiefenbacher, J.P.; Ngo, P.T.T. Novel Ensemble Approaches of Machine Learning Techniques in Modeling the Gully Erosion Susceptibility. Remote Sens.; 2020; 12, 1890. [DOI: https://dx.doi.org/10.3390/rs12111890]
48. Lu, S.; Ye, S.-J. Using an Image Segmentation and Support Vector Machine Method for Identifying Two Locust Species and Instars. J. Integr. Agric.; 2020; 19, pp. 1301-1313. [DOI: https://dx.doi.org/10.1016/S2095-3119(19)62865-0]
49. Zhang, C.; Park, D.S.; Yoon, S.; Zhang, S. Editorial: Machine Learning and Artificial Intelligence for Smart Agriculture. Front. Plant Sci.; 2023; 13, 1121468. [DOI: https://dx.doi.org/10.3389/fpls.2022.1121468]
50. Rafiq, D.; Bazaz, M.A. A Collection of Large-Scale Benchmark Models for Nonlinear Model Order Reduction. Arch. Comput. Methods Eng.; 2023; 30, pp. 69-83. [DOI: https://dx.doi.org/10.1007/s11831-022-09789-6]
51. Wang, L.; Xu, M. Regression-Based Identification and Order Reduction Method for Nonlinear Dynamic Structural Models. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng.; 2023; 237, pp. 3508-3523. [DOI: https://dx.doi.org/10.1177/09544100231199239]
52. Zhu, R.; Fei, Q.; Jiang, D.; Marchesiello, S.; Anastasio, D. Bayesian Model Selection in Nonlinear Subspace Identification. AIAA J.; 2022; 60, pp. 92-101. [DOI: https://dx.doi.org/10.2514/1.J060782]
53. Monter-Pozos, A.; González-Estrada, E. On Testing the Skew Normal Distribution by Using Shapiro–Wilk Test. J. Comput. Appl. Math.; 2024; 440, 115649. [DOI: https://dx.doi.org/10.1016/j.cam.2023.115649]
54. El Habib Daho, M.; Amine Chikh, M. Combining Bootstrapping Samples, Random Subspaces and Random Forests to Build Classifiers. J. Med. Imaging Health Inform.; 2015; 5, pp. 539-544. [DOI: https://dx.doi.org/10.1166/jmihi.2015.1423]
55. Zhu, H.; Liu, H.; Zhou, Q.; Cui, A. A XGBoost-Based Downscaling-Calibration Scheme for Extreme Precipitation Events. IEEE Trans. Geosci. Remote Sens.; 2023; 61, 4103512. [DOI: https://dx.doi.org/10.1109/TGRS.2023.3294266]
56. Lyu, J.; Zheng, P.; Qi, Y.; Huang, G. LightGBM-LncLoc: A LightGBM-Based Computational Predictor for Recognizing Long Non-Coding RNA Subcellular Localization. Mathematics; 2023; 11, 602. [DOI: https://dx.doi.org/10.3390/math11030602]
57. Zhang, X.; Wu, T.; Du, Q.; Quyang, N.; Nie, W.; Liu, Y.; Gou, P.; Li, G. Spatiotemporal changes of ecosystem health and the impact of its driving factors on the Loess Plateau in China. J. Ecol. Indic.; 2025; 170, pp. 1677-1680. [DOI: https://dx.doi.org/10.1016/j.ecolind.2024.113020]
58. Pham, B.T.; Luu, C.; Van Phong, T.; Nguyen, H.D.; Van Le, H.; Tran, T.Q.; Ta, H.T.; Prakash, I. Flood Risk Assessment Using Hybrid Artificial Intelligence Models Integrated with Multi-Criteria Decision Analysis in Quang Nam Province, Vietnam. J. Hydrol.; 2021; 592, 125815. [DOI: https://dx.doi.org/10.1016/j.jhydrol.2020.125815]
59. Liu, C.; Li, W.; Wang, W.; Zhou, H.; Liang, T.; Hou, F.; Xu, J.; Xue, P. Quantitative Spatial Analysis of Vegetation Dynamics and Potential Driving Factors in a Typical Alpine Region on the Northeastern Tibetan Plateau Using the Google Earth Engine. Catena; 2021; 206, 105500. [DOI: https://dx.doi.org/10.1016/j.catena.2021.105500]
60. Hodkinson, I.D. Terrestrial Insects along Elevation Gradients: Species and Community Responses to Altitude. Biol. Rev. Camb. Philos. Soc.; 2005; 80, pp. 489-513. [DOI: https://dx.doi.org/10.1017/S1464793105006767]
61. Marini, L.; Fontana, P.; Klimek, S.; Battisti, A.; Gaston, K.J. Impact of Farm Size and Topography on Plant and Insect Diversity of Managed Grasslands in the Alps. Biol. Conserv.; 2009; 142, pp. 394-403. [DOI: https://dx.doi.org/10.1016/j.biocon.2008.10.034]
62. Hamann, A.; Wang, T. Potential Effects of Climate Change on Ecosystem and Tree Species Distribution in British Columbia. Ecology; 2006; 87, pp. 2773-2786. [DOI: https://dx.doi.org/10.1890/0012-9658(2006)87[2773:PEOCCO]2.0.CO;2] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/17168022]
63. Gazol, A.; Hernández-Alonso, R.; Camarero, J.J. Patterns and Drivers of Pine Processionary Moth Defoliation in Mediterranean Mountain Forests. Front. Ecol. Evol.; 2019; 7, 458. [DOI: https://dx.doi.org/10.3389/fevo.2019.00458]
64. Marini, L.; Ayres, M.P.; Battisti, A.; Faccoli, M. Climate Affects Severity and Altitudinal Distribution of Outbreaks in an Eruptive Bark Beetle. Clim. Chang.; 2012; 115, pp. 327-341. [DOI: https://dx.doi.org/10.1007/s10584-012-0463-z]
65. Haynes, K.J.; Liebhold, A.M.; Lefcheck, J.S.; Morin, R.S.; Wang, G. Climate Affects the Outbreaks of a Forest Defoliator Indirectly through Its Tree Hosts. Oecologia; 2022; 198, pp. 407-418. [DOI: https://dx.doi.org/10.1007/s00442-022-05123-w]
66. Lai, H.; Hales, S.; Woodward, A.; Walker, C.; Marks, E.; Pillai, A.; Chen, R.X.; Morton, S.M. Effects of Heavy Rainfall on Waterborne Disease Hospitalizations among Young Children in Wet and Dry Areas of New Zealand. Environ. Int.; 2020; 145, 106136. [DOI: https://dx.doi.org/10.1016/j.envint.2020.106136]
67. Aoki, C.F.; Cook, M.; Dunn, J.; Finley, D.; Fleming, L.; Yoo, R.; Ayres, M.P. Old Pests in New Places: Effects of Stand Structure and Forest Type on Susceptibility to a Bark Beetle on the Edge of Its Native Range. For. Ecol. Manag.; 2018; 419–420, pp. 206-219. [DOI: https://dx.doi.org/10.1016/j.foreco.2018.03.009]
68. Bognounou, F.; De Grandprè, L.; Pureswaran, D.S.; Kneeshaw, D. Temporal Variation in Plant Neighborhood Effects on the Defoliation of Primary and Secondary Hosts by an Insect Pest. Ecosphere; 2017; 8, e01759. [DOI: https://dx.doi.org/10.1002/ecs2.1759]
69. Dodds, K.J.; Aoki, C.F.; Arango-Velez, A.; Cancelliere, J.; D’Amato, A.W.; DiGirolomo, M.F.; Rabaglia, R.J. Expansion of Southern Pine Beetle into Northeastern Forests: Management and Impact of a Primary Bark Beetle in a New Region. J. For.; 2018; 116, pp. 178-191. [DOI: https://dx.doi.org/10.1093/jofore/fvx009]
70. Sánchez-Cuesta, R.; Ruiz-Gómez, F.J.; Duque-Lazo, J.; González-Moreno, P.; Navarro-Cerrillo, R.M. The Environmental Drivers Influencing Spatio-Temporal Dynamics of Oak Defoliation and Mortality in Dehesas of Southern Spain. For. Ecol. Manag.; 2021; 485, 118946. [DOI: https://dx.doi.org/10.1016/j.foreco.2021.118946]
71. Walter, J.A.; Platt, R.V. Multi-Temporal Analysis Reveals That Predictors of Mountain Pine Beetle Infestation Change during Outbreak Cycles. For. Ecol. Manag.; 2013; 302, pp. 308-318. [DOI: https://dx.doi.org/10.1016/j.foreco.2013.03.038]
72. Figueredo, L.; Villa-Murillo, A.; Colmenarez, Y.; Vásquez, C. A Hybrid Artificial Intelligence Model for Aeneolamia varia (Hemiptera: Cercopidae) Populations in Sugarcane Crops. J. Insect Sci.; 2021; 21, 11. [DOI: https://dx.doi.org/10.1093/jisesa/ieab017] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33822127]
73. DeRose, R.J.; Bentz, B.J.; Long, J.N.; Shaw, J.D. Effect of Increasing Temperatures on the Distribution of Spruce Beetle in Engelmann Spruce Forests of the Interior West, USA. For. Ecol. Manag.; 2013; 308, pp. 198-206. [DOI: https://dx.doi.org/10.1016/j.foreco.2013.07.061]
74. Lalande, B.M.; Hughes, K.; Jacobi, W.R.; Tinkham, W.T.; Reich, R.; Stewart, J.E. Subalpine Fir Mortality in Colorado Is Associated with Stand Density, Warming Climates and Interactions among Fungal Diseases and the Western Balsam Bark Beetle. For. Ecol. Manag.; 2020; 466, 118133. [DOI: https://dx.doi.org/10.1016/j.foreco.2020.118133]
75. Bajwa, A.A.; Farooq, M.; Al-Sadi, A.M.; Nawaz, A.; Jabran, K.; Siddique, K.H.M. Impact of Climate Change on Biology and Management of Wheat Pests. Crop Prot.; 2020; 137, 105304. [DOI: https://dx.doi.org/10.1016/j.cropro.2020.105304]
76. Ma, C.S.; Ma, G.; Pincebourde, S. Survive a Warming Climate: Insect Responses to Extreme High Temperatures. Annu. Rev. Entomol.; 2021; 66, pp. 163-184. [DOI: https://dx.doi.org/10.1146/annurev-ento-041520-074454]
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Pine caterpillar (Dendrolimus) infestations threaten pine forests, causing severe ecological and economic impacts. Identifying the driving factors behind these infestations is essential for effective forest management. This study uses the APCIRD framework combined with an improved random forest model, the MLP-random forest model (MRF), which combines a multilayer perceptron (MLP) with a random forest, to analyze spatiotemporal changes in infestation risk and the driving effects of habitat factors in Northeast China. For the period 2019 to 2024, we applied SHapley Additive exPlanations (SHAP), frequency analysis, fitting functions, and GeoDetector to quantify the impact of key drivers, such as snow cover and soil, on infestation risk. The main findings are as follows: (1) The APCIRD framework with MRF accurately assesses infestation risk. Between 2019 and 2024, the area at high infestation risk declined, with risk shifting from higher to lower levels, while Eastern Heilongjiang and Southwest Liaoning remained key areas of concern. (2) Snow cover and soil factors are critical to infestation risk, and eight key habitat factors significantly affect it. Their relationships with infestation risk follow complex, non-monotonic quartic and cubic patterns. (3) The factor values associated with high infestation risk are mostly at low to moderate levels. High-risk areas tend to have low to moderate elevation (<800 m), moderate to high solar radiation and temperature, gentle slopes (<30°), low to moderate evaporation, shallow snow depth (<0.02), moderate snow temperature (266.73–275), low to moderate soil moisture (0.2–0.3), moderate to high soil temperature (276.73–286.92), low to moderate rainfall, moderate wind speed, low leaf area index, high vegetation type, low vegetation cover, low population density, and low surface runoff. Interactions between factors explain infestation risk better than individual factors do.
The APCIRD framework, combined with MRF, offers valuable insights for understanding the drivers of pine caterpillar infestations.