Landslide prediction is a pressing global concern, especially in mountainous regions where landslides can seriously set back sustainable development. This paper presents landslide susceptibility mapping for the Namchi region of Sikkim, India, using machine learning techniques. Six machine learning models, Support Vector Machine (SVM), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), Random Forest (RF), Extra Trees (ET), and Logistic Regression (LR), were used to identify landslide-prone areas in this seismically active, high-risk region. The models were trained on a database of 1020 points, covering both landslide and non-landslide locations, together with 23 environmental factors such as terrain, lithology, and rainfall pattern. The Random Forest and Meta Classifier models achieved the best accuracy (0.941) and F1 score (0.941). The Gradient Boosting model reached an accuracy of 0.935 and an Area Under the Curve (AUC) of 0.982, indicating that it separates high-risk areas from stable ones effectively. Other models also produced useful results: SVM achieved an accuracy of 0.886 and XGBoost a precision of 0.94. The results underline the importance of environmental and anthropogenic factors in conditioning landslides. The study offers a practical, scalable framework for hazard prediction, with implications for disaster management, land-use planning, and environmental protection in areas of high landslide susceptibility.
Introduction
Landslides are one of the most destructive natural hazards in the Sikkim Himalayas, leading to significant human casualties, infrastructure damage, and disruption of the local economy [135, 195, 202]. The region's rugged topography, active tectonic processes, and high rainfall make it especially vulnerable to landslides, particularly during the monsoon season. The Namchi District, located in the eastern Himalayan region, is particularly prone to such disasters due to rapid urbanization, infrastructural development, and deforestation [148, 227]. As the frequency of landslides increases in areas like Namchi, reliable methods for predicting and mapping landslide susceptibility become crucial for effective disaster management and mitigation strategies [264, 273].
Traditional landslide susceptibility mapping methods primarily involve expert-driven assessments and statistical models that analyze physical, topographic, and environmental factors influencing landslides. These methods have been useful but often suffer from limitations, especially in capturing complex, non-linear relationships between different environmental variables [38, 263]. For instance, the interactions between geological characteristics, hydrological patterns, soil types, and human-induced land-use changes often result in intricate spatial patterns that are difficult to model using conventional techniques. As a result, these models may fail to provide high accuracy, especially in regions with diverse environmental conditions, such as the Sikkim Himalayas [19, 25, 100, 105, 145].
To overcome these limitations, machine learning (ML) techniques have emerged as a powerful tool for landslide susceptibility mapping. ML algorithms are particularly well-suited for handling large datasets with numerous variables, as they can identify complex, non-linear relationships between features that are not easily captured by traditional methods [82, 118, 160, 224]. Machine learning models can process vast amounts of geospatial data, including topography, climate, soil characteristics, and land-use patterns, to identify patterns that correlate with landslide occurrences. This ability to model complex interactions makes ML techniques ideal for predicting landslide susceptibility, especially in regions with challenging terrain and dynamic environmental factors [118, 160, 224].
While conventional data-driven ML models, including ensemble methods, have achieved high predictive accuracy, a growing frontier in landslide research involves Physics-Informed Machine Learning (PIML) approaches [256, 271]. PIML models integrate physical principles (such as hydrological flow dynamics or slope stability equations) directly into the model architecture or the optimization process. This advanced methodology offers crucial advantages: it enhances model generalizability and transferability by constraining the solution space to be physically realistic, and it ensures that predictions are physically consistent, which is key to increasing the reliability of susceptibility maps for engineering and long-term planning applications. The incorporation of physical knowledge into ML, moving beyond purely data-driven correlations, represents the next generation of predictive modeling in geotechnical and hazard assessment science.
Among the various ML techniques, ensemble learning methods have gained particular attention due to their ability to combine the predictions of multiple individual models to improve overall accuracy. By leveraging the strengths of different algorithms while minimizing their individual weaknesses, ensemble methods can provide more reliable and robust results than single-model approaches [58, 71, 207, 239]. Ensemble models, such as Random Forests (RF) and Gradient Boosting (GB), have been successfully applied in several landslide susceptibility studies, demonstrating their ability to improve predictive accuracy and reduce overfitting by aggregating predictions from multiple decision trees [86, 108, 123]. Furthermore, Extreme Gradient Boosting (XGB), a more advanced version of GB, has shown improved performance due to its regularization mechanism that prevents overfitting and enhances model generalization [140, 146]. These ensemble methods are particularly valuable in landslide mapping, where the underlying relationships between environmental variables and landslide occurrences are often intricate and difficult to model with traditional approaches.
This study focuses on applying an ensemble machine learning framework that combines Support Vector Machine (SVM), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), and Logistic Regression (LR) to predict landslide susceptibility in the Namchi District of Sikkim. By integrating these diverse algorithms, the aim is to generate a more accurate and robust landslide susceptibility map that can aid in better risk management and decision-making processes for disaster prevention. The integration of different ML models allows for capturing various aspects of the data, such as linear effects (LR), non-linear relationships (SVM), and tree-based interactions (RF, ET, GB, and XGB), and for further improving predictive power through ensemble methods [9, 56, 94, 104].
The use of machine learning in landslide susceptibility mapping represents a shift toward more data-driven and automated approaches, which can reduce the subjectivity inherent in expert-driven methods and provide more reliable results. By employing multiple ML models and using ensemble techniques, this study seeks to address the challenge of predicting landslides in the Sikkim Himalayas, a region with complex terrain and environmental conditions that require sophisticated modeling approaches. The ultimate goal is to provide local authorities and disaster management teams with accurate, actionable insights into the areas most at risk, enabling them to take preventive measures to mitigate the impacts of landslides on both human life and infrastructure.
This approach is not only innovative in terms of its methodology but also highly relevant to areas around the world where landslides are a recurring threat. The findings of this study could serve as a model for applying machine learning techniques to landslide susceptibility mapping in similar mountainous regions, thereby advancing the field of geospatial hazard prediction and providing more reliable tools for managing natural disasters in vulnerable areas.
This study aims to apply ensemble machine learning techniques to landslide susceptibility mapping in the Namchi District of Sikkim, combining diverse environmental variables into a comprehensive model. By providing more accurate and spatially explicit predictions, this research seeks to enhance the decision-making processes of local authorities, urban planners, and disaster management agencies, thereby contributing to improved risk assessment and resilience building in one of India’s most landslide-prone regions. The findings of this study will also provide valuable insights for the broader field of landslide susceptibility modeling, demonstrating the potential of advanced machine learning techniques to address complex environmental challenges. The primary objectives of the study are: (1) demarcating highly affected landslide-prone areas within the Namchi district in the Himalayan region, (2) identifying the major landslide-triggering factors, and (3) constructing a GIS-based ensemble machine learning model that can serve as a tool for assessing environmental hazards and for land-use planning.
Study area: Namchi District, Sikkim
The Namchi District is in northeastern India, nestled within the Eastern Himalayan region of Sikkim. This district, while relatively small in geographical extent, is a microcosm of the complex environmental, social, and economic challenges many Himalayan areas face. The Sikkim Himalayas are known for their steep, rugged terrain and diverse ecosystems, and Namchi, in particular, is a prime example of this geological and ecological richness. The district's altitudinal range spans from 165 to 5683 m above mean sea level (MSL), resulting in diverse climatic conditions and ecosystems that vary significantly from the valley floors to the high mountain peaks.
The geology of Namchi is shaped by the region’s tectonic activity, with the area situated along the converging zone of the Indian and Eurasian plates. This tectonic activity has created a complex geological landscape, with rock formations ranging from metamorphic to sedimentary. While these rocks provide structural stability in some areas, they are also highly prone to weathering and erosion due to the region’s intense rainfall and steep slopes [37]. The district is underlain by a mix of unstable soils, which, when coupled with high rainfall, often lead to significant landslides, especially during the monsoon season. The presence of active fault lines in and around the district further adds to the region’s susceptibility to landslides, making the area a hotspot for slope instability [225].
The climate of Namchi is typically temperate in the lower altitudes and alpine in the higher regions, with average annual rainfall exceeding 3000 mm in some parts. The monsoon season, which occurs between June and September, is particularly critical for landslide occurrence. The heavy rain exacerbates soil erosion and increases the saturation of slopes, often triggering landslides. Moreover, the erratic nature of rainfall due to climate change has been linked to increased frequency and severity of landslide events in the region [242].
Namchi’s landscape is marked by a dense network of rivers, streams, and tributaries that flow down the mountainsides, often exacerbating the erosional processes. These watercourses also influence landslide occurrence, particularly during periods of heavy rainfall, as they can undermine the stability of slopes, leading to mass wasting events. The geomorphology of Namchi further adds complexity to landslide prediction, as the district features a combination of deep gorges, narrow valleys, and high ridgelines that can concentrate water flow and intensify the landslide hazards [37].
The region's human settlements, agriculture, and infrastructure are also highly vulnerable to landslides. Namchi’s population primarily relies on subsistence farming, with crops such as maize, potatoes, and tea grown in the lower valleys. In contrast, temperate crops such as oranges and cardamom are cultivated at higher elevations. The agricultural terraces, which are often built on steep slopes, are particularly vulnerable to landslide-induced soil erosion and slope failure. Additionally, urbanization in Namchi has led to increased construction activities along vulnerable slopes and the building of roads and infrastructure in areas prone to landslide events. The expansion of settlements and roads further disrupts natural drainage patterns, increasing the region's susceptibility to landslide disasters.
From an infrastructure perspective, Namchi’s road networks, which connect remote areas to the rest of Sikkim and neighboring states, are crucial for the region's economy and accessibility. However, these roads frequently face disruptions due to landslides, particularly during the monsoon season. The vital infrastructure, including bridges and roads, often lies along steep slopes, which increases the risk of landslides and poses challenges for maintenance and reconstruction after a landslide event. The region’s energy infrastructure, including hydroelectric plants, is highly susceptible to disruption due to landslides, which can damage power plants, transmission lines, and associated facilities.
The socio-economic landscape of Namchi is mainly rural, with the majority of the population relying on agriculture and forestry as primary sources of livelihood. While the population density is relatively low compared to urban areas, the people of Namchi are vulnerable to the consequences of landslides, which destroy homes and crops, cause displacement, and hinder access to healthcare, education, and other essential services. Furthermore, the increasing human population and urban development pressure the fragile environment, exacerbating landslide risks and limiting the region's resilience to future natural disasters.
Landslide susceptibility mapping is critical in this context. The ability to predict areas prone to landslides allows for better land-use planning and disaster management strategies. Understanding the distribution and triggers of landslides in Namchi is essential for local authorities, policy-makers, and disaster risk management agencies to prioritize resources, implement early warning systems, and develop effective mitigation strategies. Accurate mapping of landslide-prone areas could help reduce the number of fatalities and the economic losses resulting from landslides by guiding infrastructure development away from high-risk zones and providing better preparedness measures for local communities.
The Namchi District, with its steep terrain, fragile geological structures, intense monsoonal rainfall, and ongoing human encroachment, is highly susceptible to landslides. These frequent natural disasters threaten human life and infrastructure, underscoring the need for advanced predictive tools. By integrating machine learning techniques to map landslide susceptibility, this study provides a comprehensive, data-driven approach to understand better the factors contributing to landslide events and offer informed strategies for mitigating their impact (Fig. 1).
Fig. 1
Study area (a) India, (b) Sikkim, and (c) Study area with landslide points
Landslide inventory map
A detailed map showcasing historical and current landslides is essential for assessing landslide risks and determining their potential impact. Creating such a map involves a combination of field surveys, remote sensing techniques, and historical data analysis, providing insights into the location, type, size, and timing of past landslide events [6]. The construction of an accurate Landslide Inventory Map (LIM) plays a critical role in understanding landslide susceptibility, as it helps identify correlations between landslide occurrences and the factors that trigger them [174].
This study began by gathering a total of 510 confirmed landslide points from primary and secondary data sources, including NASA’s Global Landslide Catalog (GLC), Bhukosh GSI landslide data, Google Earth Pro, and an extensive field survey. These locations were cross-verified through comprehensive field surveys supported by GPS data and Google Earth imagery. To establish a balanced dataset necessary for robust binary classification, an equal number of 510 non-landslide points were generated by creating a 2 km buffer zone around each landslide point and randomly selecting points outside these zones. This resulted in a total dataset size of 1,020 points (510 landslide and 510 non-landslide points).
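The buffer-based selection of non-landslide points described above can be sketched as simple rejection sampling. This is only an illustration, not the authors' GIS workflow: it assumes projected coordinates in metres and a rectangular extent standing in for the district boundary, and the function name `sample_non_landslide_points` and the toy coordinates are hypothetical.

```python
import math
import random

def sample_non_landslide_points(landslides, extent, n_points, buffer_m=2000.0, seed=42):
    """Rejection-sample points that lie outside every landslide buffer.

    landslides: list of (x, y) in metres (a projected CRS is assumed).
    extent: (xmin, ymin, xmax, ymax), a stand-in for the study-area boundary.
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = extent
    accepted = []
    max_tries = n_points * 10_000  # guard against an extent fully covered by buffers
    tries = 0
    while len(accepted) < n_points and tries < max_tries:
        tries += 1
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        # Keep the candidate only if it is farther than buffer_m from all landslides.
        if all(math.hypot(x - lx, y - ly) > buffer_m for lx, ly in landslides):
            accepted.append((x, y))
    return accepted

# Toy example: three hypothetical landslide locations in a 20 km x 20 km box.
toy_landslides = [(5000.0, 5000.0), (12000.0, 8000.0), (15000.0, 15000.0)]
toy_non_landslides = sample_non_landslide_points(
    toy_landslides, (0.0, 0.0, 20000.0, 20000.0), n_points=3
)
```

Drawing one non-landslide point per landslide point in this way yields the balanced classes that the binary classifiers require.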
The final dataset was split into training and validation sets, following the commonly used 70/30 ratio [11]. Specifically, 714 points (70%)—comprising 357 landslide points and 357 non-landslide points—were allocated for model training, while the remaining 306 points (30%)—consisting of 153 landslide points and 153 non-landslide points—were reserved for model validation. This methodical division ensures sufficient data for optimizing model parameters while dedicating an adequate amount of unseen, balanced data to accurately validate the model’s performance (Fig. 2).
Fig. 2
Landslide Inventory: (a) represents the Google Earth image, and (b–g) represent the field photo
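The balanced 70/30 division of the 1,020 inventory points can be reproduced with a class-wise (stratified) split. The sketch below is a dependency-free illustration; the function name `stratified_split` and the random seed are assumptions, and the study does not state which tool performed the split.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), splitting each class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))  # 357 of 510 per class
        train_idx += idx[:cut]
        test_idx += idx[cut:]
    return train_idx, test_idx

# 510 landslide (1) and 510 non-landslide (0) points, as in the inventory.
labels = [1] * 510 + [0] * 510
train_idx, test_idx = stratified_split(labels)
n_train_landslide = sum(labels[i] for i in train_idx)
n_test_landslide = sum(labels[i] for i in test_idx)
```

Splitting within each class keeps both subsets exactly balanced, which matches the 357/357 and 153/153 counts reported above.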
Constructing a spatial database for landslide susceptibility mapping on the GIS platform using multiple data sources
Various parameters influence landslide susceptibility mapping, each contributing differently to the overall risk assessment. Identifying the most relevant factors for accurate mapping is a critical task that involves thorough literature reviews, expert opinions, and field studies. Numerous research papers have identified key parameters essential for landslide susceptibility models [11, 48, 55, 67, 162, 252]. Table 1 presents an overview of the data sources used to extract significant parameters for these models.
Table 1. Sources of Data and Information for Thematic Layers in Landslide Susceptibility Mapping
Parameters | Data source and layer construction | Details | Web links |
|---|---|---|---|
Elevation, Slope, Aspect, Plane Curvature, Profile Curvature, Standard Curvature, TPI, Roughness Index, TRI, DTD, SPI, STI, TWI | Data was sourced from the Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) data, accessible from search.asf.alaska.edu. Layers were generated using ArcGIS (version 10.4.1) software | Raster layer with a resolution of 12.5 m × 12.5 m; Scale 1:445,500 | ALOS PALSAR Data |
Mean Annual Rainfall (1901–2024) | Rainfall data was acquired from the India Meteorological Department (IMD). The Inverse Distance Weighting (IDW) method generated the thematic layer | High-resolution rainfall gridded data with a resolution of 0.25° × 0.25°; Scale 1:445,500 | IMD—Pune |
Land Use/Land Cover (LULC), NDVI, MNDWI | The data was provided by the United States Geological Survey (USGS). Atmospheric corrections were made to clean the data, followed by mosaicking for layer creation | Landsat-9, Level-2, Collection-2 imagery with 30 m spatial resolution; UTM, Zone 45N; WGS84 datum; Acquired on 08th March, 2025 | USGS Earth Explorer |
Distance from Roads (DTR) | OpenStreetMap (OSM) provided the data, which was then processed using the "Spatial Analyst Tool" in ArcGIS | Vector data; Scale 1:445,500 | OpenStreetMap |
Lithology, Geomorphology, DTL, Geology | Digital shapefiles were retrieved from the Geological Survey of India (GSI), and the layers were constructed using ArcGIS software | Vector layer, Scale 1:445,500 | Geological Survey of India |
Soil | Digital shapefiles were retrieved from the Digital Soil Map of the World, and the layers were constructed using ArcGIS software | Vector layer, Scale 1:5,000,000 | Digital Soil Map of the World |
Landslide Inventory | Data for the landslide inventory was sourced from the NASA Global Landslide Catalog, the Geological Survey of India, Google Earth, and GPS | Vector layer with point data | NASA Landslide Catalog, GSI |
For the past two decades, Geographic Information Systems (GIS) have proven to be a vital tool in effectively analyzing landslide hazards, vulnerabilities, and risks [274]. This research developed a GIS spatial database (ArcGIS 10.4.1), encompassing 23 landslide conditioning factors. We leveraged high-resolution data from the Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) and LANDSAT-8/9, which were retrieved from the Alaska ASF and USGS websites. The ALOS PALSAR Digital Elevation Models (DEMs), with a spatial resolution of 12.5 m, were combined using the "Mosaic To New Raster" tool in ArcGIS’s "Data Management Tools." The DEMs were then processed through the "Hydrology" tool in the "Spatial Analyst Tools" to clip and fill the datasets.
Using the processed DEMs, we generated thematic layers for key parameters such as Elevation, Slope, Aspect, Curvatures (Plane, Profile, Standard), Terrain Ruggedness Index (TRI), Topographic Position Index (TPI), Stream Power Index (SPI), and Topographic Wetness Index (TWI). Additionally, satellite images were radiometrically corrected on the GIS platform to compute the LULC, NDVI, and mNDWI layers. The satellite images were obtained from the USGS (United States Geological Survey) portal (https://earthexplorer.usgs.gov/).
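The study does not spell out the index formulas, so the sketch below uses the commonly cited definitions: NDVI = (NIR − Red)/(NIR + Red), mNDWI = (Green − SWIR)/(Green + SWIR), TWI = ln(a / tan β), and SPI = a · tan β, where a is the specific catchment area and β the slope angle. The function names and sample values are hypothetical, and the exact variants computed in ArcGIS may differ.

```python
import math

def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b); 0.0 when a + b == 0."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)

def mndwi(green, swir1):
    """Modified Normalized Difference Water Index from green and SWIR reflectance."""
    return normalized_difference(green, swir1)

def twi(specific_catchment_area, slope_deg, eps=1e-6):
    """Topographic Wetness Index, ln(a / tan(beta)); eps avoids division by zero on flats."""
    tan_beta = max(math.tan(math.radians(slope_deg)), eps)
    return math.log(specific_catchment_area / tan_beta)

def spi(specific_catchment_area, slope_deg):
    """Stream Power Index, a * tan(beta)."""
    return specific_catchment_area * math.tan(math.radians(slope_deg))

# Hypothetical single-pixel values for illustration.
example_ndvi = ndvi(nir=0.5, red=0.1)
example_twi = twi(specific_catchment_area=100.0, slope_deg=45.0)
```

In practice these functions would be applied cell by cell to the reflectance and DEM-derived rasters.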
For precipitation data, we gathered long-term annual rainfall (1901–2024) information from the India Meteorological Department (IMD) website (https://www.imdpune.gov.in). Thiessen polygons were created using the "Proximity" tool in ArcGIS to determine the influence zones of rainfall measurement stations.
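Table 1 names Inverse Distance Weighting (IDW) as the method used to build the rainfall layer. A minimal sketch of IDW at a single location follows; the power of 2, the function name `idw`, and the station values are assumptions for illustration, since the study does not report its interpolation settings.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (sx, sy, value) tuples in a common projected CRS.
    """
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # query point coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Two hypothetical stations 10 km apart with mean annual rainfall in mm.
toy_stations = [(0.0, 0.0, 1000.0), (10000.0, 0.0, 3000.0)]
midpoint_rainfall = idw(toy_stations, 5000.0, 0.0)
```

Nearer stations dominate the estimate, so the midpoint between two stations receives the average of their values.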
Further thematic layers for geological features such as lithology, lineaments, geomorphology, and geology were derived from the Geological Survey of India (GSI) portal (https://www.gsi.gov.in). The distance-to-roads layer was created from OpenStreetMap (OSM) data and analyzed with the "Spatial Analyst Tool."
Finally, all layers were resampled to a 30 m spatial resolution and reclassified with the “Reclassify” function of the “Spatial Analyst Tools” in ArcGIS 10.4.1. They were then supplied to the Machine Learning (ML) algorithms to produce the final landslide susceptibility zonation of the region.
For landslide susceptibility zoning, various machine learning models such as Support Vector Machine (SVM), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), Random Forest (RF), Extremely Randomized Trees (ET), Logistic Regression (LR), and ensemble meta-learning classifiers (MC) were employed. The overall methodology for this research is visually depicted in Fig. 3.
Fig. 3
Graphical presentation of the methodological flowchart of the present study
Integrating high-resolution spatial data and advanced machine learning techniques, this multi-faceted approach provides a robust framework for creating accurate and reliable landslide susceptibility maps.
Methodology
In this study, six established machine learning algorithms were employed to predict landslide susceptibility within the Namchi District of Sikkim. The selection of these models—Support Vector Machine (SVM), Gradient Boosting (GB), Extreme Gradient Boosting (XGB), Random Forest (RF), Extra Trees (ET), and Logistic Regression (LR)—was a deliberate choice. These algorithms were chosen for their proven effectiveness in handling high-dimensional geospatial datasets, identifying complex spatial patterns, and exhibiting success in previous landslide prediction applications [76, 77]. The diversity in model types, ranging from simpler linear models to advanced ensemble techniques, provides a comprehensive analytical framework.
The models were rigorously trained using a comprehensive set of environmental features known to influence landslide occurrence in mountainous regions, including elevation, slope, land use/cover, rainfall, and proximity to infrastructure [221, 222, 270]. Collecting and processing these varied input features is critical, as landslide initiation is rarely tied to a single factor but rather to the complex interplay of numerous topographic, hydrologic, and anthropogenic variables.
Methodology and significance of correlation matrix for landslide study
The landslide-related data were loaded from an Excel file into a pandas DataFrame. Numerical columns were then identified and selected for analysis, and the data were checked for missing values; none were found. The core step was computing the pairwise correlation coefficients for all selected numerical features, which yields a correlation matrix. This matrix quantifies the linear relationship between each pair of variables, ranging from −1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 indicating no linear correlation [209]. A correlation matrix is valuable in landslide studies because it reveals interdependencies between causative factors such as elevation, slope, aspect, rainfall, land cover, and geological features. By identifying strong positive or negative correlations, researchers can see which factors tend to vary together, which helps in identifying potential triggers and predisposing conditions for landslides. The correlation matrix also helps detect multicollinearity among predictor variables, which is essential for selecting independent variables in the statistical and machine learning approaches used for landslide susceptibility mapping and risk assessment (Fig. 4).
Fig. 4
Correlation matrix among the landslide causative factors
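The pairwise computation behind such a matrix is the Pearson coefficient, which in pandas corresponds to `DataFrame.corr()` with its default method. The dependency-free sketch below spells out the calculation; the column names, toy values, and the 0.9 screening threshold are hypothetical and are not taken from the study's dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (undefined for constant input)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Nested dict of pairwise Pearson coefficients for a dict of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

def highly_correlated_pairs(matrix, threshold=0.9):
    """Pairs with |r| >= threshold, a simple multicollinearity screen."""
    names = list(matrix)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if abs(matrix[a][b]) >= threshold]

# Hypothetical factor values at five sample points.
toy_factors = {
    "elevation": [200.0, 800.0, 1500.0, 2400.0, 3100.0],
    "slope": [5.0, 12.0, 20.0, 33.0, 41.0],
    "twi": [12.0, 10.0, 8.0, 5.0, 4.0],
}
toy_matrix = correlation_matrix(toy_factors)
flagged = highly_correlated_pairs(toy_matrix)
```

Pairs flagged by such a screen are candidates for removal or merging before model training.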
Support vector machine (SVM)
SVM is a powerful supervised classification algorithm renowned for its ability to manage high-dimensional data and effectively model non-linear relationships between features [45]. The core idea behind SVM is to find an optimal hyperplane that maximizes the margin of separation between the two classes (landslide-prone and stable categories). This maximization of the margin is essential, as it enhances the model's ability to generalize to new, unseen geospatial data.
The optimization problem for SVM is formulated as:
$$\min_{w,\,b} \ \frac{1}{2}\lVert w \rVert^{2} \quad \text{subject to} \quad y_i\,(w \cdot x_i + b) \ge 1, \quad i = 1, \dots, n$$
where $w$ is the weight vector, $x_i$ are the feature vectors, $b$ is the bias term, and $y_i \in \{-1, +1\}$ are the labels. Critically, the use of non-linear kernels in SVM allowed the model to capture complex, non-obvious relationships among topographic, hydrological, and land-use factors influencing landslide risks across the region.
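To make the margin idea concrete, the sketch below checks the standard hard-margin constraint y_i (w · x_i + b) ≥ 1 for a hand-picked hyperplane and evaluates a radial basis function (RBF) kernel, one common non-linear kernel. The study does not state which kernel or parameters it used, so the kernel choice, `gamma`, the weights, and the toy points are all assumptions.

```python
import math

def functional_margins(w, b, X, y):
    """y_i * (w . x_i + b) for every sample; all >= 1 means the constraints hold."""
    return [yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi, yi in zip(X, y)]

def margin_width(w):
    """Distance between the two margin hyperplanes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wj * wj for wj in w))

def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - z||^2); gamma is a hypothetical setting."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, z)))

# Toy two-feature samples: +1 = landslide, -1 = non-landslide (hypothetical).
X = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
y = [1, 1, -1, -1]
w, b = (0.5, 0.5), -1.0
margins = functional_margins(w, b, X, y)
width = margin_width(w)
```

Samples whose functional margin equals exactly 1 lie on the margin and act as support vectors; here those are (2, 2) and (0, 0).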
Gradient boosting (GB)
Gradient Boosting (GB) is an ensemble method that iteratively builds a series of weak learners (decision trees) in a sequential manner, where each subsequent tree is trained to correct the residual errors made by the previous ones [54]. This sequential error correction process allows GB to achieve high predictive accuracy. The prediction at iteration $m$ is given by:
$$F_m(x) = F_{m-1}(x) + \eta\, h_m(x)$$
where $h_m$ is the tree fitted at iteration $m$ and $\eta$ is the learning rate.
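The sequential update can be illustrated with one-split regression stumps as weak learners. This sketch uses squared-error loss on a single feature for brevity, whereas classification boosters typically minimise log-loss; the function names, the 50 rounds, the learning rate of 0.1, and the toy data are all assumptions.

```python
def fit_stump(xs, residuals):
    """Least-squares one-split stump on a single feature (needs >= 2 distinct values)."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lmean, rmean)
    _, thr, lmean, rmean = best
    return lambda x: lmean if x <= thr else rmean

def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """F_m = F_{m-1} + learning_rate * h_m, with each h_m fitted to current residuals."""
    f0 = sum(ys) / len(ys)
    stumps, preds = [], [f0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_stump(xs, residuals)
        stumps.append(h)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + learning_rate * sum(h(x) for h in stumps)

def mse(predict, xs, ys):
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical slope angles (degrees) and landslide labels.
slopes = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
model = gradient_boost(slopes, labels)
baseline = lambda x: sum(labels) / len(labels)
```

A small learning rate shrinks each tree's contribution, so many rounds are needed, but the fit improves steadily over the constant baseline.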
Extreme gradient boosting (XGB)
Extreme Gradient Boosting (XGB) represents an optimized and scalable version of GB [41]. XGBoost significantly improves model performance by introducing a regularization term to explicitly penalize model complexity, thereby preventing overfitting and improving generalization. The objective function is expressed as:
$$\mathcal{L} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k)$$
where $l$ is the loss between the label $y_i$ and the prediction $\hat{y}_i$, and $\Omega(f_k)$ is the regularization term for tree $f_k$. Both GB and XGB were trained using 23 environmental factors such as elevation, slope, proximity to roads, and soil type. These ensemble models are particularly effective in capturing the intricate, non-linear interactions between these numerous input variables, which is paramount for accurate landslide susceptibility predictions.
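The regularization term commonly given for XGBoost is Ω(f) = γT + ½ λ Σ w_j², where T is the number of leaves and w_j are the leaf weights. Under that form the optimal leaf weight is −G / (H + λ), with G and H the sums of first and second loss derivatives in the leaf. The sketch below evaluates both; the values of γ and λ are hypothetical, since the study does not report its hyperparameters.

```python
def omega(leaf_weights, gamma=1.0, lam=1.0):
    """Complexity penalty: gamma * (number of leaves) + 0.5 * lam * sum of squared weights."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)

def optimal_leaf_weight(gradients, hessians, lam=1.0):
    """Closed-form leaf weight -G / (H + lam) for the regularized objective."""
    return -sum(gradients) / (sum(hessians) + lam)

# Three samples in one leaf under squared-error loss: gradient = prediction - label,
# hessian = 1. With predictions at 0 and labels at 1, each gradient is -1.
grads, hess = [-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]
unregularized = optimal_leaf_weight(grads, hess, lam=0.0)
regularized = optimal_leaf_weight(grads, hess, lam=3.0)
penalty = omega([0.5, -0.5], gamma=1.0, lam=2.0)
```

Increasing λ pulls leaf weights toward zero, which is how the penalty limits overfitting.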
Random forest (RF)
Random Forest (RF) is a robust ensemble learning algorithm that operates by constructing multiple decision trees during training using bootstrap aggregating (or bagging). It aggregates their predictions (voting for classification) to improve overall classification accuracy and stability [31]. The final prediction for an input $x$ is calculated as the average of predictions from each individual tree:
$$\hat{y}(x) = \frac{1}{T}\sum_{t=1}^{T} h_t(x)$$
where $h_t(x)$ is the prediction of tree $t$ and $T$ is the number of trees.
RF is highly valued in geospatial applications for its ability to handle large datasets with high-dimensional features and for its inherent robustness to overfitting. Extra Trees (ET) is a variant of RF that introduces even more randomness during the tree construction phase by randomly selecting both a subset of features and the split points at each node [74]. This added randomness often leads to better computational efficiency and can sometimes result in superior predictive performance by increasing the diversity among the individual trees.
Extra trees (ET)
The Extra Trees (ET) algorithm, a randomized modification of Random Forest (RF), is particularly well-suited for Landslide Susceptibility Mapping (LSM) due to its ability to handle complex geospatial data. The core mechanism involves introducing extreme randomness when constructing decision trees, both in feature selection and split-point determination [265, 266, 268, 269]. This process effectively creates highly diverse individual predictors, which is crucial because landslide occurrence is governed by the intricate, non-linear interactions of numerous factors (e.g., slope, rainfall, lithology, and land cover). By using the entire training dataset without bootstrapping, ET maintains high predictive power while often improving computational efficiency. The final susceptibility prediction, $\hat{y}(x)$, for a given location $x$ is derived by averaging the outputs of the individual trees $h_t(x)$, yielding a continuous susceptibility index or probability:
$$\hat{y}(x) = \frac{1}{T}\sum_{t=1}^{T} h_t(x)$$
where $T$ is the number of trees.
This ensemble aggregation enhances the model's robustness against noise and overfitting, which are common issues in real-world geospatial datasets that often contain errors or incomplete data. Therefore, ET's structural design allows it to accurately model the heterogeneous environmental conditions that define landslide risk, resulting in reliable and highly generalized susceptibility maps [64, 65].
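The two differences between RF and ET described above, bootstrap sampling versus the full training set and optimized versus random split points, can be sketched as follows. The toy "trees" are one-split stumps on a single hypothetical slope feature; the function names and values are illustrative assumptions, not the study's configuration.

```python
import random

def bootstrap_sample(rows, rng):
    """RF: each tree is trained on a sample drawn with replacement (bagging)."""
    return [rng.choice(rows) for _ in rows]

def random_threshold(values, rng):
    """ET: the cut-point is drawn uniformly between the feature's min and max."""
    return rng.uniform(min(values), max(values))

def ensemble_average(trees, x):
    """Both methods: average the per-tree outputs h_t(x)."""
    return sum(tree(x) for tree in trees) / len(trees)

rng = random.Random(7)
slopes = [5.0, 12.0, 18.0, 25.0, 33.0, 41.0]  # hypothetical slope values (degrees)
rf_rows = bootstrap_sample(slopes, rng)       # same size, may contain repeats
et_rows = list(slopes)                        # ET: full sample, no bootstrap

# Three toy stumps with ET-style random thresholds; each outputs 1.0 above its cut.
thresholds = [random_threshold(et_rows, rng) for _ in range(3)]
trees = [lambda s, t=t: 1.0 if s > t else 0.0 for t in thresholds]
susceptibility = ensemble_average(trees, 30.0)
```

Averaging many such randomized trees turns individual hard splits into a graded susceptibility value between 0 and 1.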
Logistic regression (LR)
Logistic Regression (LR) is utilized as a fundamental, linear classification algorithm that models the probability of a binary outcome, specifically the likelihood of a landslide, using the logistic function [88]:
$$P(y = 1 \mid x) = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j=1}^{p} \beta_j x_j\right)}}$$
where $P(y = 1 \mid x)$ is the probability of a landslide occurring, $\beta_0$ is the intercept, and $\beta_j$ are the coefficients of the features $x_j$. While LR is significantly less complex than the ensemble methods, its inclusion is crucial as it provides a valuable baseline model for understanding and quantifying the linear relationships between features (like slope and elevation) and landslide susceptibility. However, as expected, LR struggled to fully capture the complex, non-linear dependencies present in the dataset.
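A worked example of the logistic function makes the baseline concrete. The coefficients and feature values below are hypothetical and are not fitted values from the study; the function name `landslide_probability` is likewise an assumption.

```python
import math

def landslide_probability(features, coefficients, intercept):
    """Logistic model: 1 / (1 + exp(-(intercept + sum(coef * feature))))."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical model with standardized slope and elevation as inputs.
coefs, intercept = [1.2, 0.4], -0.5
p_gentle = landslide_probability([-1.0, 0.0], coefs, intercept)
p_steep = landslide_probability([2.0, 0.0], coefs, intercept)
p_neutral = landslide_probability([0.0, 0.0], [1.0, 1.0], 0.0)
```

Because the log-odds are linear in the features, a positive slope coefficient raises the predicted probability monotonically as slope increases.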
Meta-learning
To achieve optimal model performance and leverage the unique strengths of each model, a Meta Classifier (MC) was employed. This technique utilizes stacking, a powerful ensemble learning method where the outputs of the individual base models are combined. The aggregated prediction is calculated as a weighted sum:
$$\hat{y}_{MC}(x) = \sum_{i=1}^{M} w_i\, \hat{y}_i(x)$$
where $\hat{y}_i$ is the prediction from base model $i$, and $w_i$ is the optimized weight assigned to it [32]. This ensemble approach allowed the MC to synthesize the information from all base models, resulting in more reliable and accurate predictions for landslide susceptibility than any single model could achieve alone.
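The weighted combination can be sketched as below. The base-model probabilities and weights are hypothetical; dividing by the weight total so the output stays a probability is an assumption of this sketch, and in a full stacking setup the weights would be learned by a meta-learner fitted on held-out base-model predictions.

```python
def meta_predict(base_probs, weights):
    """Weighted combination of base-model probabilities, normalized by the weight total."""
    total = sum(weights[name] for name in base_probs)
    return sum(weights[name] * p for name, p in base_probs.items()) / total

# Hypothetical landslide probabilities from the six base models at one location.
base_probs = {"SVM": 0.70, "GB": 0.90, "XGB": 0.85, "RF": 0.95, "ET": 0.80, "LR": 0.60}

equal_weights = {name: 1.0 for name in base_probs}
tree_heavy = {"SVM": 0.5, "GB": 2.0, "XGB": 2.0, "RF": 2.0, "ET": 2.0, "LR": 0.5}

p_equal = meta_predict(base_probs, equal_weights)
p_weighted = meta_predict(base_probs, tree_heavy)
```

With equal weights the result is the plain mean of the base predictions; shifting weight toward the stronger models moves the combined estimate toward their outputs.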
Description of the parameters
Landslides are natural events that can be influenced by various environmental, topographic, and human-induced factors. Several parameters, such as elevation, slope, aspect, Topographic Position Index (TPI), curvature, and many others, play significant roles in determining landslide susceptibility. These factors act either directly, by influencing soil stability, water retention, or erosion, or indirectly, through their effects on vegetation, human activities, or geological conditions. The factors contributing to landslides and their role in triggering such events are elaborated below:
Elevation
Elevation is a critical conditioning factor for landslides. As a rule, higher elevations are accompanied by steeper slopes, which are predisposed to failure because greater relative relief stores more potential energy. The steep gradients that tend to occur at elevated locations, together with fragmented rock and structurally poor soils, have been identified as factors that increase landslide risk. Changes in elevation also lead to climatic variations that influence weathering and erosion processes. Freeze-thaw cycles, which are frequent at higher altitudes, may weaken rock structures and expose them to displacement [63]. In addition, snow accumulation in these areas may place additional load on the geological substrate, leading to landslides under thaw conditions [33].
In geological terms, the triggering of landslides is usually associated with tectonic processes and the existing relief of the landscape. Tectonically active areas with high rates of rock movement also show higher landslide rates, because changing environmental conditions predispose the development of unstable slopes [70]. For example, the correlation between tectonic forces and increased topographic relief is well documented, which implies that unstable slopes may form in areas exposed to such geological forces [164]. The elevation of the study area, derived from Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) data, ranges from 165 to 5683 m above sea level (Fig. 5a).
Fig. 5
Landslide conditioning factors. (a) Elevation (b) Slope (c) Aspect (d) Topographic Position Index (TPI) (e) Profile curvature (f) Plane curvature (g) Standard Curvature (h) Roughness index (i) Distance to lineament (DTL) (j) Geology 1. Chungthang fm. (central crystalline gneissic complex gp.), 2. Daling gp. (Buxa fm.) 3. Daling gp. (Gorubathan fm.) 4. Daling gp. (Reyang fm.) 5. Kanchenjunga gneiss (central crystalline gneissic complex) 6. Lower Gondwana gp. (Bhareli, Damuda fm.) 7. Lower Gondwana gp. (Rangit pebble slate fm.), (k) Geomorphology 1. Alluvial plain, 2. Flood plain, 3. Highly dissected hills and valleys, 4. Mass wasting products, 5. Moderately dissected hills and valleys, 6. Snow Cover, 7. Waterbody – river (l) Lithology 1. Amphibolite, 2. Banded migmatite, garnet bt gneiss, mica schist, 3. Boulder slate, conglomerate, phyllite, 4. Calc granulite with /without quartzite, 5. Calc silicate rock, 6. Chlorite sericite schist and quartzite, 7. Dolomitic quartzite, chert, phyllite, slate, 8. Meta greywacke, 9. Mylonitic granite gneiss, 10. Quartz arenite, 11. Quartz arenite, black slate, cherty phyllite, 12. Quartzite, 13. Sandstone, shale with minor coal (m) Sediment transport index (STI), (n) Topographic ruggedness index (TRI), (o) Stream power index (SPI), (p) Topographic wetness index (TWI), (q) Longterm mean annual rainfall (1901–2024), (r) Distance from drainage (DTD), (s) Distance from roads (DTR), (t) Modified normalized difference water index (mNDWI), (u) Normalized difference vegetation index (NDVI), (v) Soil (FAO) and (w) Land use landcover (LULC)
Slope
Along with elevation, slope inclination is among the most commonly used factors for defining the likelihood of a landslide. Steeper areas are more prone to landslides because gravity acts more directly against the bonds that hold soil and rock in place. Studies show that landslides can occur on slopes of different gradients, but the equilibrium between driving and resisting forces is most easily disturbed on steep slopes [14, 150].
Other topographic measures, such as TPI and curvature, are also used to evaluate slope stability. They describe how gravitational forces and hydrological conditions interact with the shape of the landscape to cause landslides. For example, small local topographic zones with little root reinforcement, often located on steeply sloping land, have been noted as prone to slides [63, 150]. Vegetation, in turn, can stabilize slopes by improving soil cohesion, thereby reducing the likelihood of landslides in forested areas [63, 211].
Aspect
Aspect has been found to significantly influence landslide dynamics, mainly through its effect on microclimate parameters such as sunlight exposure and moisture retention. In the Northern Hemisphere, south-facing slopes receive more sunlight and may therefore dry out faster, which reduces soil cohesion and makes the soil easier to erode. North-facing slopes, on the other hand, can retain water and are prone to landslides during heavy precipitation or snowmelt [34]. The literature reports that aspect acts together with other geo-environmental factors that affect landslide occurrence [34, 155]. Nonetheless, studies have indicated that the effect of aspect can be obscured by other parameters and may be more evident under specific geological conditions, for example where clay deposits are present [276].
Topographic position index (TPI)
The Topographic Position Index (TPI) is a quantitative measure of how high or low a location sits relative to the surrounding terrain, and it is used to identify ridges, valleys, and plateaus. Valleys have negative TPI values and can indicate high landslide susceptibility, because greater water accumulation leads to higher soil saturation. Ridgelines, which have positive TPI values, are less prone to landslides, although the steep slopes crossing these areas may fail under heavy runoff [73, 147]. TPI has also been noted to play an important role in landslide risk assessment in models that combine diverse topographic features to derive landslide susceptibility [106] (Fig. 5d).
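The sign convention of TPI can be illustrated with a short Python sketch. It assumes the simplest common definition, the elevation of a cell minus the mean elevation of its eight immediate neighbours; the neighbourhood size used in the study is not stated, and the elevation windows below are hypothetical.

```python
def tpi(dem, row, col):
    """TPI of an interior cell: its elevation minus the mean of its 8 neighbours."""
    neighbours = [
        dem[r][c]
        for r in (row - 1, row, row + 1)
        for c in (col - 1, col, col + 1)
        if (r, c) != (row, col)
    ]
    return dem[row][col] - sum(neighbours) / len(neighbours)

# Hypothetical 3x3 elevation windows in metres.
valley = [[520, 515, 522], [510, 500, 512], [518, 514, 521]]
ridge = [[480, 485, 478], [490, 500, 488], [482, 486, 479]]

print(tpi(valley, 1, 1))  # negative: the cell lies below its surroundings
print(tpi(ridge, 1, 1))   # positive: the cell lies above its surroundings
```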
Plane curvature
Plane curvature is important for the geotechnical stability of slopes and therefore for the occurrence of landslides. Concave slopes curve inward and thus accumulate water in their low areas. This raises the moisture content of the soil, which weakens its shear strength and makes it more prone to sliding, especially during heavy rain [8, 149, 235, 237]. Convex slopes, on the other hand, curve outward and usually facilitate drainage because water disperses as it moves downslope. Water accumulation is limited and erosion is minimized, which increases the stability of the hillslope [161, 226, 235, 237]. Studies have indicated that concave forms are more often associated with landslides because the additional moisture degrades soil strength, whereas convex forms are more stable owing to better drainage [8, 235, 237].
Profile curvature
Profile curvature, the curvature of the surface in the direction of steepest descent, also significantly influences slope stability. A concave profile concentrates water on the lower sections of the slope and increases soil moisture, thereby weakening the soil structure, as is the case with plane curvature [161, 235, 237]. In contrast, where the profile is convex, drainage capacity is greater, which reduces water concentration and stabilizes the slope mass [226, 235, 237]. Profile curvature is therefore closely linked to landslide susceptibility, which demonstrates the significance of these topographic characteristics in slope stability analysis.
Standard curvature
Standard curvature combines plane and profile curvature and also plays a significant role in slope stability. Positive standard curvature is associated with easy drainage of water, which minimizes erosion and increases mass stability. Slopes with negative standard curvature, on the other hand, tend to accumulate water, which weakens the soil structure and makes the soil prone to failure, particularly during severe rain [161, 235, 237]. Curvature analysis has been shown to be crucial for identifying the conditions that precipitate landslides, so curvatures can serve as functional parameters in risk assessments and land management approaches [8, 149].
In conclusion, examining how plane, profile, and standard curvatures affect slope dynamics is crucial for identifying where landslides are likely to happen and for devising ways to limit them. The hydrological processes induced by these curvatures directly affect soil moisture content and its distribution, which is vital in slope stability analysis [161, 235, 237].
Roughness
The Roughness Index (RI) measures the unevenness of the terrain surface and helps in assessing landslide susceptibility. Landscapes with higher roughness tend to have steeper and more unstable slopes, which increases the chance of failure [217]. This trend is supported by the fact that irregular ground may disrupt drainage and cause water to pool. The additional weight of this accumulated water may lead to landslides during heavy precipitation [42, 171]. Moreover, rugged topography is prone to erosion; erosion, in turn, may gradually loosen the soil and make landslides more likely, especially after rainfall [186]. The distribution of shear stress across rough terrain also affects the likelihood of slope failure, because irregularities may introduce weak spots within the material [153]. In addition, rugged hillslopes can hamper the growth of vegetation. Plants generally help to stabilize slopes because their roots hold the ground in place and make it more resistant to dislodgement [21]. The Roughness Index is therefore a crucial part of landslide risk assessment, particularly when it is combined with other environmental and anthropogenic parameters [240].
Distance from lineament (DTL)
Besides surface roughness, the proximity of slopes to geological lineaments, such as faults and fractures, also influences landslide occurrence. Lineaments are regarded as markers of weakness in the Earth's crust. Slopes adjacent to such features are more likely to collapse because the strength of the underlying material is reduced [267]. These geological structures may also promote water infiltration, which raises soil moisture and considerably decreases the binding strength of the ground, increasing the landslide hazard [171, 240]. The accumulation of stress along lineaments can disturb neighboring slopes, especially those with high gradients or those saturated by excessive precipitation [42]. The weakening of materials along these lineaments predisposes them to gravitational movement, increasing landslide occurrence [186]. In addition, seismic activity in tectonically active areas may trigger landslides, particularly close to such structural lineaments, so observing its effect on slope deformation is fundamental [267]. Distance to lineaments is thus a decisive measure in landslide risk estimation: the shorter the distance, the more prone a location is to landslides [171].
Geology
Many geological factors play essential roles in the landslides observed in the Namchi district of Sikkim. The district lies near the boundary between the Indian and Eurasian tectonic plates, which makes it seismically active. Seismic activity can loosen the Earth's crust and thereby promote landslides, especially during earthquakes [169]. This tectonic movement produces different types of geological structures, including faults and fractures, that may destabilize slopes and make them more prone to failure [169, 214].
The geological nature of the region plays a significant role in slope stability. The rocks in Namchi are an assortment of different types, including sedimentary, metamorphic, and igneous rocks. The sedimentary rocks are more susceptible to erosion, whereas the metamorphic ones, although more resilient, may have weak points vulnerable to stress fractures [12]. Water flow through these rocks and the level of soil saturation are primarily controlled by rock permeability. More floodwater may weaken soil cohesion, increasing the possibility of landslides, particularly on bank slopes [75, 215]. These conditions may be aggravated by extreme rainfall events, which rapidly saturate the soil and trigger landslides [49, 215].
Anthropogenic influences such as road construction and deforestation also increase the danger of landslides by altering natural slopes and drainage systems [69, 230]. Such activities can undermine slope stability and increase the risk of landslides, especially when they coincide with the heavy precipitation events that are widespread in the area [69]. Understanding the relationship between these geological and anthropogenic factors is crucial for assessing landslide risk and developing effective mitigation measures for the Namchi district.
Geomorphology
In addition, the area's geomorphological features, namely the steep and rugged terrain typical of the Lesser Himalayas, make the district prone to landslides. Slopes steeper than 30 degrees are highly prone to gravitational failure under severe rain or earthquakes [69, 234]. Geomorphology also interacts with stream channels and seasonal streams, which erode the base of slopes and contribute to the emergence of landslides [2, 75]. Moreover, weathering increases the amount of loose material, which reduces stability, and seismic ground shaking can make already weak slopes even more unstable [43, 251, 253, 254].
In summary, the combination of geological conditions, including lithology, topography, and hydrology, with human factors creates a landscape in the Namchi district that is highly susceptible to landslides. Understanding these interrelations is essential for effective hazard mapping and the implementation of prevention measures.
Lithology
Lithology plays an essential role in determining the rate and magnitude of landslides, especially in geological terrain such as the Namchi district in Sikkim, which is characterized by varied rock types and dynamic topography. Slope stability in the area results from a complex interplay of factors, including rock type, geological structure, and environmental conditions. Rocks that resist weathering, such as granite and gneiss, are in most cases less prone to landslides. However, fractures may introduce weaknesses, mainly when seismic movements occur, since studies show that tectonic movement can weaken these structures and lead to slope failures [13, 191].
On the other hand, softer sedimentary rocks such as shale and mudstone weather and erode quickly, which reduces soil cohesion. This predisposition is enhanced by heavy rain, because excess water raises the pore pressure in wet soil and destabilizes slopes [136, 191, 205]. The tendency of these lithologies to erode makes them especially hazardous during the monsoon, when large amounts of rain fall, adding to the landslide risk in the Namchi area [22].
In addition, the weathering of different lithologies aggravates landslide risk. Sandstone and other rocks that weather quickly may break apart to produce loose geological material. Intense weathering can leave the rock unable to support the landforms above it, creating conditions that favor landslides [12, 60]. The degree of weathering and the presence of clay minerals also significantly influence the hydrological behavior of slopes: as moisture retention increases, slope stability decreases, especially in steep areas such as Namchi [190, 264].
Landslide risk is also affected by structural features of the rock, such as faults and joints. These natural weak points act as potential slip surfaces, which are exploited during destabilizing processes so that the balance is tipped toward failure. Such structural weaknesses, combined with the high tectonic activity of the area, contribute significantly to the frequent landslides, which can be activated suddenly. This illustrates the complex interactions between lithological composition, geological characteristics, and external triggers such as seismic vibrations [13, 191]. It also highlights the importance of in-depth geological examination in landslide hazard mitigation projects, both to understand the threats posed by particular lithological settings and to curtail them adequately.
Human activities also worsen this situation. Infrastructure development and agricultural activities on unstable slopes raise the landslide susceptibility of places in Namchi. Land-use planning that ignores geological features and land stability increases the risk of landslides during heavy precipitation or earthquakes; geological information should therefore be taken into account in urban and rural planning [112, 138]. In conclusion, landslide risk assessment and management should consider the combined effects of lithology, weathering processes, structural geology, and human impacts in Namchi and other areas.
Sediment transport index (STI)
The Sediment Transport Index (STI) is a critical parameter for assessing sediment movement processes and their effects on landslides. It combines various environmental conditions (including rainfall, slope steepness, soil properties, and vegetation cover) to estimate the risk of sediment mobilization in different landscapes. High STI values are usually linked to higher erodibility and slope instability. During heavy rain, steep slopes with loose, unprotected soils favor the displacement of sediment, which compromises slope stability and increases the likelihood of landslides [42, 115].
Water-induced erosion is one mechanism through which STI affects landslide risk. Areas with high STI values are usually characterized by significant rainfall and runoff, which, in addition to increasing sediment transport efficiency, erode the surface soil and make slopes unstable in the long run. Additional destabilization can arise from the accumulation of displaced material at the base of a slope, which creates conditions under which landslides are likely [101, 208]. Vegetation is a mediating factor in this dynamic: the denser the vegetation, the more stable the soil, because it is protected against erosion and STI values are reduced. Sparsely vegetated areas, or those with damaged soil structure, are more likely to experience erosion, and they show higher STI values and an increased risk of landslides [120, 230].
The effects of STI on landslide susceptibility are long-term. Sustained high STI values may result in constant erosion and gradual weakening of slope materials, increasing the occurrence and severity of landslides. Systematic monitoring of STI data can identify sensitive areas, giving researchers and planners the information required to anticipate future landslide risks and improve land-use approaches [250, 257]. Integrating STI into landslide hazard models strengthens risk evaluation and hazard prevention, which can eventually lead to better land management and disaster preparedness policy [131, 132, 231].
To sum up, STI is an essential instrument for analyzing landslide dynamics as part of a complex system of interactions. A landslide susceptibility assessment that integrates the various environmental factors gives researchers and land managers a sound basis for developing landslide mitigation strategies and land-use plans. The resulting STI values ranged from 7.37 to over 19,783.5, and a map (Fig. 5m) was created to represent their geographic variability.
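The text describes STI qualitatively and does not state the formula that was used. As an illustration only, the Python sketch below assumes a commonly used terrain-based formulation, STI = (a / 22.13)^0.6 (sin β / 0.0896)^1.3, which depends on specific catchment area and slope alone. Whether the study used this form is not confirmed, and the function and parameter names are hypothetical.

```python
import math

def sediment_transport_index(specific_catchment_area, slope_deg):
    """Terrain-based STI under the assumed formulation described above.

    specific_catchment_area: upslope area per unit contour length (m).
    slope_deg: local slope angle in degrees.
    """
    area_term = (specific_catchment_area / 22.13) ** 0.6
    slope_term = (math.sin(math.radians(slope_deg)) / 0.0896) ** 1.3
    return area_term * slope_term

# Hypothetical cells with the same contributing area but different slopes.
gentle = sediment_transport_index(500.0, 5.0)
steep = sediment_transport_index(500.0, 35.0)
print(gentle, steep)
```

Under this assumption the index grows with both contributing area and slope, which matches the association between high STI values and steep, erodible terrain described in the text.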
Topographic ruggedness index (TRI)
The Topographic Ruggedness Index (TRI) is an essential indicator of terrain ruggedness and its bearing on landslide potential. The TRI measures the elevation change within a given area, which describes the irregularity and ruggedness of the landscape. Studies have shown that TRI values are usually higher in more unstable and uneven terrain, which may increase the risk of landslides [196]. In addition, areas of significant ruggedness are often more susceptible to external triggers such as heavy rain, earthquakes, or human disturbance that may set soil and rock in motion. Conversely, landscapes with lower TRI are likely to have gentler slopes, implying a lower chance of landslides. However, other factors, such as soil type or the level of vegetation cover, can be far more crucial to landslide development [187, 216].
Moreover, TRI should be considered in combination with other environmental factors to better assess landslide risk. For example, the links between soil mechanics and hydrological conditions are essential, since variations in soil moisture and pore water pressure may significantly affect slope stability [27]. Many sources have described the intricate relationships among landslide-causing processes. Although topographic ruggedness merits investigation, it should be framed in the larger context of geological, hydrological, and human factors. Such detailed study helps to establish better models for predicting landslide occurrence [96, 111].
Lastly, since terrain varies from place to place, local risk assessment can be improved by identifying how the local terrain type interacts with TRI and other landslide factors. With detailed inventories of historical landslides and modeling methods that integrate various predisposing factors, scientists can improve the accuracy of landslide prediction in different geographic settings [51, 245]. The TRI of the study area ranged from 18.19 to 1370 (Fig. 5n).
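How TRI summarizes local elevation change can be sketched in Python. The sketch assumes one common definition, the square root of the summed squared elevation differences between a cell and its eight neighbours; the exact definition used to produce Fig. 5n is not stated, and the elevation windows are hypothetical.

```python
import math

def tri(dem, row, col):
    """TRI of an interior cell under the assumed 8-neighbour definition."""
    centre = dem[row][col]
    squared_diffs = [
        (dem[r][c] - centre) ** 2
        for r in (row - 1, row, row + 1)
        for c in (col - 1, col, col + 1)
        if (r, c) != (row, col)
    ]
    return math.sqrt(sum(squared_diffs))

# Hypothetical 3x3 elevation windows in metres.
flat = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
rough = [[103, 103, 103], [103, 100, 103], [103, 103, 103]]

print(tri(flat, 1, 1), tri(rough, 1, 1))
```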
Stream power index (SPI)
The Stream Power Index (SPI) and the Topographic Wetness Index (TWI) are crucial hydrological and geomorphological indices for determining landslide risk in watersheds. The SPI is the product of the flow (discharge) and the steepness of the slope, which allows the danger of erosion and landslides in an area to be estimated. Steep slopes and high water discharge produce high SPI values, which indicate greater erosive capacity of flowing water and make soils more prone to landslides when they are loose or saturated [42]. Low SPI values, on the other hand, reflect gentler slopes and reduced flow intensity, which usually implies a lower likelihood of major erosion and landslides. However, other variables, including soil composition and anthropogenic activity, also play a prominent role in landslides [179]. The SPI in the present study was calculated as follows:
$$\mathrm{SPI} = a \tan \beta$$
where $a$ is the specific upslope (catchment) area and $\beta$ is the slope of the area. The Raster Calculator tool in the “Spatial Analyst tools” was used to compute the SPI of the study area. The calculated SPI values ranged between 0.0001 and 936,247, which gives significant insight into the erosional power of the landscape (Fig. 5o).
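The SPI computation can be sketched for a single cell in Python, assuming the product form of specific catchment area and the tangent of the slope angle described above. The function name and input values are hypothetical, and the study itself performed this step with the Raster Calculator.

```python
import math

def stream_power_index(specific_catchment_area, slope_deg):
    """SPI as specific catchment area times tan(slope), slope in degrees."""
    return specific_catchment_area * math.tan(math.radians(slope_deg))

# Hypothetical cells: same contributing area, different slopes.
print(stream_power_index(250.0, 10.0))
print(stream_power_index(250.0, 40.0))
```

A flat cell yields an SPI of zero regardless of contributing area, which is consistent with the very small minimum value reported for the study area.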
Topographic wetness index (TWI)
The TWI is derived from topographic information and indicates the potential for water accumulation in a region. It depends on the upslope contributing area and the slope angle, as given by the equation:
$$\mathrm{TWI} = \ln\left(\frac{a}{\tan \beta}\right), \qquad a = \frac{A}{L}$$
where $a$ is the specific upslope (catchment) area, $\beta$ is the slope of the region, $A$ is the total area of the basin, and $L$ is the contour length [23, 174]. High TWI values indicate gentle slopes and large contributing areas, where water tends to accumulate, raising soil moisture and therefore landslide susceptibility, particularly under heavy rainfall [42, 184]. Areas with lower TWI values are generally steeper or convex; hence, they gather less water and usually have lower landslide hazard. However, other environmental conditions, including vegetation and soil properties, are also significant determinants [161]. The TWI map was created with the Raster Calculator tool in ArcGIS 10.4 Spatial Analyst Tools. In the study area, the computed TWI values ranged between 0.08 and 23.79 (Fig. 5p).
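A minimal Python sketch of the wetness index follows, assuming the widely used form TWI = ln(a / tan β) with a taken as the specific catchment area. The minimum-slope guard for flat cells and all input values are assumptions added for illustration.

```python
import math

def topographic_wetness_index(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """TWI = ln(a / tan(beta)), with slope in degrees.

    Slopes below `min_slope_deg` are clamped (an assumed convention)
    so that flat cells do not cause a division by zero.
    """
    slope = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(slope))

# Hypothetical cells: same contributing area, gentle versus steep slope.
print(topographic_wetness_index(1000.0, 5.0))
print(topographic_wetness_index(1000.0, 30.0))
```

The gentle cell receives the higher value, in line with the interpretation that low slopes and large contributing areas favor water accumulation.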
Rainfall
Rainfall is one of the major factors that cause landslides, because it changes the water content of the soil and destabilizes slopes. The relationship between rainfall amount and soil saturation conditions landslide vulnerability; excess rain may erode the soil quickly, add weight, and reduce friction between grains, which accelerates the onset of landslides [28, 236]. Moreover, the effect of rain depends on its intensity and cumulative amount: intense rain can produce overland flow and shallow landslides, while prolonged rainfall can saturate the ground [226]. Areas with poor drainage or without protective vegetation cover are particularly vulnerable during intense rainfall, which further shows the relationship between rainfall levels and landslides [97]. Mean rainfall data covering 123 years (1901–2024) were acquired from the India Meteorological Department. The mean annual precipitation was computed from gridded precipitation data, and a rainfall map of the study region was prepared using the Kriging interpolation method. Kriging is considered an adequate approach because it can generate robust and precise estimates for spatially correlated data. The rainfall distribution map showed that the southern parts of the district had the highest annual precipitation (3031.9 mm/year), whereas the lowest was recorded in the northern and western regions (2,622 mm/year) (Fig. 5q).
Distance from drainage (DTD)
Distance to Drainage (DTD) is an essential spatial factor that describes how far a location is from the drainage network of rivers, streams, or drainage ditches. It is a critical measurement in studies of erosion, flooding, and landslide hazard. The nearer a location is to these water features, the more likely the soil is to be saturated; as a result, its structure weakens and the landslide risk increases, especially on steep slopes where water enhances destabilization [124, 177, 180]. Increased moisture usually creates conditions that support landslides because it decreases soil cohesion and adds weight to the soil [124], which can eventually cause it to fail. Moreover, drainage features may undercut adjoining slopes and destabilize the landscape [124, 177]. Areas that lie farther from drainage features tend to be less wet and therefore pose a lower immediate risk of landslides. Nevertheless, they can remain exposed to this risk in extreme weather conditions [113].
Distance from roads (DTR)
Another important spatial variable is Distance to Roads (DTR), which describes the proximity of a site to transport infrastructure and has major implications for landslide assessment. Roadsides may be especially susceptible to landslides because road construction and maintenance are destabilizing and may involve excavation and other forms of terrain modification [165, 166, 170]. Such disturbance undermines the natural cohesion of the soil, and it may also hamper drainage and cause more water to accumulate on slopes, leading to slope failures [166, 170]. Studies also show that the likelihood of landslides increases significantly where roads are concentrated in a particular region or cut across the natural contours of the landscape [61, 137]. Sites that are more distant from infrastructure are subject to less human-caused disturbance, which improves terrain stability, but they remain susceptible to other environmental influences on landslide activity, including rainfall intensity and geological attributes [139, 250].
Knowledge of both DTD and DTR is a key factor in landslide hazard zoning, as it helps resource managers and planners identify susceptible areas and adopt better-informed land-use, erosion management, and flood control measures (Shrestha et al., 2018; Rawat & Joshi, 2011). The use of these distance measures in geographical information systems (GIS) has been decisive for predicting prospective landslides and selecting mitigation strategies [170] (Shrestha and Poudel 2018). Proximity to roads is a vital element in landslide susceptibility mapping, because various publications have shown that areas near roads have a high probability of landslides [1, 7, 122] (Fig. 5s).
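Distance layers such as DTD and DTR are typically Euclidean distance rasters. The brute-force Python sketch below shows the idea on a small grid; the mask, the cell size, and the function name are hypothetical, and a GIS distance tool would be used for rasters of realistic size.

```python
import math

def distance_to_features(feature_mask, cell_size=1.0):
    """Euclidean distance from every cell to the nearest feature cell.

    feature_mask: 2D list in which truthy cells mark a stream or road.
    cell_size: ground size of one cell, in the desired distance unit.
    """
    feature_cells = [
        (r, c)
        for r, row in enumerate(feature_mask)
        for c, value in enumerate(row)
        if value
    ]
    if not feature_cells:
        raise ValueError("the mask contains no feature cells")
    return [
        [
            cell_size * min(math.hypot(r - fr, c - fc) for fr, fc in feature_cells)
            for c in range(len(row))
        ]
        for r, row in enumerate(feature_mask)
    ]

# Hypothetical 3x4 grid: 1 marks a stream or road cell; 12.5 m cells assumed.
mask = [[0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
dist = distance_to_features(mask, cell_size=12.5)
print(dist[0])
```

The resulting raster can then be reclassified into distance bands before it is used as a conditioning factor.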
Modified normalized difference water index (mNDWI)
mNDWI is a remote sensing index that helps identify and analyze water bodies in satellite images, especially where other land cover types obscure them. The index uses the green and shortwave infrared (SWIR) bands to enhance the water signal while suppressing the effects of neighboring vegetation and bare ground. Research on mNDWI has emphasized its potential for revealing changes in soil moisture, which plays a vital role in landslide vulnerability and intensity. Areas with high mNDWI values tend to have high moisture, which can weaken the soil and increase the risk of landslides [99, 193].
The mNDWI is useful in landslide studies because it readily reveals changes in surface water caused by precipitation and isolates areas prone to runoff that may affect slope stability. It is especially useful for separating waterlogged areas from dense vegetation, which narrows down where landslide vulnerability lies across different terrains [72, 157]. Comparing mNDWI before and after significant weather events enables researchers to evaluate the hydrological conditions that can lead to landslides, as an aid to improved disaster response and land management planning [93, 250] (Fig. 5t).
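For reference, the index is commonly written in the following standard form. This section does not state which sensor bands supplied the green and SWIR reflectances, so the symbols $\rho_{\mathrm{Green}}$ and $\rho_{\mathrm{SWIR}}$ are generic placeholders rather than the study's notation:

$$\mathrm{mNDWI} = \frac{\rho_{\mathrm{Green}} - \rho_{\mathrm{SWIR}}}{\rho_{\mathrm{Green}} + \rho_{\mathrm{SWIR}}}$$

Values range from −1 to 1. Open water typically yields positive values because water reflects more strongly in the green band than in the SWIR band.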
Normalized difference vegetation index (NDVI)
In addition to the mNDWI, the Normalized Difference Vegetation Index (NDVI) is a critical remote sensing index for monitoring vegetation density and health based on reflectance in the red and near-infrared bands. It is especially relevant to landslide analysis because it provides information about vegetation cover, which affects slope stability. Areas covered with dense, healthy vegetation are more resistant to soil erosion, which limits the chance of slope failure. Conversely, low NDVI can indicate poor or sparse vegetation, implying a lack of root systems to bind the soil and hence greater proneness to landslides [128].
NDVI is also used to monitor vegetation health in response to environmental stress. Research shows that stressors such as drought or deforestation directly reduce vegetation vitality, raising landslide risk through weakened soil integrity. Because of this relationship, the index has proved valuable in hazard assessment, where mapping vegetation distribution helps identify areas for restoration to reduce landslide hazard [126, 262].
Altogether, mNDWI and NDVI complement each other and improve our understanding of hydrological processes and vegetation condition, both of which are important for landslide risk assessment. They are useful not only for identifying hazards but also for informed land-use planning and disaster preparedness.
NDVI is a common remote sensing measure of vegetation coverage and health, a critical landslide-susceptibility factor [1, 50, 249, 250]. It is calculated as the difference between the near-infrared (NIR) and red reflectances divided by their sum, and ranges from −1 to 1; the greater the value, the denser the vegetation [277]. A high NDVI indicates healthy, thick foliage, which helps prevent landslides by stabilizing the soil and absorbing rainfall. Conversely, a low NDVI indicates sparse vegetation and a region more prone to landslides because there are too few roots to hold the soil in place. Changes in NDVI over time can also reveal changes in vegetation cover and hence in slope stability and landslide susceptibility.

$$\mathrm{NDVI} = \frac{NIR - RED}{NIR + RED}$$

where NIR denotes the near-infrared band and RED denotes the red band [52]. NDVI was computed in ArcGIS using the Raster Calculator in the Spatial Analyst tools. In the present study, NDVI ranges from −0.60 to +0.84 (Fig. 5u).
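The normalized-difference calculation described above can be sketched in a few lines of Python. This is only an illustration: the study used the ArcGIS Raster Calculator, the function names `ndvi` and `ndvi_raster` are hypothetical, the reflectance values are made up, and returning 0.0 for pixels with zero total reflectance is an assumption made here to avoid division by zero.

```python
def ndvi(nir, red):
    """NDVI for one pixel: (NIR - RED) / (NIR + RED)."""
    total = nir + red
    if total == 0:
        # Assumption: treat pixels with no signal as neutral instead of failing.
        return 0.0
    return (nir - red) / total


def ndvi_raster(nir_band, red_band):
    """Apply the pixel formula to two equally sized 2-D grids (lists of rows)."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Made-up reflectances for a 2 x 2 raster.
nir_band = [[0.50, 0.40], [0.10, 0.30]]
red_band = [[0.10, 0.20], [0.30, 0.30]]

result = ndvi_raster(nir_band, red_band)
for row in result:
    print([round(v, 3) for v in row])
```

The top-left pixel (high NIR, low red) gets a clearly positive value, as expected for dense vegetation, while the bottom-left pixel (red exceeding NIR) comes out negative.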
Soil
Soil plays an essential role in the occurrence and severity of landslides; its composition, moisture, structure, and interactions with the environment are the main determinants of slope stability. Clay soils, for example, have low permeability and therefore retain water well. This predisposes them to saturation, which heightens instability and the chance of landslides [98, 116]. Sandy soils, by contrast, are prone to erosion, which in some circumstances can lead to landslides even though they are more permeable than clay soils [68, 168]. The physical structure of the soil also matters: poorly structured soils with loosely bound particles are more susceptible to displacement, particularly under heavy rainfall, which can worsen landslide conditions [35, 192].
Moisture content is a critical parameter. Saturated soils lose cohesion and become heavier, and so lack the strength to resist the pull of gravity [127, 246]. The picture is further complicated by soil depth: shallow soils, which commonly overlie rock or denser material, are more likely to slide under stress, whereas deeper soils can absorb more water, which raises the likelihood of failure [44, 210]. Vegetation helps stabilize slopes because plant roots bind the soil and reduce the probability of erosion; deforestation removes this natural protection and drastically increases the risk of landsliding [95, 229]. Slope angle also contributes significantly: the steeper the slope, the more vulnerable it is to sliding, especially where firmly rooted material is absent, once the soil is destabilized by water infiltration or human activity [244, 275].
Land use and land cover
The interaction between land use and land cover also influences landslide risk and frequency. Land use describes human activity, including agriculture, urban development, and deforestation, whereas land cover refers to the physical surface of the landscape [229, 230]. Anthropogenic factors, especially urbanization and deforestation, can dramatically worsen landslides by removing the plants that stabilize the ground and disturbing natural drainage systems [59, 168]. Agricultural practices, particularly on steep terrain, often disturb the soil structure and make soils susceptible to erosion [92, 192]. In addition, urban development increases the risk of instability by adding significant load to slopes and altering hydrological patterns [116, 210].
In summary, understanding how soil properties and land use affect slope stability is crucial for anticipating and mitigating landslides. Sustainable land management techniques are vital for coping with these environmental issues [246, 258].
Land use and land cover modifications in mountainous terrain can cause slope instability and increase the risk of landslides. For example, clearing forests or other vegetation removes the root strength that holds the soil, so the soil erodes more easily and slopes become unstable. Land use changes such as the conversion of forest to farmland or urban settlement can also alter the hydrological properties of the soil, so that the soil moisture regime and pore pressure fluctuate and landslide activity becomes more likely. Vegetation type affects rainfall interception, surface flow, and infiltration rate; scrub and grass surfaces may generate little runoff and therefore elevated pore water pressure, which may trigger landslides. Six LULC classes were mapped in the study area: (1) water, (2) vegetation cover, (3) agricultural land, (4) built-up area, (5) bare ground, and (6) snow cover; the classification was validated in ERDAS Imagine version 15 with an accuracy of 86% (Fig. 5w). On the other hand, managing land use through afforestation, reforestation, and appropriate agricultural practices can stabilize slopes and mitigate landslides. Understanding the relationship between land use and land cover change and landslide susceptibility therefore plays a central role in landslide management in mountainous areas.
Preprocessing of explanatory variables
To improve the effectiveness of landslide susceptibility predictions, we purposefully selected 23 landslide conditioning factors (LCFs), which capture the distinct features of the geographical area being studied. These LCFs play a crucial role in understanding landslide events, as previous research has established their significant impact on such disasters. Therefore, it is essential to analyze how these factors correlate with landslide occurrences and their spatial distribution. While the specific roles of these factors can vary by region, it is clear that a combination of geo-environmental elements acts as a key regulator of landslides.
The careful selection of relevant LCFs is a critical step in landslide hazard modeling, as it improves prediction accuracy and minimizes potential interference, thus enhancing the overall performance of the model. However, it remains widely accepted that there is no universally agreed-upon guideline for selecting LCFs. In this study, 23 LCFs were chosen as independent variables to evaluate landslide susceptibility in the Namchi district of Sikkim, India. Once the key LCFs were identified, the dataset, which included data from 1020 landslide and non-landslide locations, was randomly split into two parts: 70% for training and 30% for testing. This 70:30 ratio is consistent with common practices, as studies have shown that using 20–30% of the data for testing and the remaining 70–80% for training typically yields the best results.
To mitigate the risks of overfitting or underfitting due to dataset size, various training/testing splits (70:30, 75:25, and 80:20) were experimented with. Ultimately, the 70:30 ratio proved to be the most effective, offering better performance than the other ratios.
Feature scaling is an important preprocessing step that ensures each feature contributes equally to model learning. Several machine learning algorithms rely on distance measures (for example, Euclidean distance) to classify observations, so the magnitude of a feature can strongly affect its influence on training. If features with large values dominate those with small values, the model becomes biased and performs poorly.
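To address this, normalization, a widely used feature scaling method, was applied to transform the explanatory variables into a consistent range between 0 and 1. This process aids in faster convergence during optimization and enhances the interpretability of the model, as described in the equation below:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is the original value of a variable, $x_{\min}$ and $x_{\max}$ are its minimum and maximum values, and $x'$ is the normalized value.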
Nonetheless, feature scaling is not required for every machine learning model. Algorithms such as logistic regression, support vector machines (SVM), multilayer perceptrons (MLP), and k-nearest neighbors (kNN) work better when features are scaled. Conversely, tree-based models such as decision trees, random forests, and gradient boosting usually do not require feature scaling, and their performance generally does not suffer without it.
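A minimal sketch of min-max normalization to the range 0 to 1 is given below. The column names, values, and helper names (`fit_min_max`, `transform_min_max`) are hypothetical. Learning the minimum and maximum on the training split and reusing them on the test split is an assumption made here as a common precaution against information leakage; the source does not say how the scaling parameters were obtained.

```python
def fit_min_max(columns):
    """Record (min, max) for each column of a dict {name: list of values}."""
    return {name: (min(values), max(values)) for name, values in columns.items()}


def transform_min_max(columns, params):
    """Rescale each column with previously recorded (min, max) pairs."""
    scaled = {}
    for name, values in columns.items():
        lo, hi = params[name]
        span = hi - lo
        # A constant column has span 0; map it to 0.0 instead of dividing by zero.
        scaled[name] = [0.0 if span == 0 else (x - lo) / span for x in values]
    return scaled


# Made-up training values for two conditioning factors on very different scales.
train = {
    "elevation_m": [400.0, 1200.0, 2000.0],
    "slope_deg": [5.0, 25.0, 45.0],
}
params = fit_min_max(train)
scaled_train = transform_min_max(train, params)
print(scaled_train)

# Unseen data reuse the training parameters, so values may fall outside [0, 1].
scaled_new = transform_min_max({"elevation_m": [2400.0]}, params)
print(scaled_new)
```

After scaling, elevation and slope both span 0 to 1 on the training data, so neither dominates a distance-based model merely because of its units.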
One-hot encoding (OHE) is another important machine learning technique, which converts categorical variables into numerical ones. In this study, the categorical variables geology, geomorphology, lithology, land use and land cover (LULC), and soil were one-hot encoded into binary vectors. For example, the lithology feature, which has 13 categories, was turned into a set of binary columns, one per lithology category. For each observation, the column corresponding to its lithology category was assigned '1' and every other column '0'. The result is one column per category, which increases the dimensionality of the dataset.
Although it increases dimensionality, one-hot encoding improves the model's capacity to represent the particularities of different geological, lithological, land cover, and soil types, and so lets the learning algorithm interpret the information more precisely. It should be kept in mind, however, that one-hot encoding can lead to the curse of dimensionality and increases memory and computational cost. The overall methodology of the study is illustrated in the methodological flow diagram in Fig. 3.
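The encoding step can be illustrated with a small self-contained sketch. The category labels below are shortened, illustrative stand-ins rather than the study's 13 lithology classes, and the helper `one_hot` is hypothetical; in practice a library routine would normally be used.

```python
def one_hot(values, prefix):
    """Turn a list of category labels into binary columns, one per category."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [1 if v == category else 0 for v in values]
        for category in categories
    }


# Four observations with three distinct (illustrative) lithology labels.
lithology = ["schist", "gneiss", "schist", "slate"]
encoded = one_hot(lithology, "lithology")

for name, column in encoded.items():
    print(name, column)

# Each observation has exactly one '1' across the new columns.
row_sums = [sum(column[i] for column in encoded.values()) for i in range(len(lithology))]
print(row_sums)
```

One categorical column becomes three binary columns here; with 13 categories it would become 13, which is the growth in dimensionality discussed above.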
Ensemble feature selection with RFECV
In machine learning, feature selection is a critical process that improves both the quality and the interpretability of models. It is particularly important when methods such as one-hot encoding are applied, because they increase the number of features in a dataset. Feature selection eliminates less important variables in favor of a smaller set that contributes to the model's predictive power. Recursive feature elimination with cross-validation (RFECV) is a robust feature selection method that is effective at removing insignificant or redundant features from high-dimensional data.
One-hot encoding of the five categorical variables created 38 new features in addition to the 17 numerical features already in the dataset. To make the feature selection comprehensive and robust, we ran fivefold RFECV separately with each of the six machine learning models: logistic regression (LR), support vector machine (SVM), random forest (RF), extra trees (ET), gradient boosting (GB), and extreme gradient boosting (XGBoost). This allowed the importance of every feature to be evaluated and the most valuable features to be selected across algorithms and validation subsets.
RFECV works in a loop of ranking, removing, and validating features. In each round, the model is trained on the remaining features (all of them in the first round) and the features are ranked by importance. A single feature is removed each time, and model performance is re-assessed by cross-validation, in which the data are split into several folds for training and testing. The loop repeats until a pre-set stopping criterion is met, and the feature subset giving the best cross-validated performance is retained.
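The loop described above corresponds to scikit-learn's `RFECV` class. The sketch below is a hedged illustration on synthetic data, not the study's pipeline: the dataset from `make_classification`, the choice of logistic regression as the wrapped estimator, the accuracy scoring, and all random seeds are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the landslide table (assumption: not the study's data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, n_redundant=2, random_state=0
)

selector = RFECV(
    estimator=LogisticRegression(solver="liblinear"),
    step=1,  # drop one feature per round
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),  # fivefold CV
    scoring="accuracy",
    min_features_to_select=1,
)
selector.fit(X, y)

print("features kept:", selector.n_features_)
print("selected mask:", selector.support_.tolist())
# Selected features have rank 1; larger ranks were eliminated earlier.
print("ranking:", selector.ranking_.tolist())
```

The `support_` mask and `ranking_` array play the same role as the per-model status (T/F) and ranking columns reported for each classifier.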
Supervised classification procedure
Model development is the next important step in the machine learning pipeline after feature scaling and selection. At this stage, a predictive model is trained on the cleaned and processed dataset. The choice of model depends on the problem of interest and the nature of the data. We judged deep learning models unsuitable because our dataset is small (1020 data points). Such models need large amounts of data to learn intricate patterns and avoid overfitting. Deep learning models are also less interpretable, and are often described as black boxes, which is a serious weakness in landslide susceptibility mapping, where the importance of each feature has to be known.
Instead, we trained six supervised classification algorithms: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost). In contrast to deep learning models, these models are more interpretable and better suited to small and medium-sized tabular data. To improve prediction accuracy further, we built a meta-classifier that combines all the individual models, since ensemble techniques often outperform single models.
To obtain the best hyperparameters for each classifier, we tuned them with fivefold cross-validation using the RandomizedSearchCV class of the scikit-learn Python package. Table 2 shows the final set of hyperparameters chosen for each model.
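A randomized search of this kind might look as follows for one of the classifiers. This is a sketch under stated assumptions: the data are synthetic, and the search space, the number of sampled configurations (`n_iter=8`), the accuracy scoring, and the seeds are hypothetical, because the source reports only the selected values and not the ranges that were searched.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Synthetic stand-in for the training split (assumption: not the study's data).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=4, random_state=0
)

# Hypothetical search space for a random forest.
param_distributions = {
    "n_estimators": [25, 50, 75, 100],
    "max_depth": [4, 6, 8, 10],
    "min_samples_split": [2, 4, 8],
    "max_features": [0.2, 0.4, 0.6],
}

search = RandomizedSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=8,  # number of random configurations to try
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),  # fivefold CV
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print("best hyperparameters:", search.best_params_)
print("mean cross-validated accuracy:", round(search.best_score_, 3))
```

Randomized search samples a fixed number of configurations instead of enumerating the full grid, which keeps the cost manageable when several hyperparameters are tuned at once.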
Table 2. Optimized Hyperparameters for Classifiers Selected via RandomizedSearchCV
Classifier | Hyperparameter | Value |
|---|---|---|
Logistic Regression | Solver | liblinear |
Logistic Regression | Penalty | l1 |
Logistic Regression | Max Iterations | 100 |
Logistic Regression | Class Weight | balanced |
Logistic Regression | Regularization Strength (C) | 100 |
Support Vector Machine | Kernel | rbf |
Support Vector Machine | Gamma | 1 |
Support Vector Machine | Regularization Strength (C) | 100 |
Random Forest | Number of Estimators | 50 |
Random Forest | Minimum Samples for Split | 8 |
Random Forest | Minimum Samples per Leaf | 1 |
Random Forest | Maximum Sample Ratio | 0.75 |
Random Forest | Maximum Features | 0.2 |
Random Forest | Maximum Depth | 10 |
Random Forest | Criterion | gini |
Random Forest | Bootstrap | True |
Extra Trees | Number of Estimators | 75 |
Extra Trees | Minimum Samples for Split | 4 |
Extra Trees | Minimum Samples per Leaf | 1 |
Extra Trees | Maximum Sample Ratio | 0.5 |
Extra Trees | Maximum Features | 0.6 |
Extra Trees | Maximum Depth | 10 |
Extra Trees | Criterion | gini |
Extra Trees | Bootstrap | True |
Gradient Boosting | Subsampling Ratio | 0.3 |
Gradient Boosting | Number of Estimators | 300 |
Gradient Boosting | Minimum Samples for Split | 4 |
Gradient Boosting | Minimum Samples per Leaf | 8 |
Gradient Boosting | Maximum Features | 0.2 |
Gradient Boosting | Maximum Depth | 4 |
Gradient Boosting | Learning Rate | 0.1 |
Extreme Gradient Boosting | Subsampling Ratio | 1 |
Extreme Gradient Boosting | Number of Estimators | 150 |
Extreme Gradient Boosting | Minimum Child Weight | 1 |
Extreme Gradient Boosting | Maximum Depth | None |
Extreme Gradient Boosting | Learning Rate | 0.05 |
Extreme Gradient Boosting | Gamma | 0.1 |
Extreme Gradient Boosting | Colsample By Tree | 1 |
Meta Classifier | Base Models | Logistic Regression, SVM, Random Forest, Extra Trees, Gradient Boosting, Extreme Gradient Boosting |
Meta Classifier | Final Estimator | Logistic Regression |
Meta Classifier | Cross Validation Scheme | StratifiedKFold(5) |
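The meta-classifier rows of Table 2 (base models, a logistic regression final estimator, and StratifiedKFold(5)) are consistent with a stacking ensemble, so the sketch below uses scikit-learn's `StackingClassifier`. Several points are assumptions: that the authors used this class, the mapping of the table's labels to scikit-learn parameter names (for example "Maximum Sample Ratio" to `max_samples`), the synthetic data, and the seeds. XGBoost is left out so that the example depends only on scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the landslide table (assumption: not the study's data).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=5, random_state=0
)
# 70:30 split, as in the text; stratification is an added assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Base models configured with the values listed in Table 2 (XGBoost omitted).
base_models = [
    ("lr", LogisticRegression(solver="liblinear", penalty="l1", C=100,
                              max_iter=100, class_weight="balanced")),
    ("svm", SVC(kernel="rbf", gamma=1, C=100)),
    ("rf", RandomForestClassifier(n_estimators=50, min_samples_split=8,
                                  min_samples_leaf=1, max_samples=0.75,
                                  max_features=0.2, max_depth=10,
                                  criterion="gini", bootstrap=True,
                                  random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=75, min_samples_split=4,
                                min_samples_leaf=1, max_samples=0.5,
                                max_features=0.6, max_depth=10,
                                criterion="gini", bootstrap=True,
                                random_state=0)),
    ("gb", GradientBoostingClassifier(subsample=0.3, n_estimators=300,
                                      min_samples_split=4, min_samples_leaf=8,
                                      max_features=0.2, max_depth=4,
                                      learning_rate=0.1, random_state=0)),
]

# Out-of-fold predictions of the base models feed the final estimator.
meta = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=StratifiedKFold(n_splits=5),
)
meta.fit(X_train, y_train)

accuracy = meta.score(X_test, y_test)
print("hold-out accuracy on synthetic data:", round(accuracy, 3))
```

The printed accuracy describes only the synthetic data and says nothing about the study's results. On real data the features passed to LR and SVM would be normalized first, as described in the preprocessing section.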
Model performance evaluation
Interpretable machine learning with the Shapley method
Shapley values, a tool from cooperative game theory, offer an effective means of decomposing the output of a predictive ensemble into the contributions of its individual components. The method is highly valuable for sophisticated ensembles, where it is important to understand how each component contributes to the accuracy and transparency of the model. The Shapley value of feature $i$ is denoted $\phi_i$ and is computed with the following formula:

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,\bigl[f(S \cup \{i\}) - f(S)\bigr]$$

In this formula, $S$ is a subset of the model's features that does not include feature $i$, and $N$ is the set of all features of the model. The function $f(S)$ is the prediction output when the feature set $S$ is used for prediction. The Shapley value quantifies the marginal contribution of each feature by considering all possible subsets and averaging the changes in the model's prediction when each feature is included or excluded. This method offers a deep insight into the individual contribution of each feature, making it an essential tool for model interpretability, especially in the case of complex models where transparency is crucial.
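The subset-averaging idea can be made concrete with an exact computation on a toy problem. The sketch below is illustrative only: `toy_model` is a made-up value function over three hypothetical feature names, and exact enumeration of all subsets is feasible only for a handful of features (practical tools approximate it).

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value_fn):
    """Exact Shapley values: weighted average of marginal contributions."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):  # subset sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi


def toy_model(subset):
    """Made-up prediction for a feature subset, with one interaction term."""
    score = 0.0
    if "slope" in subset:
        score += 0.4
    if "rainfall" in subset:
        score += 0.3
    if "slope" in subset and "rainfall" in subset:
        score += 0.2  # interaction shared between the two features
    return score


features = ["slope", "rainfall", "ndvi"]
phi = shapley_values(features, toy_model)
for name in features:
    print(name, round(phi[name], 3))
```

The interaction bonus of 0.2 is split equally between the two features that produce it, the unused feature receives zero, and the values sum to the difference between the full-model output and the empty-set output.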
Statistical measures criteria
Accuracy
Accuracy is an important indicator of the overall performance of landslide susceptibility models, since it reflects the proportion of correctly identified landslide and non-landslide cases. It is computed by dividing the number of correctly predicted landslides (True Positives, TP) and correctly predicted non-landslides (True Negatives, TN) by the total number of instances, which comprises True Positives, False Positives (FP), True Negatives, and False Negatives (FN). Accuracy takes the following form:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

With imbalanced data, however, as in the Sikkim-Himalayan region where landslides are few compared with non-events, high accuracy can be misleading. A model that predicts non-landslide for most cases can achieve high accuracy without predicting landslides well, because accuracy does not account for the damage that missed landslides may cause. Using accuracy as the sole performance measure may therefore give a false impression of model effectiveness in landslide susceptibility mapping. This highlights the need for additional measures that account for class imbalance and for the serious consequences of missed landslides.
Precision
Precision plays a crucial role in landslide susceptibility mapping because it assesses the reliability of the model’s landslide predictions. It is defined as the proportion of correctly predicted landslides (True Positives) out of all instances predicted as landslides (True Positives + False Positives). The formula for precision is:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

where TP (True Positive) denotes correctly predicted landslides and FP (False Positive) denotes incorrectly predicted landslides.
A high precision indicates that when the model predicts a landslide, it is likely to be correct. In landslide-prone regions like the Sikkim-Himalayan region, where resources may be limited and evacuation plans must be carefully managed, minimizing false alarms is essential. False positives, which occur when the model incorrectly predicts a landslide, can result in unnecessary evacuations, resource diversion, and public distress. Therefore, ensuring high precision helps optimize disaster response and minimizes the social and economic costs associated with false predictions.
Recall
Recall, also known as sensitivity or the True Positive Rate, measures the ability of the model to identify all of the real landslides in the study area. It is computed as the ratio of correctly predicted landslides (True Positives) to all real landslides, which comprise both True Positives and False Negatives. The following formula is used to measure recall:

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

where TP denotes True Positives (correctly predicted landslides) and FN denotes False Negatives (landslides that the model missed by predicting no landslide). In landslide susceptibility mapping, recall must be high to ensure that as many actual landslides as possible are identified. A missed landslide (false negative) can lead to loss of life and property damage as well as evacuation delays. Maximizing recall is essential in regions such as the Sikkim-Himalayan area, where landslides are highly probable and a failure to forecast them could have catastrophic consequences.
F1 Score
The F1 score combines precision and recall into a single balanced value, giving an overall picture of how the model is performing. It is especially helpful in landslide susceptibility mapping, where reducing false negatives (raising recall) often comes at the cost of more false positives (lowering precision). The formula for the F1 score is:

$$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

where precision is TP / (TP + FP) and recall is TP / (TP + FN). A high F1 score means that the model's landslide predictions are reliable and that it also detects actual events effectively. The F1 score is relevant to disaster management and planning, especially in regions such as the Sikkim-Himalayan area, where both false positives and false negatives can be life threatening and neither type of error should be favored. By integrating precision and recall, F1 gives a more balanced characterization of the model's landslide predictions, supporting both public safety and efficient allocation of resources.
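The four measures can be computed directly from the confusion-matrix counts, as in the sketch below. The counts are invented for illustration and are not the study's results; the helper name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented counts for a reasonably balanced test set of 100 points.
balanced = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print({k: round(v, 3) for k, v in balanced.items()})

# Invented imbalanced case: the model never predicts a landslide.
always_negative = classification_metrics(tp=0, fp=0, tn=95, fn=5)
print({k: round(v, 3) for k, v in always_negative.items()})
```

The second case shows why accuracy alone can mislead: the model reaches 0.95 accuracy while its recall and F1 are both zero, because every landslide is missed.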
Accuracy assessment of the models
Validating the effectiveness of the model is a vital part of the analysis. This study used the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) to assess whether the landslide susceptibility maps were accurate. The ROC curve captures the balance between sensitivity and specificity. It is a two-dimensional plot with the false positive rate (1 − specificity) on the x-axis and the true positive rate (sensitivity) on the y-axis. These are given by the following formulas:

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN} = 1 - \frac{TN}{TN + FP}$$

Here, TN denotes true negatives, FP false positives, TP true positives, and FN false negatives. AUC is also used to quantify the performance of the machine learning models applied in the study area. Predicted results are compared with ground-truth survey data to verify the accuracy of the model. The AUC-ROC approach is frequently applied to measure model capability, and its output lies in the range [0, 1]. An AUC of 1 indicates perfect agreement between predictions and actual outcomes, whereas an AUC of 0.5 indicates no discriminative ability, equivalent to random guessing.
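AUC can also be read as the probability that a randomly chosen landslide point receives a higher score than a randomly chosen non-landslide point. The sketch below computes it that way by comparing all positive-negative pairs; the labels and scores are invented, the function name `roc_auc` is hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented susceptibility scores: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))

# Extremes: perfectly ordered scores and perfectly inverted scores.
perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
inverted = roc_auc([1, 1, 0, 0], [0.1, 0.2, 0.8, 0.9])
print(perfect, inverted)
```

In the first example eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. A model whose scores carry no information would order about half of the pairs correctly, which is why 0.5 marks chance level.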
Results
Feature accuracy and correlation analysis
The analysis of the correlation matrix revealed several significant relationships between the landslide causative factors in the Namchi district, based on a threshold of |r| > 0.5. A strong positive correlation was observed between Elevation and Distance to Road (0.83), suggesting that higher elevations tend to be farther from roads. This could indirectly influence landslide vulnerability through differences in land use or human intervention. There was also a significant negative correlation between Elevation and Soil FAO (−0.54). Among the terrain curvature variables, Profile Curvature and Standard Curvature showed a strong negative correlation (−0.87), while Plane Curvature and Standard Curvature had a strong positive correlation (0.79). These variables capture related aspects of the terrain's morphology, which is fundamental in controlling water flow and slope stability; their high intercorrelation suggests potential multicollinearity if all are used in modeling. SPI and TWI showed a significant positive correlation (0.60). Distance to Road also showed a significant negative correlation with Lithology (−0.53). The most prominent correlation was the strong negative relationship between NDVI (vegetation index) and MNDWI (water index) (−0.94). This strong inverse correlation is highly relevant for landslide assessment, as it highlights the trade-off between vegetation cover, which typically enhances slope stability, and the presence of water, which significantly reduces soil strength and increases landslide risk. Sparsely vegetated, water-rich areas are therefore more likely to experience landslides, and both factors should be taken into account when determining susceptibility. Together, these findings point to a complex relationship among topographic, environmental, and human factors and support landslide susceptibility prediction and spatial analysis based on identifying the most important factors.
Figure 7 shows the relationship between the number of features and the accuracy achieved by each model. Random Forest (RF) and Support Vector Machine (SVM) required the most features, needing 40 and 29, respectively, to reach their maximum accuracy. In contrast, Gradient Boosting (GB), Logistic Regression (LR), Extra Trees (ET), and XGBoost reached their best performance with fewer features: 12, 19, 20, and 22, respectively. GB also had the lowest cross-validation variance, which indicates that it is very stable and that its performance does not depend heavily on the chosen combination of features.
A ranking of features was produced for each model to identify the most important ones. A voting scheme was then applied: for each feature, we counted the number of models in which it was flagged as selected (TRUE rather than FALSE). The final features were chosen using a pre-determined cut-off of 3; that is, a feature was considered relevant when at least 3 of the 6 models selected it. For readability, long names of categorical features have been shortened. Table 3 provides an overview of each model's feature rankings and selection status, along with the ensemble vote count and final selection.
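The voting rule can be sketched as a simple count with a threshold. The four rows below copy the per-model selection flags reported in Table 3 for those features; the dictionary layout and the helper name `vote` are hypothetical.

```python
MODELS = ["LR", "SVM", "RF", "ET", "GB", "XGB"]

# Selection status per model (True = selected), in the order given by MODELS.
status = {
    "Elevation": [False, False, True, True, True, True],
    "Slope": [True, True, True, True, True, True],
    "Roughness": [False, False, True, True, False, False],
    "MNDWI": [False, True, True, True, False, False],
}


def vote(status, threshold=3):
    """Count selecting models per feature and keep those meeting the cut-off."""
    counts = {feature: sum(flags) for feature, flags in status.items()}
    selected = [feature for feature, count in counts.items() if count >= threshold]
    return counts, selected


counts, selected = vote(status, threshold=3)
print(counts)
print(selected)
```

With the cut-off of 3, Roughness (2 votes) is dropped while MNDWI (exactly 3 votes) is retained, matching the "at least 3 of the 6 models" rule.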
Table 3. Overview of each model’s feature rankings and selection status
LIF | Encoded Name | LR Status | LR Rankings | SVM Status | SVM Rankings | RF Status | RF Rankings | ET Status | ET Rankings | GB Status | GB Rankings | XGB Status | XGB Rankings | Voting Count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Elevation | Elevation | F | 9 | F | 11 | T | 1 | T | 1 | T | 1 | T | 1 | 4 |
Slope | Slope | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 6 |
Aspect | Aspect | F | 4 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 5 |
Profile curvature | Profile curvature | F | 27 | F | 13 | T | 1 | T | 1 | T | 1 | F | 11 | 3 |
Roughness | Roughness | F | 16 | F | 2 | T | 1 | T | 1 | F | 11 | F | 6 | 2 |
SPI | SPI | F | 33 | F | 25 | T | 1 | T | 1 | T | 1 | F | 4 | 3 |
STI | STI | F | 34 | F | 24 | T | 1 | F | 5 | T | 1 | F | 2 | 2 |
TPI | TPI | F | 23 | F | 7 | T | 1 | T | 1 | F | 6 | F | 10 | 2 |
Plane curvature | Plane curvature | F | 31 | F | 18 | T | 1 | T | 1 | F | 13 | T | 1 | 3 |
TWI | TWI | F | 32 | F | 21 | T | 1 | T | 1 | F | 15 | F | 8 | 2 |
Rainfall | Rainfall | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 6 |
Dist to drainage | Dist to drainage | F | 2 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 5 |
DTL | DTL | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 6 |
Dist to road | Dist to road | F | 5 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 5 |
Standard curvature | Standard curvature | F | 28 | F | 14 | T | 1 | T | 1 | F | 12 | F | 7 | 2 |
NDVI | NDVI | F | 22 | F | 5 | T | 1 | T | 1 | F | 10 | F | 3 | 2 |
MNDWI | MNDWI | F | 7 | T | 1 | T | 1 | T | 1 | F | 8 | F | 12 | 3 |
Geology 1 | Chungthang fm central crystalline gneissic complex gp | T | 1 | T | 1 | F | 4 | F | 8 | F | 19 | F | 14 | 2 |
Geology 2 | Daling gp Buxa fm | F | 3 | T | 1 | T | 1 | F | 4 | F | 5 | F | 13 | 2 |
Geology 3 | Daling gp Gorubathan fm | F | 8 | F | 9 | T | 1 | T | 1 | F | 3 | T | 1 | 3 |
Geology 4 | Daling gp Reyang fm | T | 1 | T | 1 | T | 1 | F | 9 | F | 31 | F | 18 | 3 |
Geology 5 | Kanchenjunga gneiss central crystalline gneissic complex | T | 1 | T | 1 | T | 1 | T | 1 | F | 2 | T | 1 | 5 |
Geology 6 | Lower Gondwana gp Bhareli Damuda fm | F | 19 | F | 8 | F | 3 | F | 6 | F | 20 | F | 17 | 0 |
Geology 7 | Lower Gondwana gp Rangit pebble slate fm | F | 11 | F | 22 | F | 5 | F | 12 | F | 21 | F | 16 | 0 |
Geomorphology 2 | Highly dissected hills and valleys | F | 25 | F | 28 | F | 14 | F | 19 | F | 34 | F | 34 | 0 |
Geomorphology 3 | Mass wasting products | F | 29 | F | 17 | T | 1 | F | 3 | F | 7 | F | 33 | 1 |
Geomorphology 4 | Moderately dissected hills and valleys | F | 20 | F | 15 | F | 9 | F | 18 | F | 33 | F | 32 | 0 |
Geomorphology 5 | Snow cover | F | 13 | F | 3 | T | 1 | T | 1 | F | 30 | F | 31 | 2 |
Geomorphology 6 | Geomorphology 6 | F | 26 | F | 27 | F | 13 | F | 20 | F | 32 | F | 30 | 0 |
Lithology 2 | Boulder slate conglomerate phyllite | T | 1 | T | 1 | T | 1 | T | 1 | F | 23 | F | 29 | 4 |
Lithology 3 | Calc granulite with without quartzite | F | 15 | F | 23 | F | 8 | F | 15 | F | 9 | F | 28 | 0 |
Lithology 4 | Chlorite sericite schist and quartzite | F | 10 | F | 20 | F | 15 | F | 14 | F | 29 | F | 27 | 0 |
Lithology 6 | Mylonitic granite gneiss | F | 30 | F | 10 | T | 1 | F | 2 | F | 14 | F | 26 | 1 |
Lithology 7 | Quartz arenite black slate cherty phyllite | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | F | 25 | 5 |
Lithology 9 | Mylonitic granite gneiss | F | 18 | F | 6 | F | 11 | F | 17 | F | 24 | F | 24 | 0 |
Lithology 11 | Quartz arenite, black slate, cherty phyllite | F | 24 | F | 19 | F | 7 | F | 11 | F | 28 | F | 23 | 0 |
Lithology 13 | Sandstone, shale with minor coal | F | 12 | F | 12 | F | 2 | F | 10 | F | 16 | F | 22 | 0 |
LULC 2 | Vegetation | T | 1 | T | 1 | T | 1 | T | 1 | F | 27 | T | 1 | 5 |
LULC 4 | Built-up | F | 17 | F | 4 | F | 12 | F | 16 | F | 26 | F | 21 | 0 |
LULC 5 | Bare-land | F | 14 | F | 16 | T | 1 | T | 1 | F | 18 | F | 5 | 2 |
LULC 6 | Ice/snow | F | 21 | F | 26 | F | 10 | F | 13 | F | 25 | F | 20 | 0 |
Soil FAO 1 | I Bh U c | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 6 |
Soil FAO 2 | GL | T | 1 | T | 1 | F | 6 | F | 7 | F | 22 | F | 19 | 2 |
Soil FAO 3 | Bd32 2bc | F | 6 | T | 1 | T | 1 | T | 1 | F | 17 | F | 9 | 3 |
Soil FAO 4 | Ah12 2bc | T | 1 | T | 1 | T | 1 | T | 1 | F | 4 | F | 15 | 4 |
Soil FAO 5 | Rd29 1a | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | T | 1 | 6 |
Figure 8 illustrates the relationship between features and their total vote counts. Features that met the selection threshold of 3 or more votes were ultimately selected. A total of 21 features were chosen, split almost evenly between 11 numerical and 10 encoded categorical features, as shown in Fig. 8. This balance highlights the importance of both numerical and categorical variables.
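The voting rule behind Table 3 can be sketched in a few lines. This is a minimal illustration, not the authors' code: the TRUE/FALSE flags are copied from a handful of rows of Table 3, and the names `status` and `vote` are hypothetical.

```python
# Order of the six models whose selection flags are tallied (as in Table 3).
MODELS = ["LR", "SVM", "RF", "ET", "GB", "XGB"]

# Selection status per feature, one flag per model, copied from Table 3.
status = {
    "Elevation": [False, False, True, True, True, True],
    "Slope": [True, True, True, True, True, True],
    "Roughness": [False, False, True, True, False, False],
    "MNDWI": [False, True, True, True, False, False],
    "Geology 6": [False, False, False, False, False, False],
}


def vote(status, threshold=3):
    """Count TRUE flags per feature and keep features with >= threshold votes."""
    counts = {name: sum(flags) for name, flags in status.items()}
    selected = [name for name, c in counts.items() if c >= threshold]
    return counts, selected


counts, selected = vote(status)
print(counts)
print(selected)
```

With the threshold of 3 used in the study, Elevation (4 votes), Slope (6 votes) and MNDWI (3 votes) are kept, while Roughness (2 votes) and Geology 6 (0 votes) are dropped, matching the vote counts listed in Table 3.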
Landslide susceptibility zones
To develop the landslide susceptibility maps, this research combined 10 state-of-the-art Remote Sensing (RS) and Geographic Information System (GIS) techniques with seven machine learning (ML) models: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Extremely Randomized Trees (ExtraTrees), Gradient Boosting (GB), Extreme Gradient Boosting (XGBoost), and a Meta Classifier (MC). Landslide susceptibility was zoned into five categories: very low, low, moderate, high, and very high.
Each ML model was evaluated by comparing the area (in km²) and the percentage of the total area covered by each susceptibility zone (Figs. 9 and 10). For the Support Vector Machine (SVM), the distribution was: very low zone (290.34 km², 32.13%), low zone (147.29 km², 16.30%), moderate zone (151.36 km², 16.75%), high zone (145.79 km², 16.13%), and very high zone (168.81 km², 18.68%). The other ML models were assessed in the same way. Gradient Boosting (GB) gave: very low zone (580.93 km², 64.29%), low zone (50.11 km², 5.55%), moderate zone (32.26 km², 3.57%), high zone (38.54 km², 4.26%), and very high zone (201.76 km², 22.33%). Extreme Gradient Boosting (XGBoost) gave: very low zone (290.34 km², 32.13%), low zone (147.29 km², 16.30%), moderate zone (151.36 km², 16.75%), high zone (145.79 km², 16.13%), and very high zone (168.81 km², 18.68%). Random Forest (RF) produced: very low zone (580.93 km², 64.29%), low zone (50.11 km², 5.55%), moderate zone (32.26 km², 3.57%), high zone (38.54 km², 4.26%), and very high zone (201.76 km², 22.33%). The ExtraTrees (ET) model showed: very low zone (278.63 km², 30.84%), low zone (251.67 km², 27.85%), moderate zone (147.85 km², 16.36%), high zone (77.79 km², 8.61%), and very high zone (147.59 km², 16.33%). Logistic Regression (LR) results were: very low zone (362.10 km², 40.07%), low zone (136.78 km², 15.14%), moderate zone (93.79 km², 10.38%), high zone (92.57 km², 10.24%), and very high zone (218.35 km², 24.16%). The Meta Classifier (MC) yielded: very low zone (182.24 km², 20.17%), low zone (138.24 km², 15.30%), moderate zone (163.36 km², 18.08%), high zone (246.22 km², 27.25%), and very high zone (173.48 km², 19.20%) (Table 4 and Fig. 6).
Table 4. Areal distribution of landslide susceptibility zones
Model | Measure | Very low | Low | Moderate | High | Very high | Total area |
|---|---|---|---|---|---|---|---|
SVM | Area (km²) | 290.34 | 147.29 | 151.36 | 145.79 | 168.81 | 903.59 |
SVM | Area (%) | 32.13 | 16.30 | 16.75 | 16.13 | 18.68 | 100.00 |
GB | Area (km²) | 580.93 | 50.11 | 32.26 | 38.54 | 201.76 | 903.59 |
GB | Area (%) | 64.29 | 5.55 | 3.57 | 4.26 | 22.33 | 100.00 |
XGB | Area (km²) | 290.34 | 147.29 | 151.36 | 145.79 | 168.81 | 903.59 |
XGB | Area (%) | 32.13 | 16.30 | 16.75 | 16.13 | 18.68 | 100.00 |
RF | Area (km²) | 580.93 | 50.11 | 32.26 | 38.54 | 201.76 | 903.59 |
RF | Area (%) | 64.29 | 5.55 | 3.57 | 4.26 | 22.33 | 100.00 |
ET | Area (km²) | 278.63 | 251.67 | 147.85 | 77.79 | 147.59 | 903.54 |
ET | Area (%) | 30.84 | 27.85 | 16.36 | 8.61 | 16.33 | 100.00 |
LR | Area (km²) | 362.10 | 136.78 | 93.79 | 92.57 | 218.35 | 903.59 |
LR | Area (%) | 40.07 | 15.14 | 10.38 | 10.24 | 24.16 | 100.00 |
MC | Area (km²) | 182.24 | 138.24 | 163.36 | 246.22 | 173.48 | 903.54 |
MC | Area (%) | 20.17 | 15.30 | 18.08 | 27.25 | 19.20 | 100.00 |
These findings provide important information about the ability and performance of each ML model in mapping landslide susceptibility (Figs. 9 and 10).
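The percentage rows of Table 4 follow directly from the zone areas. The short sketch below recomputes the shares for the Meta Classifier row as a consistency check; the function name `area_shares` is hypothetical, and the areas are the values reported in Table 4.

```python
ZONES = ["very low", "low", "moderate", "high", "very high"]

# Zone areas in square kilometres for the Meta Classifier (MC), from Table 4.
mc_area_km2 = [182.24, 138.24, 163.36, 246.22, 173.48]


def area_shares(areas):
    """Return each zone's share of the total mapped area, in percent."""
    total = sum(areas)
    return [round(100.0 * a / total, 2) for a in areas]


mc_shares = area_shares(mc_area_km2)
for zone, share in zip(ZONES, mc_shares):
    print(f"{zone}: {share:.2f}%")
```

Running the same function on the other rows is a quick way to check that the reported percentages agree with the reported areas.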
The most vulnerable area is a small patch in the south-western corner of the Namchi District, namely Jorethang and the south-western part of the Namchi Subdivision. The eastern, south-eastern and western parts of the Namchi Subdivision and the southern parts of the Ravangla and Yangang subdivisions fall in the high susceptibility zone. The moderately high susceptibility zone occurs as small patches in the middle of the Namchi Subdivision and in the Jorethang Subdivision, in the eastern part of the Namchi Subdivision, and in the southern parts of the Ravangla and Yangang subdivisions. The moderate susceptibility zone covers the largest extent of the Namchi District; here, roughly one landslide per square kilometre is probable each year. The low susceptibility zone is located in the north-western corner of the Ravangla Subdivision and in the northernmost corner of the Yangang Subdivision (Fig. 10).
Several factors contribute to landslides in this area, including poor geological conditions, excessive rainfall, unstable soil, inadequate road construction practices, intensive urbanization, and increasing deforestation and mining operations (Fig. 6).
[See PDF for image]
Fig. 6
Proportional Distribution of Landslide Susceptibility across Different Classes: A Comparison of SVM, GB, XGB, RF, ET, LR and MC Models
Model validation
This section evaluates the performance of the machine learning models in mapping landslide susceptibility. Logistic Regression (LR) has an accuracy of 0.863, meaning that 86.3 percent of its predictions are correct, which implies a moderate degree of reliability. Its precision of 0.882 means that 88.2 percent of the locations it predicts as landslides are actual landslides. Its recall of 0.836 is moderate, meaning that it detects 83.6 percent of the actual landslide occurrences. The F1 Score of 0.858 indicates a reasonable balance between precision and recall, although it is the lowest among the models.
Support Vector Machine (SVM) has an accuracy of 0.886, meaning that 88.6 percent of its predictions are correct. SVM also performed well in precision (0.909) and recall (0.855), giving an F1 Score of 0.881 and indicating a sound balance between the two measures.
Random Forest (RF) has an accuracy of 0.941, clearly higher than that of SVM. Its precision (0.935) and recall (0.947) are both high, and its F1 Score of 0.941 indicates an excellent balance between precision and recall.
ExtraTrees (ET) follows closely with an accuracy of 0.928, similar to GB and XGBoost. It also provides high precision (0.917) and excellent recall (0.941), giving an F1 Score of 0.929.
Gradient Boosting (GB) comes next with an accuracy of 0.935. Its precision (0.934) and recall (0.934) are identical, giving an F1 Score of 0.934 and indicating an even balance between precision and recall.
XGBoost is close behind with an accuracy of 0.931, similar to ET and GB. It has high precision (0.940) and recall (0.921), giving an F1 Score of 0.930.
The Meta Classifier (MC) achieved the highest accuracy of 0.941, equal to that of RF. Its precision (0.941) and recall (0.941) are also high, and its F1 Score of 0.941 reflects a consistent balance between precision and recall (Table 5).
Table 5. Model validation matrix
Landslide susceptibility models | Accuracy | Precision | Recall | F1 Score | AUC |
|---|---|---|---|---|---|
Logistic Regression (LR) | 0.863 | 0.882 | 0.836 | 0.858 | 0.961 |
Support Vector Machine (SVM) | 0.886 | 0.909 | 0.855 | 0.881 | 0.95 |
Random Forest (RF) | 0.941 | 0.935 | 0.947 | 0.941 | 0.977 |
Extra Trees (ET) | 0.928 | 0.917 | 0.941 | 0.929 | 0.975 |
Gradient Boosting (GB) | 0.935 | 0.934 | 0.934 | 0.934 | 0.982 |
XGBoost (XGB) | 0.931 | 0.94 | 0.921 | 0.93 | 0.98 |
Meta Classifier (MC) | 0.941 | 0.941 | 0.941 | 0.941 | 0.984 |
All models demonstrate strong capabilities in distinguishing between different classes, as reflected in their AUC values (Figs. 7, 8, 9, 10). The highest AUC values were obtained by MC (0.984), GB (0.982), and XGBoost (0.980), followed closely by RF (0.977), ET (0.975), LR (0.961), and SVM (0.95) (Fig. 11a).
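For readers who want to trace how the metrics in Table 5 relate to one another, the sketch below computes accuracy, precision, recall and F1 from confusion-matrix counts, and AUC as the probability that a landslide point scores higher than a non-landslide point. The counts and scores are hypothetical (the confusion-matrix values are not reported in this section); the counts were chosen only because they reproduce figures close to 0.941.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard metrics from confusion-matrix counts (assumes no empty class)."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(pos_scores, neg_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=144, fp=9, fn=9, tn=144)
print({name: round(value, 3) for name, value in metrics.items()})

# Hypothetical predicted probabilities for landslide / non-landslide points.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(round(auc, 3))
```

When precision and recall are equal, F1 equals both of them, which is why the MC and GB rows of Table 5 show the same value in all three columns.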
[See PDF for image]
Fig. 7
Assessing Model Performance Through Feature Selection: A Comparative Evaluation of Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost)
[See PDF for image]
Fig. 8
Tallying the Ensemble Votes for Numerical and Encoded Categorical Features
[See PDF for image]
Fig. 9
Landslide susceptibility maps (a) Support Vector Machine (SVM), (b) Gradient Boosting (GB), (c) XGBoost (XGB), (d) Random Forest (RF), (e) Extra Trees (ET), and (f) Logistic Regression (LR)
[See PDF for image]
Fig. 10
Landslide susceptibility map generated using the ensemble Meta classifier (MC) model
[See PDF for image]
Fig. 11
Model validation (a) ROC-AUC Curve Comparison: SVM, GB, XGB, RF, ET, and LR Models, and (b) Visualization of the Confusion Matrix for Meta Classifier
Discussion
Landslide prediction and its effective management are critical for the sustainable development of mountainous and hilly areas, where landslides are common natural hazards. The challenge of accurately predicting landslides involves considering numerous contributing factors, such as terrain, soil, weather, and human activities. Landslide susceptibility models rely on various methodologies and datasets, from physically-based models, statistical methods, and machine learning (ML) approaches to more complex deep learning models. Despite extensive research in this area, there is still some disagreement among experts regarding the best approach. This underlines the importance of exploring innovative methods that can enhance the accuracy of landslide prediction. This study introduces new machine learning-based techniques for analyzing and predicting landslides and demonstrates how combining geospatial data with machine learning can significantly improve landslide susceptibility modeling.
Another important finding of the present study was the substantial role of Advanced Land Observing Satellite (ALOS) Phased Array type L-band Synthetic Aperture Radar (PALSAR) Digital Elevation Models (DEMs) in improving the accuracy of landslide prediction. DEMs provide detailed topographic data, which is critical for understanding the physical characteristics of the terrain that can promote landslides. Slope is an important predictor, since steeper slopes are more prone to landslides. The DEM data made it possible to derive several important landslide conditioning factors (LCFs), including slope, aspect, curvature, roughness and stream power index (SPI). Topographic position index (TPI), topographic ruggedness index (TRI), and topographic wetness index (TWI) are other DEM-derived indices that are essential in determining landslide susceptibility. Low-elevation zones form most of the study area, followed by very low and higher to high-elevation zones. According to the analysis, the lower elevation zones are the most likely to undergo landslides. At lower elevations, the sides of the central valley are covered by disintegrated and decomposed materials, which may cause slope movement and landslides.
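As an illustration of how two of these DEM-derived indices are commonly computed, the sketch below uses the widely used definitions TWI = ln(a / tan β) and SPI = a · tan β, where a is the specific catchment area and β is the local slope angle. The paper does not state which formulation or GIS tool was used, so the formulas, function names and sample values here are assumptions.

```python
import math


def twi(specific_catchment_area, slope_deg, eps=1e-6):
    """Topographic wetness index, ln(a / tan(beta)).

    `eps` keeps the ratio finite on flat cells (an assumed convention).
    `specific_catchment_area` must be positive.
    """
    tan_beta = max(math.tan(math.radians(slope_deg)), eps)
    return math.log(specific_catchment_area / tan_beta)


def spi(specific_catchment_area, slope_deg):
    """Stream power index, a * tan(beta)."""
    return specific_catchment_area * math.tan(math.radians(slope_deg))


# Hypothetical cell: 100 m^2 per metre of contour width.
for slope in (10.0, 30.0, 45.0):
    print(slope, round(twi(100.0, slope), 2), round(spi(100.0, slope), 2))
```

For a fixed catchment area, TWI falls and SPI rises as the slope steepens, which is consistent with the statement in this section that slope steepness affects both indices.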
The Eastern Himalaya receives monsoon rains mainly from the south and southeast. South-facing mountain slopes directly face these moist winds, which bring high rainfall. Long spells of rain can saturate and weaken the soil, exposing it to landslides. Slopes lying in the path of the prevailing monsoon are therefore more likely to become saturated and washed away, leading to frequent landslides. The slope aspect map prepared in this study showed that slope failures were most likely on south-facing slopes (157.5°–202.5°), where 41.75 percent (107,100) of the landslide cells were located. Southwest-facing slopes (202.5°–247.5°) ranked second with 25.61%, followed by southeast-facing slopes (112.5°–157.5°) with 17.54%, west-facing slopes (247.5°–292.5°) with 8.77%, and east-facing slopes (67.5°–112.5°) with 5.26%. Factors such as soil erosion, run-off, drainage, sediment transport and slope angle control the geomorphic features of every part of the slopes. These factors are affected by micro-climatic changes in the area. South-, southwest- and southeast-facing slopes receive significant orographic rainfall, which develops the drainage network, enhances surface run-off, and favors erosion. In addition, slope materials disintegrate and decompose across the slope aspects, since maximum solar radiation is received on most days of the year.
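The aspect classes quoted here are 45° sectors centred on the compass directions (south is 157.5° to 202.5°). A minimal sketch of that binning, and of the per-class landslide share, is given below. The treatment of negative values as flat cells is an assumed GIS convention, and the sample aspects are hypothetical.

```python
from collections import Counter

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def aspect_class(aspect_deg):
    """Map an aspect in degrees (clockwise from north) to a compass sector."""
    if aspect_deg < 0:
        return "Flat"  # assumed convention for cells without a defined aspect
    # Shift by half a sector so that each 45-degree bin is centred on its direction.
    return SECTORS[int(((aspect_deg % 360.0) + 22.5) // 45.0) % 8]


def class_shares(aspects):
    """Percentage of cells falling in each aspect class."""
    counts = Counter(aspect_class(a) for a in aspects)
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


# Hypothetical aspects of landslide cells, in degrees.
sample = [180.0, 170.0, 200.0, 210.0, 120.0, 90.0, 260.0, 190.0]
shares = class_shares(sample)
print(shares)
```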
Landslides are prominent in the 3701–5683 m elevation class of the study area. The southern tip of the district, which occupies much of the Darjeeling-Sikkim Himalaya, is low lying and very prone to landslides. The eastern and western sides of the study area have moderate elevations. In the analysis, the lower elevation zones showed an increased probability of landslides. At low elevation, the sides of the central valley are covered with disintegrated and decomposed materials, so the valley sides may move and cause landslides. Moreover, the elevation class of 2506–3700 m exhibited a high association with landslides.
The slope angle in the Namchi District ranges from 1° to 85°. The 29°–36° slope class contained the highest share of landslide pixels (38.60% of the total), and the likelihood of repeated landslides in the future was greatest there. Slopes in this range are highly prone to landslides because the gravitational force and soil erosion that drive sliding are much stronger on them, so these areas are likely to become unstable under stress. Under triggering conditions such as heavy rainfall or earthquakes, soil and rock lose stability; the 19°–28° slope class contained the second highest share of landslide pixels (23.86% of the total), which indicates a serious landslide potential. Other slope classes, namely 37°–47° and 1°–18°, also showed some probability of landslides. The analysis indicates that slope protection and landslide prevention should target slopes steeper than 29°, which experience a high frequency of landslides. The large drainage system and slope morphology strongly influence surface runoff, soil erosion, and drainage concentration in the region. Slope steepness affects the SPI and TWI. At the same time, high curvature values are associated with drainage concentration on both convex and concave slope segments, which saturates the slope and degrades soil cohesion and internal friction.
This study found that highly concave areas, with standard curvature between −3.29 and −0.33, are sensitive to the convergence of water on the hillslope. This class contained the largest share of landslides (56.05 percent) and was strongly associated with future landslides. Conversely, the 1.66–5.65 class, which represents convex areas, was moderately associated with landslide occurrence (9.87%).
The area within a 300 m radius of streams, roads and DTL was found to be the most prone to landslides. Road construction also plays a major role in landslide occurrence in the Namchi District. Cutting or excavating slopes to create level surfaces during road construction disturbs their equilibrium and weakens slope stability. This undercutting also increases shear stress and exposes slopes to failure, mainly in locations with loose, unconsolidated soils. In both regions, about 79.65 percent of all landslides fall within a 1000 m radius of roads. This concentration is especially high near the National and State Highways around Jorethang, Namchi, Melli, and other central hubs due to traffic pressure in these areas. Heavy traffic on these mountain roads generates ground vibrations, which further exacerbate slope instability. The vibrations can loosen rock and soil on already vulnerable slopes, especially those weakened by previous landslides or seismic activity, increasing the likelihood of slope failures. Roads often disrupt natural drainage patterns, leading to the accumulation of water along roadways or on adjacent slopes. Without proper culverts or drains, water builds up and adds pressure to slopes, especially during the monsoon season. Stream channels significantly affect landslide occurrence, particularly in mountainous regions like the Namchi District. Flowing water in rivers and streams erodes the base of adjacent slopes, a process called undercutting. This erosion removes support from the lower sections of slopes, making them more susceptible to failure. The continuous erosional activity weakens slope stability, increasing landslide risk, especially during high-flow events such as monsoons or glacial melt.
The 2928–3032 mm annual rainfall class showed a strong correlation with future slope failures, since the water sources are uphill and water flows at high velocity. The TWI class of 7.71 to 11.14 contained the largest landslide area, 39.30 percent of the total landslide area.
Rainfall is one of the prime triggering conditions for landslides, and the Namchi District receives heavy monsoonal rain. The susceptible areas coincide with the regions that receive more rainfall, especially between June and September, the monsoon season. As a result, most landslides occur at lower altitudes in the southern area, which receives more than 2900 mm of rainfall annually. Landslides and rainfall are positively related, and slope plays a vital role in the distribution of precipitation in the Sikkim Himalayas. Slopes facing south, southeast and southwest receive orographic rain, which saturates the ground and raises pore-water pressure, so the slope materials lose cohesion and the slope becomes unstable. The rainfall map showed that the 2928–3032 mm class had a greater likelihood of landslides. Landslide frequency generally decreases towards the north because rainfall gradually decreases in that part of the study area. High-intensity rainfall enhances infiltration and saturates the soil layers, which reduces slope strength and causes slope collapse. Changes in rainfall patterns brought about by climate change can further aggravate landslide events, so there is a need for dynamic and adaptive approaches to landslide susceptibility analysis that account for changing climatic conditions.
Human disturbance also affects slope stability in the Namchi District: deforestation and terrace construction cause runoff accumulation and elevated pore-water pressure. This leads to erosion of slope materials and higher vulnerability to landslides. The analysis showed that the NDVI class of 0.03–0.20 closely matched landslide occurrence. The evidence indicated that improved vegetation health was accompanied by a lower chance of slope failure caused by human activities such as tea and orange plantation.
Human-induced changes in land cover and land use strongly affect landslide susceptibility in the Namchi District. Expansion of agriculture, roadworks and urbanization tend to cause deforestation and slope destabilization. The land use and land cover (LULC) assessment showed that forested land and bare land had the highest shares of landslide occurrence (57.54 percent and 41.40 percent, respectively). Although forests normally stabilize slopes through their roots and vegetation cover, several factors explain this high frequency of landslides. Landslides mostly occur on forest land at lower elevations, where human intervention is greater. The underlying geology also makes forested land susceptible to landslides. The ground over a large part of this region consists of loose and unconsolidated soil, so tree roots cannot bind it effectively. As a result, trees in this area cannot hold the soil adequately, and they may add load to already unstable slopes. Any excavation associated with anthropogenic activities makes such land even more prone to landslides. Conversely, future slope failures were less prevalent in the public utility and facility class.
Some lithological units, especially soft rocks such as phyllites and schists, are highly vulnerable to landslides because they have poor shear strength and weather easily. The unconsolidated materials of the Daling Group form loose soil that is more prone to landslides than consolidated rock. Weak or highly fractured rock along roads also adds to the tendency for sliding. Most of the district is covered by chlorite sericite schist and quartzite, and 78.60 percent of landslides are located in this category. Of all the units of the Daling Group, the Gorubathan Formation has the highest concentration of landslides, followed by the Lower Gondwana Group (Bhareli, Damuda Formation). The Damuda Formation of the Gondwana Supergroup consists of coal, dolomite and other materials that are very prone to rainwater infiltration. Damuda rocks underlie Namchi town and the surrounding areas, so the region is highly likely to experience landslides during the monsoon season. Quartz arenite, black slate and cherty phyllite are found in the middle and western region, where 7.37% of landslides occur. This region is close to the Main Central Thrust (MCT) I and II and is therefore seismically active. Moreover, glaciers cover this area, so landslides here are linked to freeze-thaw processes, glacial lake outburst floods (GLOFs) and other cryogenic processes. These results indicate that rock type, topography and structure contribute substantially to slope stability and landslide susceptibility.
Besides DEMs, Landsat 8/9 satellite imagery was used to make the predictions more accurate. Satellite images provide valuable information on land cover, vegetation and water bodies, which are crucial in determining landslide risk. For example, land cover type affects soil stability, and vegetation indicates how well the soil is anchored and protected against erosion. The Normalized Difference Vegetation Index (NDVI) and the Modified Normalized Difference Water Index (mNDWI) were found to be particularly significant in landslide prediction because they give a clear picture of vegetation cover and water presence, which are key factors in landslide processes.
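Both indices are normalized band differences. The sketch below shows the standard definitions, NDVI = (NIR − Red) / (NIR + Red) and mNDWI = (Green − SWIR1) / (Green + SWIR1); for Landsat 8/9 OLI these correspond to bands 5 and 4, and bands 3 and 6, respectively. The paper does not list the bands or preprocessing it used, so the band choice, function names and reflectance values are assumptions for illustration.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b); returns NaN when both inputs sum to zero."""
    denom = a + b
    return float("nan") if denom == 0 else (a - b) / denom


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from surface reflectance."""
    return normalized_difference(nir, red)


def mndwi(green, swir1):
    """Modified Normalized Difference Water Index from surface reflectance."""
    return normalized_difference(green, swir1)


# Hypothetical reflectance values for a vegetated pixel and a sparse one.
print(round(ndvi(nir=0.50, red=0.10), 3))
print(round(ndvi(nir=0.22, red=0.18), 3))
print(round(mndwi(green=0.10, swir1=0.30), 3))
```

Under these definitions, the NDVI class of 0.03–0.20 that this study associates with landslides corresponds to pixels where near-infrared reflectance is only slightly higher than red reflectance, as in the second example.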
Precipitation data were also an important component in predicting landslides. The link between landslides and rainfall is well established and is especially strong on steep slopes and unconsolidated soils. Excessive rainfall saturates the soil, which makes the slope less stable and prone to failure. Adding rainfall data to the predictive models gave a better picture of the landslide-prone areas.
Geological data, including lithology, geomorphology and lineament data, were also very important in the prediction models. Lithology describes the composition of the rocks and soil, which helps identify regions with soft or unstable geological materials. Geomorphological features such as scarps, steep slopes and oversteepened hillslopes are likewise subject to failure. Lineament analysis helps locate faults, fractures and other structural weaknesses that may cause landslides. Areas with dense lineaments may be more susceptible to ground movement, which makes these data highly valuable for landslide susceptibility analysis.
Soil erosion, a major precursor of landslides, is directly related to soil texture. Erosion occurs more easily in regions with sandy soils, which are more mobile under water flow and can therefore initiate slope instability. Conversely, soils with more clay are less prone to erosion and contribute to slope stability. Soil texture is therefore very important in identifying locations prone to erosion and, by extension, landslides.
Model performance was assessed using the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve together with accuracy, precision, recall, and F1 score. The models performed well: the AUC of every machine learning model exceeded 0.85 (the lowest was 0.95, for SVM), which indicates strong discriminative performance in landslide susceptibility modeling.
A comparison with previous studies shows that the models employed here performed very well. Although earlier studies used different datasets and geographical settings, the results of the current study are highly reliable in terms of AUC and accuracy. For example, the Meta Classifier (MC), Gradient Boosting (GB) and Extreme Gradient Boosting (XGBoost) models achieved AUC values of 0.984, 0.982 and 0.980, respectively, an improvement over earlier methods such as the Random Subspace Logistic Model Tree (RSLMT) and Bayesian Optimization models. These findings support the use of ensemble machine learning models in landslide prediction.
Each machine learning model used in this research has its own advantages. Logistic Regression (LR), though simple, provides a baseline for predicting landslides and captures simple patterns in the data. Support Vector Machine (SVM) works well when the data are high-dimensional, as in landslide studies, since it can model complex, non-linear relationships between the predictor variables and landslide occurrence. Random Forest (RF) is an ensemble of decision trees that is robust against overfitting and good at finding non-linear relationships between variables. It also provides information on feature importance, which is valuable in identifying what causes landslides. The Extra Trees (ET) model adds extra randomness to tree construction, which enables it to pick up patterns that other models miss. Gradient Boosting (GB) builds trees sequentially, with each new tree aiming to reduce the errors of the preceding trees, which yields progressively better predictions. XGBoost is an optimized implementation of GB that is efficient and scalable and thus suitable for large and complex datasets. Finally, the Meta Classifier (MC) combines the predictions of the individual models so that the strengths of some compensate for the weaknesses of others, producing a more dependable prediction of landslide susceptibility.
The models were further optimized using ensemble Recursive Feature Elimination (RFE) to identify the features that contribute most to landslide susceptibility. At each iteration, RFE removes the least significant features, which ensures that the models focus on the significant variables and predict with high accuracy. This feature selection method is necessary where the number of potential predictors is large, as in the Sikkim-Himalayan region, where many environmental variables influence the probability of landslides.
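The elimination loop itself is simple to sketch. The example below runs RFE with a single estimator, a small logistic regression fitted by gradient descent, on a tiny synthetic dataset; importance is taken as the absolute coefficient. It is an illustration under stated assumptions, not the study's implementation: the paper applies RFE with six models and combines the results by voting, and the data, learning settings and function names here are hypothetical.

```python
import math


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression by batch gradient descent; return the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, names, n_keep):
    """Refit, drop the feature with the smallest |weight|, repeat.

    Returns a rank per feature: 1 = kept, larger = eliminated earlier.
    """
    active = list(range(len(names)))
    ranking = {}
    while len(active) > n_keep:
        X_active = [[row[j] for j in active] for row in X]
        w = fit_logistic(X_active, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        ranking[names[active[weakest]]] = len(active) - n_keep + 1
        del active[weakest]
    for j in active:
        ranking[names[j]] = 1
    return ranking


# Synthetic, already scaled data: columns are slope, rainfall, noise.
X = [[1.0, 0.5, 0.3], [0.8, -0.2, -0.3], [1.2, 0.6, -0.3], [0.9, 0.1, 0.3],
     [-1.0, -0.4, 0.3], [-0.9, 0.3, -0.3], [-1.1, -0.6, -0.3], [-0.8, -0.1, 0.3]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
ranking = rfe(X, y, ["slope", "rainfall", "noise"], n_keep=1)
print(ranking)
```

The rank convention (1 for retained features, higher numbers for features eliminated earlier) mirrors the ranking and TRUE/FALSE status columns of Table 3.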
Employing a meta-learning framework, and the MC in particular, proved highly valuable. The MC integrates the outputs of several models, taking advantage of the strengths of each and producing a more accurate and dependable prediction than the individual models achieve on their own. This yields a more accurate landslide susceptibility map and a sounder analysis that can be used in practical land-use planning and risk management.
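A minimal sketch of how a meta classifier can combine base-model outputs is given below. It assumes hold-out stacking (sometimes called blending): base models are fitted on one split, and a logistic meta-model is fitted on their predicted probabilities for a second split. For brevity the base learners are single-feature logistic models standing in for the six classifiers of the study. The data, the feature grouping, and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, epochs=500, lr=0.5):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_stack(X_base, y_base, X_meta, y_meta, groups):
    """Level 0: one model per feature group. Level 1: meta-model on level-0 outputs."""
    base_models = [
        fit_logistic([[row[j] for j in g] for row in X_base], y_base) for g in groups
    ]
    # Meta-features are base-model probabilities on data the base models did not see.
    Z = [
        [predict_proba(m, [row[j] for j in g]) for m, g in zip(base_models, groups)]
        for row in X_meta
    ]
    return base_models, fit_logistic(Z, y_meta)


def stack_predict(stack, groups, x):
    base_models, meta_model = stack
    z = [predict_proba(m, [x[j] for j in g]) for m, g in zip(base_models, groups)]
    return predict_proba(meta_model, z)


def sample(n):
    """Synthetic two-feature data; the labelling rule is an assumption for illustration."""
    X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
    return X, [1 if a + b > 0 else 0 for a, b in X]


random.seed(1)
X_base, y_base = sample(200)
X_meta, y_meta = sample(200)
groups = [[0], [1]]  # each base learner sees one feature only

stack = fit_stack(X_base, y_base, X_meta, y_meta, groups)
p_high = stack_predict(stack, groups, [2.0, 2.0])
p_low = stack_predict(stack, groups, [-2.0, -2.0])
print(p_high, p_low)
```

Fitting the meta-model on a separate split keeps it from simply rewarding whichever base model memorized the training data most closely. Cross-validated out-of-fold predictions serve the same purpose when data are scarce.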
The SHAP (SHapley Additive exPlanations) method was used to evaluate the importance of individual features within the landslide prediction models. SHAP values quantify the effect of each feature on the final prediction and so indicate which variables most influence landslide occurrence. The analysis found that the most important factors included elevation, rainfall, slope, and proximity to roads. Elevation was the most important variable, and steep slopes were the most prone to landslides because of their reduced stability.
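The quantity that SHAP estimates can be made concrete with a small sketch that computes exact Shapley values by enumerating feature coalitions. This illustrates the definition only. It is not the estimator used in the study, and practical SHAP implementations rely on faster, model-specific or sampling-based algorithms and on a background dataset rather than a single baseline. The scoring function `toy_score`, the input values, and the zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a single baseline point.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(features):
    """Hypothetical susceptibility score with a slope-rainfall interaction."""
    elevation, rainfall, slope = features
    return 0.5 * elevation + 0.3 * rainfall + 0.2 * slope * rainfall


x = [1.0, 2.0, 1.0]          # elevation, rainfall, slope (arbitrary units)
baseline = [0.0, 0.0, 0.0]   # assumed reference point
phi = shapley_values(toy_score, x, baseline)
print(dict(zip(["elevation", "rainfall", "slope"], phi)))
```

The attributions sum to the difference between the score at `x` and the score at the baseline, and the contribution of the interaction term is shared equally between slope and rainfall.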
The knowledge obtained in this investigation is useful for improving landslide management measures. Identifying key factors such as elevation, rainfall, and land use allows more targeted land-use planning that avoids development in high-risk areas. Engineering solutions that can be applied in highly vulnerable places include retaining walls, drainage works, and slope stabilization methods. Moreover, early warning systems could be established to anticipate landslides and relay information in time to minimize loss of life and damage to property.
Integrating advanced machine learning models, ensemble techniques, and feature selection into landslide prediction has greatly improved accuracy. The present research is a significant step in landslide susceptibility mapping, which is an effective way to delineate areas of differing risk and to prepare disaster management and mitigation plans for geologically sensitive areas such as the Sikkim-Himalayan region. The results may serve as a guideline for future studies and practice in landslide-prone areas worldwide.
Validation
In the assessment of a classification model, especially one applied to landslide susceptibility mapping, the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) are among the most common measures [11, 55, 252, 272]. The ROC curve is a graphical plot that illustrates the performance of a binary classification model through the trade-off between the True Positive Rate (TPR) and the False Positive Rate (FPR) across different thresholds. The AUC is a single number derived from the ROC curve that summarizes overall performance. The validity of the model is determined by comparing the predicted output with real-world observations, to ensure that the model meets its goals and is fit for its intended purpose [174, 175, 176, 189].
In the present paper, the ROC curve was used to validate the ensemble EBF and FR models, and the AUC value indicated the correctness of the classification or forecast [154]. We applied these models to the field survey data as well as Google Earth coordinates, the landslide inventory published by the Geological Survey of India (GSI), and the NASA Global Landslide Catalog. The ROC curve and AUC are useful means of assessing a landslide susceptibility model because they show researchers how accurately the model identifies landslide-prone areas. The ROC curve plots the TPR (sensitivity) on the y-axis against the FPR (1 - specificity) on the x-axis and captures the trade-off between correctly identifying landslide-prone areas and incorrectly flagging safe areas. Sensitivity and specificity are calculated with Eqs. 21 and 22:
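$$\text{Sensitivity} = \text{TPR} = \frac{TP}{TP + FN} \tag{21}$$
$$\text{Specificity} = 1 - \text{FPR} = \frac{TN}{TN + FP} \tag{22}$$
where TP, FN, TN, and FP are the numbers of true positives, false negatives, true negatives, and false positives, respectively.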
These are standard formulae used in the literature (Rasool et al., 2022; Pan et al., 2022; Swets, 1988) to compute the quantities used in ROC analysis, such as the true and false positive rates. Arabameri et al. (2019) and Ali et al. (2020) also discuss sensitivity and specificity as key determinants of model performance.
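As a small worked illustration of sensitivity and specificity, the sketch below counts the confusion-matrix cells at one decision threshold and evaluates TP / (TP + FN) and TN / (TN + FP). The labels, the scores, and the 0.5 threshold are made-up values for demonstration and are not results from this study.

```python
def confusion_counts(y_true, y_score, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are classed as landslide (1)."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """True positive rate: share of landslide points correctly identified."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of non-landslide points correctly identified."""
    return tn / (tn + fp)


# Hypothetical labels (1 = landslide) and predicted susceptibility scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold=0.5)
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(tp, fp, tn, fn, sens, spec)  # 3 1 3 1 0.75 0.75
```

Here one landslide point is missed and one stable point is flagged, so both rates equal 0.75 and the corresponding ROC point is (FPR, TPR) = (0.25, 0.75).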
The performance of the landslide susceptibility model can be evaluated with the AUC score, which ranges from 0 to 1. A score close to 0.5 indicates a model that performs no better than random guessing and needs improvement, whereas a score near 1 indicates a good and reliable model [30]. The ROC curve generated in our study had an AUC above 0.984, which confirms the efficacy of the model. On the basis of these findings, we strongly recommend this validation technique for landslide susceptibility mapping studies (Figs. 11, 12).
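To make the relationship between the ROC curve and the AUC concrete, the following sketch sweeps the decision threshold through every distinct score, records the resulting (FPR, TPR) points, and integrates them with the trapezoidal rule. It is a minimal illustration on four made-up samples, not the validation code of the study. The function names are hypothetical.

```python
def roc_points(y_true, y_score):
    """Return (FPR, TPR) pairs obtained by lowering the threshold score by score."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(y_score, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = landslide) and predicted scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, y_score)
auc = auc_trapezoid(points)
print(points)  # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc)     # 0.75
```

In this toy case one stable point is scored above one landslide point, so three of the four landslide/non-landslide pairs are ranked correctly and the AUC is 0.75. A model that ranked every landslide point above every stable point would reach 1.0.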
Fig. 12 (a) Distribution of Shapley values for all sample features and (b) plot depicting feature importance
Conclusion
This research advances landslide susceptibility modeling in the Namchi region of the Indian state of Sikkim through the application of state-of-the-art geospatial predictive modeling. Using recent machine learning (ML) techniques, a multi-layered framework was constructed to predict landslides in a topographically challenging area with complex environmental parameters. Logistic Regression (LR) served as the baseline model, capturing simple, linear relations, while Support Vector Machine (SVM) was suited to capturing difficult non-linear patterns in higher-dimensional feature spaces. The Random Forest (RF) and Extra Trees (ET) models performed well because they capture complex, non-linear dependencies and are robust against overfitting. Gradient Boosting (GB) and Extreme Gradient Boosting (XGBoost) delivered further gains in accuracy and overcame limitations of the preceding models. Predictive performance was improved further by a Meta Classifier (MC) ensemble approach, which drew on the strengths of all the models, and the analysis provided a detailed interpretation of the factors that contribute to landslides. Feature selection was also optimized with ensemble Recursive Feature Elimination (RFE) to improve model performance in the geologically complex Sub-Himalayan region.
The main lesson of this research is the inherent value of an integrated approach that complements traditional methods rather than replacing them. Although the models produced very accurate results, interpreting their complex predictions was not easy. Tools such as SHAP (SHapley Additive exPlanations) were helpful for post-hoc interpretation; however, further advances are needed in moving from prediction to cause to action. Future work should base predictive models on richer geospatial data, such as LiDAR and InSAR data, to improve their accuracy. The effects of climate change on landslide processes must also be addressed, and factors such as intensified precipitation and land-use change should be introduced into the models so that they better reflect climate-related risks.
To establish the soundness of the proposed framework, it should be validated across regions to assess its usefulness in different terrains, soil types, and climatic conditions. This would demonstrate how well the model generalizes. A further step that may improve both the accuracy and the transparency of the framework is to combine machine learning with physically based models, which would integrate data-driven analysis with the physical processes that drive landslides. Models could also be improved with citizen-generated data on landslide occurrence, which would extend them to regions with limited data while encouraging local engagement in solutions and community resilience. These innovations would contribute to more robust, interpretable, and widely applicable landslide prediction algorithms. Future developments should also aim to balance model accuracy and transparency so that frameworks both meet standards of scientific rigor and serve policy objectives.
This study also has considerable potential at the international level. Landslide risk is significant in many geologically unstable areas around the world, and the model developed here is a scalable solution that can be applied wherever similar concerns exist. By making susceptibility more predictable, this research supports land-use planning, resource allocation, and disaster management strategies that reduce the impact of landslides. The hybrid approach, which combines ensemble learning and feature optimization, provides a method that can be adapted to other areas with complex environmental and geospatial challenges. Finally, the study deepens our knowledge of landslide processes and offers a basis on which machine learning and cross-disciplinary knowledge integration can be used to address important global challenges.
The study also notes that detailed geological mapping and subsurface exploration are of paramount importance for landslide risk assessment. By integrating geospatial information, remote sensing tools, and statistics, we identified the regions most prone to landslides, and geological, hydrological, and anthropogenic aspects should all be taken into account. The main findings show that steep, south-facing slopes, especially along roads and rivers, are among the most vulnerable areas. Moreover, certain geological formations, including rocks of the Daling Group and Gondwana Supergroup, are more susceptible to landslides because of the schist, phyllite, quartzite, coal, and dolomite they contain.
Landslide management requires a multidisciplinary approach that includes engineering, environmental science, and geology. Sustainable solutions need to be planned and implemented through collaboration between researchers, engineers, policymakers, and local communities. Building resilience and anticipating possible landslides also depend on public education and awareness. As long as climate change-related events and urbanization continue to affect landscape processes, it is critical to evaluate and adapt landslide risk mitigation procedures continually. Disaster preparedness can be improved through a better understanding of landslide processes and better predictive models, which lessens the devastating effects of landslides. Moreover, sustainable or green mitigation measures, such as bioengineering, vegetative restoration, and natural slope reinforcement, will be crucial in reducing vulnerability without disturbing the ecology of landslide-prone regions.
Acknowledgements
We extend our heartfelt gratitude to the University Grants Commission (UGC), India, for their generous Junior Research Fellowship, which made this research possible. We also wish to acknowledge the Geological Survey of India (GSI), the India Meteorological Department (IMD), and the United States Geological Survey (USGS) for their invaluable support in data curation.
Author contributions
The author contributions to this research are as follows: Ranjan Roy was responsible for the conceptualization of the study, while Indrajit Poddar played a central role in the experimental design, investigation, and data curation. Indrajit also contributed to writing the original draft, reviewing and editing the manuscript, and creating the visualizations. Both authors collaborated extensively throughout the research process, and after thorough review, they have both read and agreed to the final published version of the manuscript.
Funding
No funding was received for this research.
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Abedini, M; Ghasemian, B; Shirzadi, A; Shahabi, H; Chapi, K; Pham, BT; Ahmad, BB; Bui, DT. A novel hybrid approach of Bayesian logistic regression and its ensembles for landslide susceptibility assessment. Geocarto Int; 2019; 34,
2. Addis, A. GIS-based landslide susceptibility mapping using frequency ratio and Shannon entropy models in Dejen district, Northwestern Ethiopia. J Eng; 2023; 2023, pp. 1-14. [DOI: https://dx.doi.org/10.1155/2023/1062388]
3. Agrawal, N; Dixit, J. Assessment of landslide susceptibility for Meghalaya (India) using bivariate (frequency ratio and Shannon entropy) and multi-criteria decision analysis (AHP and fuzzy-AHP) models. All Earth; 2022; 34,
4. Ahmed, A et al. Extra trees: a robust ensemble learning model for remote sensing applications. J Geophys Res Atmos; 2020; 125,
5. Ahmed, A; Song, W; Zhang, Y; Haque, MA; Liu, X. Hybrid BO-XGBoost and BO-RF models for the strength prediction of self-compacting mortars with parametric analysis. Materials; 2023; 16,
6. Ali, SA; Parvin, F; Pham, QB; Khedher, KM; Dehbozorgi, M; Wahid Rabby, Y; Anh, DT; Nguyen, DH. An ensemble random forest tree with SVM, ANN, NBT, and LMT for landslide susceptibility mapping in the Rangit River watershed, India. Nat Hazards; 2022; 113,
7. Ali, Z; Hayat, MF; Shaukat, K; Alam, TM; Hameed, IA; Luo, S; Basheer, S; Ayadi, M; Ksibi, A. A proposed framework for early prediction of schistosomiasis. Diagnostics; 2022; 12,
8. Alkhasawneh, MS; Ngah, UK; Tay, LT; Isa, NAM; Al-batah, MS. Determination of important topographic factors for landslide mapping analysis using MLP network. Sci World J; 2013; 2013,
9. Alotaibi, E; Nassif, N. Artificial intelligence in environmental monitoring: in-depth analysis. Discov Artif Intell; 2024; [DOI: https://dx.doi.org/10.1007/s44163-024-00198-1]
10. Al-Rimmawi, H. Prediction of type 2 diabetes using logistic regression techniques. Turk J Comput Math Educ.; 2024; [DOI: https://dx.doi.org/10.61841/turcomat.v15i1.13875]
11. Althuwaynee, OF; Pradhan, B; Park, H-J; Lee, JH. A novel ensemble bivariate statistical evidential belief function with knowledge-based analytical hierarchy process and multivariate statistical logistic regression for landslide susceptibility mapping. CATENA; 2014; 114, pp. 21-36. [DOI: https://dx.doi.org/10.1016/j.catena.2013.10.011]
12. Anbalagan, R; Kumar, R; Lakshmanan, K; Parida, S; Neethu, S. Landslide hazard zonation mapping using frequency ratio and fuzzy logic approach, a case study of Lachung Valley, Sikkim. Geoenviron Disasters; 2015; [DOI: https://dx.doi.org/10.1186/s40677-014-0009-y]
13. Anbarasu, K; Sengupta, A; Gupta, S; Sharma, SP. Mechanism of activation of the Lanta Khola landslide in Sikkim Himalayas. Landslides; 2010; 7,
14. Andresz, S; Zéphir, A; Bez, J; Karst, M; Danieli, J. Artificial intelligence and radiation protection. A game changer or an update?. Radioprotection; 2022; 57,
15. Arabameri, A; Pradhan, B; Rezaei, K; Lee, S; Sohrabi, M. An ensemble model for landslide susceptibility mapping in a forested area. Geocarto Int; 2020; 35,
16. Arslan, AK; Yagin, FH; Algarni, A; Karaaslan, E; Al-Hashem, F; Ardigò, LP. Enhancing type 2 diabetes mellitus prediction by integrating metabolomics and tree-based boosting approaches. Front Endocrinol; 2024; 15, [DOI: https://dx.doi.org/10.3389/fendo.2024.1444282] 1444282.
17. Awaji, B; Senan, EM; Olayah, F; Alshari, EA; Alsulami, M; Abosaq, HA; Alqahtani, J; Janrao, P. Hybrid techniques of facial feature image analysis for early detection of autism spectrum disorder based on combined CNN features. Diagnostics; 2023; 13,
18. Bagheri, SAM; Mojaradi, B; Kamboozia, N; Faizi, M. Analyzing the effects of streetscape and land use on urban accidents and predicting future accidents by using machine learning algorithms. Heliyon; 2024; 10,
19. Baral, P; Haq, MA. Spatial prediction of permafrost occurrence in Sikkim Himalayas using logistic regression, random forests, support vector machines and neural networks. Geomorphology; 2020; 371, [DOI: https://dx.doi.org/10.1016/j.geomorph.2020.107331] 107331.
20. Batar, AK; Watanabe, T. Landslide susceptibility mapping and assessment using geospatial platforms and weights of evidence (WoE) method in the Indian Himalayan region: recent developments, gaps, and future directions. ISPRS Int J Geo-Inf; 2021; 10,
21. Bello Yamusa I, Ismail MS. Integration of lineament and strain analysis to assess landslide vulnerability along Taiping to Ipoh highway, Malaysia. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLVIII-4/W6-2022, 57–74. 2023. https://doi.org/10.5194/isprs-archives-XLVIII-4-W6-2022-57-2023.
22. Bera, A; Mukhopadhyay, BP; Das, D. Landslide hazard zonation mapping using multi-criteria analysis with the help of GIS techniques: a case study from eastern Himalayas, Namchi, South Sikkim. Nat Hazards; 2019; 96,
23. Beven, KJ; Kirkby, MJ. A physically based, variable contributing area model of basin hydrology / Un modèle à base physique de zone d’appel variable de l’hydrologie du bassin versant. Hydrol Sci J; 1979; 24,
24. Bhasin, R; Grimstad, E; Larsen, J; Dhawan, AK; Singh, R; Verma, S; Venkatachalam, K. Landslide hazards and mitigation measures at Gangtok, Sikkim Himalaya. Eng Geol; 2002; 64,
25. Bhutia, KDO; Manna, H; Guria, R; Santos, CAG; Sarkar, S; Silva, RMda; Laksono, FAT; Mishra, M. Exploring shifting patterns of land use and land cover dynamics in the Khangchendzonga Biosphere Reserve (1992–2032): a geospatial forecasting approach. Environ Monit Assess; 2025; 197,
26. Blagus, R; Lusa, L. Boosting for high-dimensional two-class prediction. BMC Bioinformatics; 2015; 16,
27. Bogaard, TA; Greco, R. Landslide hydrology: from hydrology to pore pressure. WIREs Water; 2016; 3,
28. Bogaard, T; Greco, R. Invited perspectives: hydrological perspectives on precipitation intensity-duration thresholds for landslide initiation: proposing hydro-meteorological thresholds. Nat Hazards Earth Syst Sci; 2018; 18,
29. Bombrun, M; Dash, JP; Pont, D; Watt, MS; Pearse, GD; Dungey, HS. Forest-scale phenotyping: productivity characterisation through machine learning. Front Plant Sci; 2020; 11, [DOI: https://dx.doi.org/10.3389/fpls.2020.00099] 99.
30. Bowers, AJ; Zhou, X. Receiver operating characteristic (ROC) area under the curve (AUC): a diagnostic measure for evaluating the accuracy of predictors of education outcomes. J Educ Stud Placed Risk (JESPAR); 2019; 24,
31. Breiman, L. Random Forests. Mach Learn; 2001; 45,
32. Brown G, et al. Stacking and meta-learning in machine learning models. In Ensemble Machine Learning: Methods and Applications (pp. 35–53). Springer. 2001. https://doi.org/10.1007/978-1-4419-5867-2_2.
33. Bucci, F; Santangelo, M; Cardinali, M; Fiorucci, F; Guzzetti, F. Landslide distribution and size in response to Quaternary fault activity: the Peloritani Range, NE Sicily, Italy. Earth Surf Process Landforms; 2016; 41,
34. Capitani, M; Ribolini, A; Bini, M. The slope aspect: a predisposing factor for landsliding?. Comptes Rendus Géoscience; 2013; 345,
35. Casagli, N; Dapporto, S; Ibsen, ML; Tofani, V; Vannocci, P. Analysis of the landslide triggering mechanism during the storm of 20th–21st November 2000, in Northern Tuscany. Landslides; 2006; 3,
36. Chauhan, S; Sharma, M; Arora, MK. Landslide susceptibility zonation of the Chamoli region, Garhwal Himalayas, using logistic regression model. Landslides; 2010; 7,
37. Chauhan, V; Gupta, L; Dixit, J. Landslide susceptibility assessment for Uttarakhand, a Himalayan state of India, using multi-criteria decision making, bivariate, and machine learning models. Geoenviron Disasters; 2025; 12,
38. Chen, C; Fan, L. Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models. Stoch Environ Res Risk Assess; 2023; [DOI: https://dx.doi.org/10.1007/s00477-023-02556-4]
39. Chen C, Fan L. Interpretability of statistical, machine learning, and deep learning models for landslide susceptibility mapping in Three Gorges Reservoir area. 2024. arXiv. https://doi.org/10.48550/arXiv.2405.11762.
40. Chen, G; Suhail, SA; Bahrami, A; Sufian, M; Azab, M. Machine learning-based evaluation of parameters of high-strength concrete and raw material interaction at elevated temperatures. Front Mater; 2023; 10, [DOI: https://dx.doi.org/10.3389/fmats.2023.1187094] 1187094.
41. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016;785–794. https://doi.org/10.1145/2939672.2939785.
42. Chen, W; Zhao, X; Shahabi, H; Shirzadi, A; Khosravi, K; Chai, H; Zhang, S; Zhang, L; Ma, J; Chen, Y; Wang, X; Li, R. Spatial prediction of landslide susceptibility by combining evidential belief function, logistic regression and logistic model tree. Geocarto Int; 2019; 34,
43. Cheng, W; Wang, N; Zhao, M; Zhao, S. Relative tectonics and debris flow hazards in the Beijing mountain area from DEM-derived geomorphic indices and drainage analysis. Geomorphology; 2016; 257, pp. 134-142. [DOI: https://dx.doi.org/10.1016/j.geomorph.2016.01.003]
44. Clark, KE; West, AJ; Hilton, RG; Asner, GP; Quesada, CA; Silman, MR; Saatchi, SS; Farfan-Rios, W; Martin, RE; Horwath, AB; Halladay, K; New, M; Malhi, Y. Storm-triggered landslides in the Peruvian Andes and implications for topography, carbon cycles, and biodiversity. Earth Surf Dyn; 2016; 4,
45. Cortes, C; Vapnik, V. Support Vector Networks. Mach Learn; 1995; 20,
46. Cui, H-Z; Tong, B; Wang, T; Dou, J; Ji, J. A hybrid data-driven approach for rainfall-induced landslide susceptibility mapping: Physically-based probabilistic model with convolutional neural network. J Rock Mech Geotech Eng; 2024; [DOI: https://dx.doi.org/10.1016/j.jrmge.2024.08.005]
47. Cui, H-Z; Tong, B; Wang, T; Dou, J; Ji, J. A hybrid data-driven approach for rainfall-induced landslide susceptibility mapping: physically-based probabilistic model with convolutional neural network. J Rock Mech Geotech Eng; 2025; 17,
48. Cui, K; Lu, D; Li, W. Comparison of landslide susceptibility mapping based on statistical index, certainty factors, weights of evidence and evidential belief function models. Geocarto Int; 2017; 32,
49. Dahal, RK. Rainfall-induced landslides in Nepal. Int J Eros Control Eng; 2012; 5,
50. Dahigamuwa, T; Yu, Q; Gunaratne, M. Feasibility study of land cover classification based on normalized difference vegetation index for landslide risk assessment. Geosciences; 2016; 6,
51. Das, R; Wegmann, KW; Tien, PV. Event-based mapping and spatial pattern analysis of landslides in parts of central Vietnam. Quat Sci Adv; 2024; 13, [DOI: https://dx.doi.org/10.1016/j.qsa.2023.100150] 100150.
52. Debnath, M; Poddar, I; Saha, A; Islam, N; Roy, R. Analysis of land surface temperature and its correlation with urban, water, vegetation, and surface indices in the Siliguri Urban Agglomeration Area of the Himalayan Foothill Region. Geomat Nat Hazards Risk; 2025; 16,
53. Dey, D; Haque, MS; Islam, MM; Aishi, UI; Shammy, SS; Mayen, MSA; Noor, STA; Uddin, MJ. The proper application of logistic regression model in complex survey data: a systematic review. BMC Med Res Methodol; 2025; 25,
54. Dietterich TG. Ensemble Methods in Machine Learning. In International Workshop on Multiple Classifier Systems (pp. 1–15). Springer. 2000. https://doi.org/10.1007/3-540-45014-9_1.
55. Ding, Q; Chen, W; Hong, H. Application of frequency ratio, weights of evidence and evidential belief function models in landslide susceptibility mapping. Geocarto Int; 2016; [DOI: https://dx.doi.org/10.1080/10106049.2016.1165294]
56. Dolatyabi, P. Ensemble machine learning models for predictive analysis: application to seismic ground motion data. World J Adv Res Rev; 2025; 27,
57. Du, Y; Liu, Y; Yan, Y; Fang, J; Jiang, X. Risk management of weather-related failures in distribution systems based on interpretable extra-trees. J Mod Power Syst Clean Energy; 2023; 11,
58. Dudek, G. Ensemble learning by stacking for deterministic and probabilistic short-term load forecasting. Res Square.; 2023; [DOI: https://dx.doi.org/10.21203/rs.3.rs-3697164/v1]
59. Eko Budianto, D; Limbong Pamuttu, D; Hairulla, H; Andang Pasalli, D. Geotextile reinforcement model laboratory test on silt soil. Tech Rom J Appl Sci Technol; 2023; 17, pp. 46-51. [DOI: https://dx.doi.org/10.47577/technium.v17i.10045]
60. Emberson, R; Galy, A; Hovius, N. Weathering of reactive mineral phases in landslides acts as a source of carbon dioxide in mountain belts. J Geophys Res Earth Surf; 2018; 123,
61. Emmer, A; Hölbling, D; Abad, L; Štěpánek, P; Zahradníček, P; Emmerová, I. Landslides associated with recent road constructions in the Río Lucma catchment, Eastern Cordillera Blanca, Peru. An Acad Bras Cienc; 2022; 94,
62. Eraku, SS; Permana, AP; Baruadi, MN. Landslide potential analysis using unmanned aerial vehicle in South Leato Village, Gorontalo City, Indonesia. Nat Environ Pollut Technol; 2023; 22,
63. Evans, JR; Imaizumi, F; Ohsaka, O; Ogawa, S. Relationship between tree height and landslide characteristics obtained by GIS assessment. Earth Surf Process Landforms; 2020; 45,
64. Fang, G; Xu, P; Liu, W. Automated ischemic stroke subtyping based on machine learning approach. IEEE Access; 2020; 8, pp. 118426-118432. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.3004977]
65. Fang, J; Xu, R; Liu, Q. An improved Extra Trees model for estimating the severity of forest fires based on remote sensing data. Sci Total Environ; 2020; 706, [DOI: https://dx.doi.org/10.1016/j.scitotenv.2019.135891] 135891.
66. Fang, Z; Wang, Y; Peng, L; Hong, H. A comparative study of heterogeneous ensemble-learning techniques for landslide susceptibility mapping. Int J Geogr Inf Sci; 2021; 35,
67. Feby, B; Achu, AL; Jimnisha, K; Ayisha, VA; Reghunath, R. Landslide susceptibility modelling using integrated evidential belief function-based logistic regression method: a study from southern Western Ghats, India. Remote Sens Appl Soc Environ; 2020; 20, [DOI: https://dx.doi.org/10.1016/j.rsase.2020.100411] 100411.
68. Ferrara, C; Barone, PM; Salvati, L. Unravelling landslide risk: soil susceptibility, agro-forest systems and the socio-economic profile of rural communities in Italy. Soil Use Manage; 2015; 31,
69. Froude, MJ; Petley, DN. Global fatal landslide occurrence from 2004 to 2016. Nat Hazards Earth Syst Sci; 2018; 18,
70. Gakaev, R. Exogenous relief-forming processes and phenomena on the territory of the Chechen Republic. SHS Web Conf; 2021; 128, 03001. [DOI: https://dx.doi.org/10.1051/shsconf/202112803001]
71. Ganaie, MA; Hu, M; Malik, AK; Tanveer, M; Suganthan, PN. Ensemble deep learning: a review. Eng Appl Artif Intell; 2022; 115, [DOI: https://dx.doi.org/10.1016/j.engappai.2022.105151] 105151.
72. Gao, P; Belletti, B; Piégay, H; You, Y; Li, Z. Can water-detection indices be reliable proxies for water discharges in mid-sized braided rivers using coarse-resolution Landsat archives?. Remote Sens; 2023; 16,
73. García-Rodríguez, MJ; Malpica, JA; Benito, B; Díaz, M. Susceptibility assessment of earthquake-triggered landslides in El Salvador using logistic regression. Geomorphology; 2008; 95,
74. Geurts, P et al. Extremely randomized trees. Mach Learn; 2006; 63,
75. Ghimire, M; Timalsina, N. Landslide distribution and processes in the hills of central Nepal: geomorphic and statistical approach to susceptibility assessment. J Geoscience Environ Prot; 2020; 08,
76. Ghimire, S et al. Machine learning for landslide susceptibility mapping: a systematic review. Landslides; 2021; 18,
77. Giorgi, F et al. Machine learning for climate change prediction and impact assessment. Nat Commun; 2020; 11,
78. Go, VDM. Communicable disease surveillance through predictive analysis: a comparative analysis of prediction models. Ho Chi Minh City Open Univ J Sci Eng Technol; 2023; 13,
79. Golshanrad, P; Rahmani, H; Karimian, B; Karimkhani, F; Weiss, G. MEGA: predicting the best classifier combination using meta-learning and a genetic algorithm. Intell Data Anal; 2021; 25,
80. Grinsztajn L, Oyallon E, Varoquaux G. Why do tree-based models still outperform deep learning on tabular data? 2022. arXiv. https://doi.org/10.48550/arXiv.2207.08815.
81. Guo, K; Zhu, B; Zha, L; Shao, Y; Liu, Z; Gu, N; Chen, K. Interpretable prediction of stroke prognosis: SHAP for SVM and nomogram for logistic regression. Front Neurol; 2025; 16, [DOI: https://dx.doi.org/10.3389/fneur.2025.1522868] 1522868.
82. Halder, K; Srivastava, AK; Ghosh, A; Das, S; Banerjee, S; Pal, SC; Chatterjee, U; Bisai, D; Ewert, F; Gaiser, T. Improving landslide susceptibility prediction through ensemble recursive feature elimination and meta-learning framework. Sci Rep; 2025; [DOI: https://dx.doi.org/10.1038/s41598-025-87587-3]
83. Hamada, MS; Zaqoot, HA; Sethar, WA. Using a supervised machine learning approach to predict water quality at the Gaza wastewater treatment plant. Environ Sci Adv; 2024; 3,
84. Hanifinia, A; Nazarnejad, H; Najafi, S; Kornejady, A; Pourghasemi, HR. Landslide susceptibility assessment and mapping using statistical and data mining models in Iran. Res Square.; 2021; [DOI: https://dx.doi.org/10.21203/rs.3.rs-239985/v1]
85. Haq, I; Aidi, MN; Kurnia, A; Efriwati, E. A comparison of logistic regression and geographically weighted logistic regression (GWLR) on COVID-19 data in West Sumatra. BAREKENG Jurnal Ilmu Matematika Dan Terapan; 2023; 17,
86. He, Q; Jiang, Z; Wang, M; Liu, K. Landslide and wildfire susceptibility assessment in Southeast Asia using ensemble machine learning methods. Remote Sens; 2021; 13,
87. He Z, Lin D, Lau T, Wu M. Gradient boosting machine: a survey. 2019. arXiv. https://doi.org/10.48550/arXiv.1905.11883.
88. Hosmer, DW; Lemeshow, S. Applied logistic regression; 2000; 2 Wiley-Interscience: [DOI: https://dx.doi.org/10.1002/0471722146]
89. Hossain, MA; Ahammad, I; Ahmed, MK; Imtiaz, A. Prediction of the computer science department’s educational performance through machine learning model by analyzing students’ academic statements. Artif Intell Evol; 2023; [DOI: https://dx.doi.org/10.37256/aie.4120232569]
90. Hossain, Z et al. Applications of Extra-Trees in various sectors: a review. J Big Data; 2023; 10,
91. Howell-Moroney, M. Inconvenient truths about logistic regression and the remedy of marginal effects. Public Adm Rev; 2024; 84,
92. Hu, X; Bürgmann, R. Rheology of a debris slide from the joint analysis of UAVSAR and LiDAR data. Geophys Res Lett; 2020; 47,
93. Hu, X; Mei, H; Zhang, H; Li, Y; Li, M. Performance evaluation of ensemble learning techniques for landslide susceptibility mapping at the Jinping County, Southwest China. Nat Hazards; 2021; 105,
94. Huang, D; Wang, S; Liu, Z. A systematic review of prediction methods for emergency management. Int J Disaster Risk Reduction; 2021; 62, 102412. [DOI: https://dx.doi.org/10.1016/j.ijdrr.2021.102412]
95. Huang, F; Chen, J; Liu, W; Huang, J; Hong, H; Chen, W. Regional rainfall-induced landslide hazard warning based on landslide susceptibility mapping and a critical rainfall threshold. Geomorphology; 2022; 408, [DOI: https://dx.doi.org/10.1016/j.geomorph.2022.108236] 108236.
96. Iverson, RM. Landslide triggering by rain infiltration. Water Resour Res; 2000; 36,
97. Jacobs, L; Dewitte, O; Poesen, J; Sekajugo, J; Nobile, A; Rossi, M; Thiery, W; Kervyn, M. Field-based landslide susceptibility assessment in a data-scarce environment: the populated areas of the Rwenzori Mountains. Landslides; 2017; 14,
98. Jiang, J; Xiang, W; Rohn, J; Zeng, W; Schleier, M. Research on water–rock (soil) interaction by dynamic tracing method for Huangtupo landslide, Three Gorges Reservoir, PR China. Environ Earth Sci; 2015; 74,
99. Jiao, C; Wang, S; Song, S; Fu, B. Long-term and seasonal variation of open-surface water bodies in the Yellow River Basin during 1990–2020. Hydrol Process; 2023; 37,
100. Joshi, DC; Kayastha, RB; Shrestha, KL; Kayastha, RB. A hybrid approach to enhance streamflow simulation in data-constrained Himalayan basins: combining the glacio-hydrological degree-day model and recurrent neural networks. Proc IAHS; 2024; 387, 17. [DOI: https://dx.doi.org/10.5194/piahs-387-17-2024]
101. Jourgholami, M; Labelle, ER. Effects of plot length and soil texture on runoff and sediment yield occurring on machine-trafficked soils in a mixed deciduous forest. Ann For Sci; 2020; 77,
102. Ju, X; Tian, Y; Liu, D; Qi, Z. Nonparallel hyperplanes support vector machine for multi-class classification. Procedia Comput Sci; 2015; 51, pp. 1574-1582. [DOI: https://dx.doi.org/10.1016/j.procs.2015.05.287]
103. Kadiyala, A; Kumar, A. Applications of Python to evaluate the performance of decision tree-based boosting algorithms. Environ Prog Sustain Energy; 2018; 37,
104. Kainthura, P; Sharma, N. Hybrid machine learning approach for landslide prediction, Uttarakhand, India. Sci Rep; 2022; [DOI: https://dx.doi.org/10.1038/s41598-022-22814-9]
105. Kalambukattu, JG; Kumar, S; Das, B; Roy, T. Digital mapping of soil organic carbon in the hilly and mountainous landscape of Indian Himalayan region employing machine-learning techniques. Discov Soil; 2025; [DOI: https://dx.doi.org/10.1007/s44378-025-00060-5]
106. Kamp, U; Growley, BJ; Khattak, GA; Owen, LA. GIS-based landslide susceptibility mapping for the 2005 Kashmir earthquake region. Geomorphology; 2008; 101,
107. Katuwal, R; Suganthan, PN; Zhang, L. Heterogeneous oblique random forest. Pattern Recognit; 2020; 99, 107078. [DOI: https://dx.doi.org/10.1016/j.patcog.2019.107078]
108. Kavzoğlu, T; Teke, A. Predictive performances of ensemble machine learning algorithms in landslide susceptibility mapping using random forest, extreme gradient boosting (XGBoost) and natural gradient boosting (NGBoost). Arab J Sci Eng; 2022; 47,
109. Keating, KA; Cherry, S. Use and interpretation of logistic regression in habitat-selection studies. J Wildlife Manage; 2004; 68,
110. Khiari J, Olaverri-Monreal C. Boosting algorithms for delivery time prediction in transportation logistics. In 2020 International Conference on Data Mining Workshops (ICDMW) (pp. 251–258). IEEE. 2020. https://doi.org/10.1109/ICDMW51313.2020.00053.
111. Kim, J; Kim, I; Choi, B. Development of soil erosion susceptibility model using UAV photogrammetry in a timber harvesting area, South Korea. Geosciences; 2023; 13,
112. Koley, B; Nath, A; Bhattacharya, S; Saraswati, S; Ray, BC. GIS based landslide hazard zonation mapping by weighted overlay method on the road corridor of North Sikkim Himalayas, India. J Geol Soc India; 2020; 97,
113. Kubwimana, D; Ait Brahim, L; Nkurunziza, P; Dille, A; Depicker, A; Nahimana, L; Abdelouafi, A; Dewitte, O. Characteristics and distribution of landslides in the populated hillslopes of Bujumbura, Burundi. Geosciences; 2021; 11,
114. Kulsoom, I; Hua, W; Hussain, S; Chen, Q; Khan, G; Shihao, D. SBAS-InSAR based validated landslide susceptibility mapping along the Karakoram Highway: a case study of Gilgit-Baltistan, Pakistan. Sci Rep; 2023; 13,
115. Kusuma, RJ; Meilano, I; Sadisun, IA; Fitri, IH. Spatial analysis of causative factors for landslide susceptibility on Java Island. IOP Conf Ser Earth Environ Sci; 2023; 1276,
116. Lan, Q; Tang, J; Mei, X; Yang, X; Liu, Q; Xu, Q. Hazard assessment of rainfall-induced landslide considering the synergistic effect of natural factors and human activities. Sustainability; 2023; 15,
117. Le T, Tran D, Ma W, Pham T, Duong P, Nguyen M. Robust support vector machine. In 2014 International Joint Conference on Neural Networks (IJCNN) (pp. 4137–4144). IEEE. 2014. https://doi.org/10.1109/IJCNN.2014.6888017.
118. Le, X-H; Eu, S; Choi, C; Nguyen, DH; Yeon, M; Lee, G. Machine learning for high-resolution landslide susceptibility mapping: case study in Inje County, South Korea. Front Earth Sci; 2023; [DOI: https://dx.doi.org/10.3389/feart.2023.1268501]
119. Lebedev, AV; Westman, E; Van Westen, GJP; Kramberger, MG; Lundervold, A; Aarsland, D; Soininen, H; Kłoszewska, I; Mecocci, P; Tsolaki, M; Vellas, B; Lovestone, S; Simmons, A. Random forest ensembles for detection and prediction of Alzheimer’s disease with a good between-cohort robustness. NeuroImage Clin; 2014; 6, pp. 115-125. [DOI: https://dx.doi.org/10.1016/j.nicl.2014.08.023]
120. Lee, CT. Landslide trends under extreme climate events. Terr Atmos Ocean Sci; 2017; 28,
121. Li, C; An, X; Li, R. A chaos embedded GSA-SVM hybrid system for classification. Neural Comput Appl; 2015; 26,
122. Li, Y; Chen, W. Landslide susceptibility evaluation using hybrid integration of evidential belief function and machine learning techniques. Water; 2019; 12,
123. Liang, Z; Liu, W; Peng, W; Chen, L; Wang, C. Improved shallow landslide susceptibility prediction based on statistics and ensemble learning. Sustainability; 2022; 14,
124. Liu, CC; Luo, W; Chen, MC; Lin, YT; Wen, HL. A new region-based preparatory factor for landslide susceptibility models: the total flux. Landslides; 2016; 13,
125. Liu, C; Li, W; Wu, H; Lu, P; Sang, K; Sun, W; Chen, W; Hong, Y; Li, R. Susceptibility evaluation and mapping of China’s landslides based on multi-source data. Nat Hazards; 2013; 69,
126. Liu R, Nie X. A new combination of water index model based on TM image. DEStech Transactions on Computer Science and Engineering (icaic). 2019. https://doi.org/10.12783/dtcse/icaic2019/29434.
127. Liu, W; Bai, R; Sun, X; Yang, F; Zhai, W; Su, X. Rainfall- and irrigation-induced landslide mechanisms in loess slopes: an experimental investigation in Lanzhou, China. Atmosphere; 2024; 15,
128. Liu, Z; Pontius, RG. The total operating characteristic from stratified random sampling with an application to flood mapping. Remote Sens; 2021; 13,
129. Luo, S; Chen, T. Two derivative algorithms of gradient boosting decision tree for silicon content in blast furnace system prediction. IEEE Access; 2020; 8, pp. 196112-196122. [DOI: https://dx.doi.org/10.1109/ACCESS.2020.3034566]
130. Luo, X; Lin, F; Chen, Y; Zhu, S; Xu, Z; Huo, Z; Yu, M; Peng, J. Coupling logistic model tree and random subspace to predict the landslide susceptibility areas with considering the uncertainty of environmental features. Sci Rep; 2019; 9,
131. Ma, N; Halley, S; Ramaiyan, K; Garzon, F; Tsui, LK. Comparison of machine learning algorithms for natural gas identification with mixed potential electrochemical sensor arrays. ECS Sens Plus; 2023; 2,
132. Ma, S; Shao, X; Xu, C. Landslides triggered by the 2016 heavy rainfall event in Sanming, Fujian Province: distribution pattern analysis and spatio-temporal susceptibility assessment. Remote Sens; 2023; 15,
133. Mallick, J; Alqadhi, S; Talukdar, S; Alsubih, M; Ahmed, M; Khan, RA; Kahla, NB; Abutayeh, SM. Risk assessment of resources exposed to rainfall-induced landslide with the development of GIS and RS-based ensemble metaheuristic machine learning algorithms. Sustainability; 2021; 13,
134. Mameno, T; Wada, M; Nozaki, K; Takahashi, T; Tsujioka, Y; Akema, S; Hasegawa, D; Ikebe, K. Predictive modeling for peri-implantitis by using machine learning techniques. Sci Rep; 2021; 11,
135. Mandal, K; Saha, S; Mandal, S. Applying deep learning and benchmark machine learning algorithms for landslide susceptibility modelling in Rorachu river basin of Sikkim Himalaya, India. Geosci Front; 2021; 12,
136. Marc, O; Behling, R; Andermann, C; Turowski, JM; Illien, L; Roessner, S; Hovius, N. Long-term erosion of the Nepal Himalayas by bedrock landsliding: the role of monsoons, earthquakes, and giant landslides. Earth Surf Dyn; 2019; 7,
137. May, C; Roering, J; Eaton, LS; Burnett, KM. Controls on valley width in mountainous landscapes: the role of landsliding and implications for salmonid habitat. Geology; 2013; 41,
138. McDuie-Ra, D; Chettri, M. Himalayan boom town: rural–urban transformations in Namchi, Sikkim. Dev Change; 2018; 49,
139. Meena, SR; Ghorbanzadeh, O; Blaschke, T. A comparative study of statistics-based landslide susceptibility models: a case study of the region affected by the Gorkha earthquake in Nepal. ISPRS Int J Geo-Inf; 2019; 8,
140. Megahed, K; Mahmoud, N; Abd-Rabou, SEM. Application of machine learning models in the capacity prediction of RCFST columns. Sci Rep; 2023; [DOI: https://dx.doi.org/10.1038/s41598-023-48044-1]
141. Meharie, MG; Mengesha, WJ; Gariy, ZA; Mutuku, RNN. Application of stacking ensemble machine learning algorithm in predicting the cost of highway construction projects. Eng Constr Archit Manage; 2022; 29,
142. Meng, L; Treem, W; Heap, G; Chen, J. Predicting clinical outcomes of alpha-1 antitrypsin deficiency-associated liver disease using a stacking ensemble machine learning model based on UK Biobank data. Sci Rep; 2022; 12,
143. Merghadi, A; Yunus, AP; Dou, J; Whiteley, J; Pham, BT; Bui, DT; Avtar, R; Boumezbeur, A. Machine learning methods for landslide susceptibility studies: a comparative overview of algorithm performance. Earth-Sci Rev; 2020; 207, 103225. [DOI: https://dx.doi.org/10.1016/j.earscirev.2020.103225]
144. Mienye, ID; Sun, Y. A survey of ensemble learning: concepts, algorithms, applications, and prospects. IEEE Access; 2022; 10, pp. 99129-99149. [DOI: https://dx.doi.org/10.1109/ACCESS.2022.3207287]
145. Mishra, PK; Rai, A; Abdelrahman, K; Chand, S; Tiwari, A. Analysing challenges and strategies in land productivity in Sikkim Himalaya, India. Sustainability; 2021; 13,
146. Mitchell R, Adinets A, Rao T, Frank E. XGBoost: Scalable GPU Accelerated Learning. arXiv (Cornell University). 2018. https://doi.org/10.48550/arxiv.1806.11248.
147. Moazzam, MFU; Vansarochana, A; Boonyanuphap, J; Choosumrong, S; Rahman, G; Djuueyep, GP. Spatio-statistical comparative approaches for landslide susceptibility modeling: case of Mae Phun, Uttaradit Province, Thailand. SN Appl Sci; 2020; 2,
148. Monga D, Ganguli P. Moisture-Driven Landslides and Cascade Hazards in the Himalayan Region: A Synthesis on Predictive Assessment. In Advances in natural and technological hazards research (p. 267). Springer Nature (Netherlands). 2024. https://doi.org/10.1007/978-3-031-56591-5_10.
149. Montgomery, DR; Dietrich, WE. A physically based model for the topographic control on shallow landsliding. Water Resour Res; 1994; 30,
150. Montrasio, L; Valentino, R; Losi, GL. Towards a real-time susceptibility assessment of rainfall-induced shallow landslides on a regional scale. Nat Hazards Earth Syst Sci; 2011; 11,
151. Mood, C. Logistic regression: why we cannot do what we think we can do, and what we can do about it. Eur Sociol Rev; 2010; 26,
152. Moore, ID; Grayson, RB; Ladson, AR. Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrol Process; 1991; 5,
153. Morad, D; Lyakhovsky, V; Hatzor, YH; Sagy, A. Stress heterogeneity and the onset of faulting along geometrically irregular faults. Geophys Res Lett; 2022; 49,
154. Muschelli, J. ROC and AUC with a binary predictor: a potentially misleading metric. J Classif; 2020; 37,
155. Nathania, B; Muira, F. Remote sensing and GIS approach for landslide susceptibility mapping: a case study in Hofu City, Yamaguchi, Japan. Int J Environ Geosci; 2017; [DOI: https://dx.doi.org/10.24843/ijeg.2017.v01.i01.p04]
156. Naznin, S; Uddin, MJ; Kabir, A. Identifying determinants of under-5 mortality in Bangladesh: a machine learning approach with BDHS 2022 data. PLoS ONE; 2025; 20,
157. Nguyen, LDH; Nguyen, DK. Using Landsat satellite images for assessing riverbank changes in the Mekong and Bassac Rivers in the An Giang Province. Sci Technol Dev J Eng Technol; 2020; [DOI: https://dx.doi.org/10.32508/stdjet.v2iSI2.444]
158. Nguyen, TS; Yang, K-H; Wu, Y-K; Teng, F; Chao, W-A; Lee, W-L. Post-failure process and kinematic behavior of two landslides: case study and material point analyses. Comput Geotech; 2022; 148, 104797. [DOI: https://dx.doi.org/10.1016/j.compgeo.2022.104797]
159. Nguyen, TV; Nguyen, Q-V; Nguyen, HTT; Nguyen, NT; Luong, KD; Thị, LP; Nguyen, TC; Vo, T; Le, PH; Tran, PT; Le, TD. STEMI-OP in-hospital mortality prediction algorithms: frailty-integrated machine learning in older patients undergoing primary PCI. NPJ Aging; 2025; 11,
160. Nocentini, N; Rosi, A; Piciullo, L; Liu, Z; Segoni, S; Fanti, R. Regional-scale spatiotemporal landslide probability assessment through machine learning and potential applications for operational warning systems: a case study in Kvam (Norway). Landslides; 2024; [DOI: https://dx.doi.org/10.1007/s10346-024-02287-9]
161. Nseka, D; Kakembo, V; Bamutaze, Y; Mugagga, F. Analysis of topographic parameters underpinning landslide occurrence in Kigezi Highlands of southwestern Uganda. Nat Hazards; 2019; 99,
162. Oh, H-J; Kadavi, PR; Lee, C-W; Lee, S. Evaluation of landslide susceptibility mapping by evidential belief function, logistic regression, and support vector machine models. Geomat Nat Hazards Risk; 2018; 9,
163. Omar, ED; Mat, H; Karim, AZA; Sanaudi, R; Ibrahim, FH; Omar, MA; Ismail, MZH; Hafiz, MZHI; Jayaraj, VJ; Goh, BL. Comparative analysis of logistic regression, gradient boosted trees, SVM, and random forest algorithms for prediction of acute kidney injury requiring dialysis after cardiac surgery. Int J Nephrol Renovasc Dis; 2024; 17, pp. 197-204. [DOI: https://dx.doi.org/10.2147/ijnrd.s461028]
164. Othman, A; Gloaguen, R. River courses affected by landslides and implications for hazard assessment: a high resolution remote sensing case study in NE Iraq–W Iran. Remote Sens; 2013; 5,
165. Owen, LA; Kamp, U; Khattak, GA; Harp, EL; Keefer, DK; Bauer, MA. Landslides triggered by the 8 October 2005 Kashmir earthquake. Geomorphology; 2008; 94,
166. Pantha, BR; Yatabe, R; Bhandary, NP. GIS-based landslide susceptibility zonation for roadside slope repair and maintenance in the Himalayan region. Episodes; 2008; 31,
167. Park, H. An introduction to logistic regression: from basic concepts to interpretation with particular attention to nursing domain. J Korean Acad Nurs; 2013; 43,
168. Patra, P; Devi, R. Assessment, prevention and mitigation of landslide hazard in the Lesser Himalaya of Himachal Pradesh. Environ Socio-Econ Stud; 2015; 3,
169. Pei, Y; Qiu, H; Hu, S; Yang, D; Zhang, Y; Ma, S; Cao, M. Appraisal of tectonic-geomorphic features in the Hindu Kush-Himalayas. Earth Space Sci; 2021; 8,
170. Petley, DN; Hearn, GJ; Hart, A; Rosser, NJ; Dunning, SA; Oven, K; Mitchell, WA. Trends in landslide occurrence in Nepal. Nat Hazards; 2007; 43,
171. Pham, BT; Bui, DT; Dholakia, MB; Prakash, I; Mehmood, K. A novel ensemble classifier of rotation forest and Naïve Bayer for landslide susceptibility assessment at the Luc Yen district, Yen Bai Province (Viet Nam) using GIS. Geomat Nat Hazards Risk; 2017; 8,
172. Pham, BT; Nguyen-Thoi, T; Qi, C; Phong, TV; Dou, J; Ho, LS; Le, HV; Prakash, I. Coupling RBF neural network with ensemble learning techniques for landslide susceptibility mapping. CATENA; 2020; 195, 104805. [DOI: https://dx.doi.org/10.1016/j.catena.2020.104805]
173. Piccialli, V; Sciandrone, M. Nonlinear optimization and support vector machines. 4OR; 2018; 16,
174. Poddar, I; Roy, R. Application of GIS-based data-driven bivariate statistical models for landslide prediction: a case study of highly affected landslide-prone areas of Teesta River Basin. Quat Sci Adv; 2024; 13, 100150. [DOI: https://dx.doi.org/10.1016/j.qsa.2023.100150]
175. Poddar I, Alam J, Basak A, Mitra R, Das J. Application of a geospatial-based subjective MCDM method for flood susceptibility modeling in Teesta River Basin, West Bengal, India. In Monitoring and Managing Multi-hazards, GIScience and Geo-environmental Modelling (pp. 135–152). Springer International Publishing. 2023. https://doi.org/10.1007/978-3-031-15167-5_11.
176. Poddar I, Basak A, Alam J, Das J, Alam A. Application of RS-GIS-based multi-criteria decision-making model (MCDM) on site suitability analysis for potato cultivation in Jalpaiguri District, West Bengal, India. In Advancement of GI-Science and Sustainable Agriculture, GIScience and Geo-environmental Modelling (pp. 81–98). Springer Nature Switzerland. 2023. https://doi.org/10.1007/978-3-031-25404-0_6.
177. Pokharel, B; Thapa, PB. Landslide susceptibility in Rasuwa District of Central Nepal after the 2015 Gorkha earthquake. J Nepal Geol Soc; 2019; 59, pp. 79-88. [DOI: https://dx.doi.org/10.3126/jngs.v59i0.24992]
178. Pokhrel P. A LightGBM based forecasting of dominant wave periods in oceanic waters. 2021. arXiv:2105.08721 https://doi.org/10.48550/arXiv.2105.08721.
179. Pourghasemi, HR; Kerle, N. Random forests and evidential belief function-based landslide susceptibility assessment in Western Mazandaran Province, Iran. Environ Earth Sci; 2016; 75,
180. Prashad Bhatt, B; Awasthi, KD; Heyojoo, BP; Silwal, T; Kafle, G. Using geographic information system and analytical hierarchy process in landslide hazard zonation. Appl Ecol Environ Sci; 2013; 1,
181. Punn, NS; Dewangan, DK. Ensemble meta-learning using SVM for improving cardiovascular disease risk prediction. Int J Cardiovasc Res; 2024; 9,
182. Putra, AN; Jaenudin; Prasetya, NR; Sugiarto, MT; Sudarto, S; Prayogo, C; Maritimo, F; Admajaya, FT. Utilizing remote sensing and random forests to identify optimal land use scenarios and address the increase in landslide susceptibility. Sustainability; 2025; 17,
183. Qi, T; Zhao, Y; Meng, X; Chen, G; Dijkstra, T. AI-based susceptibility analysis of shallow landslides induced by heavy rainfall in Tianshui, China. Remote Sens; 2021; 13,
184. Qin, C-Z; Zhu, A-X; Pei, T; Li, B-L; Scholten, T; Behrens, T; Zhou, C-H. An approach to computing topographic wetness index based on maximum downslope gradient. Precis Agric; 2011; 12,
185. Ragam, P; Kumar, N; Ajith, J; Karthik, G; Himanshu, VK; Machupalli, DS; Murlidhar, BR. Estimation of slope stability using ensemble-based hybrid machine learning approaches. Front Mater; 2024; 11, 1330609. [DOI: https://dx.doi.org/10.3389/fmats.2024.1330609]
186. Ramli, MF; Yusof, N; Yusoff, MK; Juahir, H; Shafri, HZM. Lineament mapping and its application in landslide hazard assessment: a review. Bull Eng Geol Environ; 2010; 69,
187. Rana, K; Ozturk, U; Malik, N. Landslide geometry reveals its trigger. Geophys Res Lett; 2021; 48,
188. Randell, H; Jiang, C; Liang, X-Z; Murtugudde, R; Sapkota, A. Food insecurity and compound environmental shocks in Nepal: implications for a changing climate. World Dev; 2021; 145, 105511. [DOI: https://dx.doi.org/10.1016/j.worlddev.2021.105511]
189. Rane, NL; Achari, A; Saha, A; Poddar, I; Rane, J; Pande, CB; Roy, R. An integrated GIS, MIF, and TOPSIS approach for appraising electric vehicle charging station suitability zones in Mumbai, India. Sustain Cities Soc; 2023; 97, 104717. [DOI: https://dx.doi.org/10.1016/j.scs.2023.104717]
190. Rawat, MS. Statistical analysis of landslide in South District, Sikkim, India: using remote sensing and GIS. IOSR J Environ Sci Toxicol Food Technol; 2012; 2,
191. Regmi, AD; Yoshida, K; Dhital, MR; Devkota, K. Effect of rock weathering, clay mineralogy, and geological structures in the formation of large landslides, a case study from Dumre Besei landslide, Lesser Himalaya Nepal. Landslides; 2013; 10,
192. Remelli, S; Petrella, E; Chelli, A; Conti, FD; Lozano Fondón, C; Celico, F; Francese, R; Menta, C. Hydrodynamic and soil biodiversity characterization in an active landslide. Water; 2019; 11,
193. Rifai, M; Harintaka. Analysis of water quality dynamics of Sentarum Lake, Indonesia, with water index application and water parameter algorithm methods using Google Earth Engine. IOP Conf Ser Earth Environ Sci; 2025; 1443,
194. Roccati, A; Paliaga, G; Luino, F; Faccini, F; Turconi, L. GIS-based landslide susceptibility mapping for land use planning and risk assessment. Land; 2021; 10,
195. Roy, J; Saha, S; Arabameri, A; Blaschke, T; Bui, DT. A novel ensemble approach for landslide susceptibility mapping (LSM) in Darjeeling and Kalimpong Districts, West Bengal, India. Remote Sens; 2019; 11,
196. Różycka, M; Migoń, P; Michniewicz, A. Topographic wetness index and terrain ruggedness index in geomorphic characterisation of landslide terrains, on examples from the Sudetes, SW Poland. Z Geomorphol; 2017; 61,
197. Ruhai L, Kong X, Wang X. An ensemble SVM using entropy-based attribute selection. In 2010 Chinese Control and Decision Conference (pp. 802–805). IEEE. 2010. https://doi.org/10.1109/CCDC.2010.5495049.
198. Saeed, MJ et al. A comparative analysis of machine learning techniques for early diagnosis of diabetes using clinical data. Front Endocrinol; 2021; 12, 628905. [DOI: https://dx.doi.org/10.3389/fendo.2021.628905]
199. Saeed, U; Lee, Y-D; Jan, SU; Koo, I. CAFD: context-aware fault diagnostic scheme towards sensor faults utilizing machine learning. Sensors; 2021; 21,
200. Sagi, O; Rokach, L. Ensemble learning: a survey. WIREs Data Min Knowl Discov; 2018; 8,
201. Saha, A; Saha, S. Comparing the efficiency of weight of evidence, support vector machine and their ensemble approaches in landslide susceptibility modelling: a study on Kurseong region of Darjeeling Himalaya, India. Remote Sens Appl Soc Environ; 2020; 19, 100323. [DOI: https://dx.doi.org/10.1016/j.rsase.2020.100323]
202. Saha, S; Roy, J; Pradhan, B; Hembram, TK. Hybrid ensemble machine learning approaches for landslide susceptibility mapping using different sampling ratios at East Sikkim Himalayan, India. Adv Space Res; 2021; 68,
203. Saied, M; Guirguis, SK. Explainable artificial intelligence for botnet detection in the Internet of Things. Sci Rep; 2025; 15,
204. Sainani, KL. Logistic regression. PM R; 2014; 6,
205. Sajinkumar, KS; Anbazhagan, S; Pradeepkumar, AP; Rani, VR. Weathering and landslide occurrences in parts of Western Ghats, Kerala. J Geol Soc India; 2011; 78,
206. Sari-Ahmed, B; Benzaamia, A; Ghrici, M; Baig Moghal, AA. Strength prediction of fiber-reinforced clay soils stabilized with lime using XGBoost machine learning. Civ Environ Eng Rep; 2024; 34,
207. Sarmah, U; Borah, P; Bhattacharyya, DK. Ensemble learning methods: an empirical study. SN Comput Sci; 2024; [DOI: https://dx.doi.org/10.1007/s42979-024-03252-y]
208. Sato, T; Shuin, Y. Relationship between landslides and long-term rainfall trends. Arab J Geosci; 2022; 15,
209. Sedgwick, P. Pearson’s correlation coefficient. BMJ; 2012; 345,
210. Sekarlangit, N; Tewal, STR; Sinaga, EK. Landslide susceptibility mapping of Menoreh Mountain using logistic regression. J Appl Geol; 2022; 7,
211. Setyawan, A; Alina, A; Suprapto, D; Gernowo, R; Suseno, JE; Hadiyanto, H. Analysis slope stability based on physical properties in Cepoko Village, Indonesia. Cogent Eng; 2021; 8,
212. Shafique, M et al. Prediction of cardiovascular disease using ensemble learning techniques. IEEE Access; 2019; 7, pp. 159986-160000. [DOI: https://dx.doi.org/10.1109/ACCESS.2019.2950529]
213. Shafique, R; Mehmood, A; Ullah, S; Choi, GS. Cardiovascular disease prediction system using extra trees classifier. Environ Prog Sustain Energy; 2019; [DOI: https://dx.doi.org/10.1002/ep.13128]
214. Shahi, YB; Kadel, S; Dangi, H; Adhikari, G; Kc, D; Paudyal, KR. Geological exploration, landslide characterization and susceptibility mapping at the boundary between two crystalline bodies in Jajarkot, Nepal. Geotechnics; 2022; 2,
215. Shankar, R; Satyam, GP; Singh, PK; Paswan, RK. Impact of geomorphometric parameters on landslide occurrence and distribution in Yamuna River Basin, North-Western Himalaya, India. Landslides; 2021; 18,
216. Shao, W; Nie, W; Ni, J. Research on landslide hydrology and hydrogeological disaster monitoring. Water; 2023; 15,
217. Shi, Z; Day, SM. Rupture dynamics and ground motion from 3‐D rough‐fault simulations. J Geophys Res Solid Earth; 2013; 118,
218. Shim, J-H; Lee, Y-K. Generalized partially linear additive models for credit scoring. Korean J Appl Stat; 2011; 24,
219. Shirzadi, A; Bui, DT; Pham, BT; Solaimani, K; Chapi, K; Kavian, A; Shahabi, H; Revhaug, I. Shallow landslide susceptibility assessment using a novel hybrid intelligence approach. Environ Earth Sci; 2017; 76,
220. Shrestha, M; Sharma, S; Shrestha, RP. Landslides in the Himalayas: a comprehensive review of hazards, impacts, and adaptive strategies. Rural Regional Dev; 2025; 3,
221. Shrestha, S; Kang, T; Suwal, MK. An ensemble model for co-seismic landslide susceptibility using GIS and random forest method. ISPRS Int J Geo-Inf; 2017; 6,
222. Shrestha, S; Kang, T-S; Suwal, MK. An ensemble model for co-seismic landslide susceptibility using GIS and random forest method. ISPRS Int J Geo-Inf; 2017; 6,
223. Sinčić, M; Gazibara, SB; Krkač, M; Arbanas, SM. Landslide susceptibility assessment of the city of Karlovac using the bivariate statistical analysis. Rudarsko-Geološko-Naftni Zbornik; 2022; 37,
224. Singh, A; Dhiman, N; Niraj, KC; Shukla, DP. Ensembled transfer learning approach for error reduction in landslide susceptibility mapping of the data scare region. Sci Rep; 2024; [DOI: https://dx.doi.org/10.1038/s41598-024-76541-4]
225. Singh, O; Kumar, M. Flood occurrences, damages, and management challenges in India: a geographical perspective. Arab J Geosci; 2017; 10,
226. Skilodimou, HD; Bathrellos, GD; Koskeridou, E; Soukis, K; Rozos, D. Physical and anthropogenic factors related to landslide activity in the northern Peloponnese, Greece. Land; 2018; 7,
227. Srivastava, V. Sinking Himalayan town leaves thousands of homes at risk. Nat India; 2023; [DOI: https://dx.doi.org/10.1038/d44151-023-00002-6]
228. Sucahyo, CB; Rizqini, FQ; Naufal, A; Yandratama, H; Ash Shiddiqy, J; Utama, ABP; Fanany Putri, NS; Wibawa, AP. Performance analysis of random forest on quartile classification journal. Appl Eng Technol; 2024; 3,
229. Suhairiani, S; Panjaitan, NH; Sinaga, EK. Testing the difference value of compressive strength for disturbed and undisturbed soil in Sibolga Hill landslide. J Phys: Conf Ser; 2024; 2908,
230. Sujatha, ER; Sridhar, V. Landslide susceptibility analysis: a logistic regression model case study in Coonoor, India. Hydrology; 2021; 8,
231. Sukristiyanti; Wikantika, K; Sadisun, IA; Yayusman, LF; Tohari, A; Zaenal Putra, MH. Evaluation of parameter selection in the bivariate statistical-based landslide susceptibility modeling (Case study: the Citarik Sub-Watershed, Indonesia). Int J Adv Sci Eng Inf Technol; 2022; 12,
232. Sulastriningsih, HS; Tewal, STR; Suoth, GFE. Evaluation of landslide based settlement distribution in Manado City. IOP Conf Ser Mater Sci Eng; 2021; 1125,
233. Sur, U; Singh, P; Meena, SR; Singh, TN. Predicting landslides susceptible zones in the Lesser Himalayas by ensemble of per pixel and object-based models. Remote Sens; 2022; 14,
234. Suwarno, M; Mujiarto. Analysis of static morphostructure conditions with dynamic morphostructure (landslide type). Geographia Technica; 2020; 15,
235. Talebi, A; Troch, PA; Uijlenhoet, R. A steady-state analytical slope stability model for complex hillslopes. Hydrol Process; 2008; 22,
236. Talebi, A; Uijlenhoet, R; Troch, PA. Soil moisture storage and hillslope stability. Nat Hazards Earth Syst Sci; 2007; 7,
237. Talebi, A; Uijlenhoet, R; Troch, PA. A low-dimensional physically based model of hydrologic control of shallow landsliding on complex hillslopes. Earth Surf Process Landforms; 2008; 33,
238. Tang, Q; Chen, Y; Yang, H; Liu, M; Xiao, H; Wu, Z; Chen, H; Naqvi, SR. Prediction of bio-oil yield and hydrogen contents based on machine learning method: effect of biomass compositions and pyrolysis conditions. Energy Fuels; 2020; 34,
239. Tarek, S; Noaman, HM; Kayed, M. Enhancing question pairs identification with ensemble learning: integrating machine learning and deep learning models. Int J Adv Comput Sci Appl; 2023; [DOI: https://dx.doi.org/10.14569/ijacsa.2023.01411100]
240. Thapa, PB; Lamichhane, S; Joshi, KP; Raj Regmi, A; Bhattarai, D; Adhikari, H. Landslide susceptibility assessment in Nepal’s Chure region: a geospatial analysis. Land; 2023; 12,
241. Tian, Y; Shi, Y; Liu, X. Recent advances on support vector machines research. Technol Econ Dev Econ; 2012; 18,
242. Tiwari, RN; Kushwaha, VK. Watershed prioritization based on morphometric parameters and PCA technique: a case study of Deonar River Sub Basin, Sidhi area, Madhya Pradesh, India. J Geol Soc India; 2021; 97,
243. Tomita, K; Yamasaki, A; Katou, R; Ikeuchi, T; Touge, H; Sano, H; Tohda, Y. Construction of a diagnostic algorithm for diagnosis of adult asthma using machine learning with random forest and XGBoost. Diagnostics; 2023; 13,
244. Uluç Keçik, A; Çiftçi, C; Gülcen Eren, Ş; Tepecik Diş, A; Rizzo, A. Determination and evaluation of landslide-prone regions of Isparta (Turkey): an urban planning view. Sustainability; 2023; 15,
245. Uwihirwe, J; Hrachowitz, M; Bogaard, TA. Landslide precipitation thresholds in Rwanda. Landslides; 2020; 17,
246. Vallet, A; Bertrand, C; Fabbri, O; Mudry, J. A new method to compute the groundwater recharge for the study of rainfall-triggered deep-seated landslides. Application to the Séchilienne unstable slope (Western Alps). Nat Hazards Earth Syst Sci; 2014; 14,
247. Vanschoren J. Meta-learning. In Automated Machine Learning, The Springer Series on Challenges in Machine Learning (pp. 35–61). Springer International Publishing. 2019. https://doi.org/10.1007/978-3-030-17628-6_2.
248. Vassallo, D; Vella, V; Ellul, J. Application of gradient boosting algorithms for anti-money laundering in cryptocurrencies. SN Comput Sci; 2021; 2,
249. Viet LD, Chi CN, Tien CN, Quoc DN. The effect of the normalized difference vegetation index to landslide susceptibility using optical imagery Sentinel 2 and Landsat 8. 4th Asia Pacific Meeting on Near Surface Geoscience & Engineering (pp. 1–5). European Association of Geoscientists & Engineers. 2021.
250. Wang, G; Lei, X; Chen, W; Shahabi, H; Shirzadi, A. Hybrid computational intelligence methods for landslide susceptibility mapping. Symmetry; 2020; 12,
251. Wang, J; Wang, Z; Sun, G; Luo, H. Analysis of three-dimensional slope stability combined with rainfall and earthquake. Nat Hazards Earth Syst Sci; 2024; 24,
252. Wang, Q; Li, W; Wu, Y; Pei, Y; Xing, M; Yang, D. A comparative study on the landslide susceptibility mapping using evidential belief function and weights of evidence models. J Earth Syst Sci; 2016; 125,
253. Wang, Y-H; Wang, L-Q; Liu, S-L; Sun, W-X; Liu, P-F; Zhu, L; Yang, W-Y; Guo, T. Machine learning solution for regional landslide susceptibility based on fault zone division strategy. J Mt Sci; 2024; 21,
254. Wang, Y-H; Wang, L-Q; Zhang, W-G; Liu, S-L; Sun, W-X; Li, P-F; Zhu, L; Yang, W-Y. A physics-informed machine learning solution for landslide susceptibility mapping based on three-dimensional slope stability evaluation. J Central South Univ. Sci Technol; 2024; 55, pp. 5335-5361. [DOI: https://dx.doi.org/10.1007/s11771-024-5687-3]
255. Wang, Z; Shao, Y; Bai, L; Li, C; Liu, L; Deng, N. Insensitive stochastic gradient twin support vector machines for large-scale problems. Inf Sci; 2018; 462, pp. 114-131. [DOI: https://dx.doi.org/10.1016/j.ins.2018.06.007]
256. Wei, X; Zhang, L; Gardoni, P; Chen, Y; Tan, L; Liu, D; Du, C; Li, H. Comparison of hybrid data-driven and physical models for landslide susceptibility mapping at regional scales. Acta Geotech; 2023; 19,
257. Wubalem A. Landslide inventory, susceptibility, hazard and risk mapping. In Y. Zhang & Q. Cheng (Eds.), Landslides (pp. 1–25). IntechOpen. 2022. https://doi.org/10.5772/intechopen.96816.
258. Yang, B; Yin, K; Lacasse, S; Liu, Z. Time series analysis and long short-term memory neural network to predict landslide displacement. Landslides; 2019; 16,
259. Yang W, Yu P, Cai X, Zhong X, He Y. High wind speed retrieval from SAR images using random forest algorithm. International Conference on Remote Sensing, Mapping, and Geographic Systems (RSMG 2023) (p. 43). SPIE. 2023. https://doi.org/10.1117/12.2641602.
260. Yarveicy, H et al. Modeling the efficiency of carbon dioxide absorption using an extra-tree regression algorithm. J Environ Manage; 2019; 240, pp. 308-316. [DOI: https://dx.doi.org/10.1016/j.jenvman.2019.03.116]
261. Yarveicy, H; Saghafi, H; Ghiasi, MM; Mohammadi, AH. Decision tree-based modeling of CO2 equilibrium absorption in different aqueous solutions of absorbents. Environ Prog Sustain Energy; 2019; [DOI: https://dx.doi.org/10.1002/ep.13128]
262. Yilmaz, OS. Automatic detection of water surfaces using K-means++ clustering algorithm with Landsat-9 and Sentinel-2 images on the Google Earth Engine platform. Bilge Int J Sci Technol Res; 2023; 7,
263. Youssef, K; Shao, K; Moon, S; Bouchard, L. Landslide susceptibility modeling by interpretable neural network. Commun Earth Environ; 2023; [DOI: https://dx.doi.org/10.1038/s43247-023-00806-5]
264. Yu, H; Pei, W; Zhang, J; Chen, G. Landslide susceptibility mapping and driving mechanisms in a vulnerable region based on multiple machine learning models. Remote Sens; 2023; 15,
265. Zafari, A; Zurita-Milla, R; Izquierdo-Verdiguier, E. A comparison of machine learning classifiers for object-based land cover mapping: Random Forest and Extra-Trees. Remote Sens; 2020; 12,
266. Zafari, A; Zurita-Milla, R; Izquierdo-Verdiguier, E. Land cover classification using extremely randomized trees: a kernel perspective. IEEE Geosci Remote Sens Lett; 2020; 17,
267. Zahratunnisa, A; Saepuloh, A; Kriswati, E; Basuki, A. Application of modified segment tracing algorithm (mSTA) method to identify landslide susceptibility zones around Mt. Sinabung, Indonesia. IOP Conf Ser Earth Environ Sci; 2023; 1245,
268. Zhang, F; Bai, J; Li, X; Pei, C; Havyarimana, V. An ensemble cascading extremely randomized trees framework for short-term traffic flow prediction. KSII Trans Internet Inf Syst; 2019; [DOI: https://dx.doi.org/10.3837/tiis.2019.04.013]
269. Zhang, L et al. Predicting the stock market using wavelet transforms and extra trees. Expert Syst Appl; 2019; 118, pp. 454-467. [DOI: https://dx.doi.org/10.1016/j.eswa.2018.10.027]
270. Zhang, L et al. Geospatial machine learning for landslide hazard prediction in mountain regions. Environ Model Softw; 2020; 132, 104763. [DOI: https://dx.doi.org/10.1016/j.envsoft.2020.104763]
271. Zhang, W; Li, H; Liu, H; Chen, Y; Li, Y; Ding, X. Application of deep learning algorithms in geotechnical engineering: a short critical review. Artif Intell Rev; 2021; 56,
272. Zhang, Z; Yang, F; Chen, H; Wu, Y; Li, T; Li, W; Wang, Q; Liu, P. GIS-based landslide susceptibility analysis using frequency ratio and evidential belief function models. Environ Earth Sci; 2016; 75,
273. Zhou, W; Zhou, Y; Liang, S; Zhang, C; Dai, HL; Sun, X. A new framework for landslide susceptibility mapping in contiguous impoverished areas using machine learning and catastrophe theory. Sci Rep; 2025; 15,
274. Zhou, X; Wen, H; Zhang, Y; Xu, J; Zhang, W. Landslide susceptibility mapping using hybrid random forest with GeoDetector and RFE for factor optimization. Geosci Front; 2021; 12,
275. Zulkafli, SA; Abd Majid, N. Urban resilience in the face of natural hazards: leveraging machine learning to assess landslide risk in Kuala Lumpur, Malaysia. Int J Acad Res Bus Soc Sci; 2024; 14,
276. Zulkafli, SA; Abd Majid, N; Syed Zakaria, SZ; Razman, MR; Ahmed, MF. Influencing physical characteristics of landslides in Kuala Lumpur, Malaysia. Pertanika J Sci Technol; 2023; 31,
277. Barman, J; Poddar, I; Sarkar, A. Forest fire susceptibility zonation of Mizoram using GIS: A comparative analysis of bivariate models. Results Earth Science; 2025; 3, 100126. [DOI: https://dx.doi.org/10.1016/j.rines.2025.100126]
© The Author(s) 2026. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/.