Abstract

Storm surges present a major hazard to coastal areas worldwide, a risk that is further amplified by ongoing sea-level rise associated with climate warming. The purpose of this study is to enhance the prediction performance of a storm surge height model by incorporating data resampling techniques into a multiple linear regression framework. Typhoon-related predictors, such as location and intensity-related parameters, were used to estimate observed storm surge heights at eleven tide gauge stations in southeastern Korea. To address the data imbalance inherent in storm surge height distributions, we applied combinations of over- and under-sampling methods across various threshold levels and evaluated them using four statistical metrics: root mean square error (RMSE), mean absolute error (MAE), mean squared error (MSE), and the coefficient of determination (R2). The results demonstrate that both threshold selection and sampling configuration significantly influence model accuracy. In particular, station-specific sampling strategies improved R2 values by up to 0.46, even without modifying the regression model itself, underscoring the effectiveness of data-level balancing. These findings highlight that adaptive resampling strategies—tailored to local surge characteristics and data distribution—can serve as a powerful tool for improving regression-based coastal hazard prediction models.

1. Introduction

From a coastal engineering perspective, sea level refers to the height of the sea surface relative to a defined vertical datum (e.g., mean sea level or local tidal datum) [1]. An increase in sea level can cause coastal inundation hazards in nearshore areas [2]. Sea level can be classified into components such as astronomical tide, meteorological tide (commonly referred to as storm surge), mean sea level, and other residual factors [1,3,4,5,6]. Among these components, the astronomical tide and mean sea level can be calculated with high accuracy, while other residual factors only have a minor influence on the overall sea level height. In contrast, storm surge exhibits large variability, can significantly elevate the sea surface, and remains difficult to predict. Because the predictive performance of storm surge strongly affects the overall accuracy of sea-level prediction, developing a highly reliable storm surge prediction model is essential for mitigating coastal inundation hazards.

Storm surge refers to an abnormal increase in sea level primarily caused by meteorological conditions, particularly strong winds and low atmospheric pressure [1]. It is often associated with tropical cyclones, as these systems simultaneously produce intense winds and significant pressure drops [7,8,9]. Globally, coastal inundation caused by storm surges associated with typhoons has resulted in severe damage to coastal regions. In particular, the Korean coastline is highly vulnerable to storm surges generated by typhoons, which have repeatedly caused severe loss of life and property. Major events since 2000 clearly illustrate the magnitude of this threat. For instance, Typhoon Rusa (2002) produced up to 1.5 m of surge along the southern coast, resulting in over 200 fatalities and approximately KRW 5 trillion in damages [10]. Typhoon Maemi (2003) generated surges of about 2 m in Masan and Busan, inundating urban areas and crippling port facilities, with more than 130 people reported dead or missing [11]. Typhoon Bolaven (2012) caused surges of about 1.5 m along the west coast, accompanied by widespread blackouts and structural damage [12]. Typhoon Chaba (2016) led to severe urban flooding in Ulsan and Busan as storm surges coincided with river backflow [13,14]. Most recently, Typhoon Hinnamnor (2022) triggered record-breaking storm surges in Pohang, flooding thousands of buildings and inflicting economic losses on the order of several trillion KRW [15]. These recurring surge-related disasters along the entire Korean coastline demonstrate the high vulnerability of coastal regions. Therefore, it is necessary to develop a storm surge prediction model that accounts for typhoon characteristics.

Traditionally, numerical models have been used for the precise prediction of storm surge heights (hereafter, SSHs). However, recent studies [16,17,18,19,20,21] have consistently demonstrated that data-driven statistical models utilizing regression techniques are more suitable and effective for storm surge forecasting. Tadesse et al. (2020) [16] clearly showed the superior predictive performance of regression models on a global scale. Through extensive validation using 882 tide gauge stations worldwide, their regression model achieved excellent performance in mid-latitude regions, with a correlation coefficient of 0.79 and an RMSE of 75 cm. This result highlights the intrinsic strength of regression techniques in accurately predicting continuous surge heights. Even more noteworthy is their predictive capability for extreme events. In the same study, the data-driven regression model achieved correlation coefficients of 0.51 in mid-latitudes and 0.29 in tropical regions for extreme storm surge events exceeding the 95th percentile, substantially outperforming the existing Global Tide and Surge Reanalysis (GTSR) numerical model, which yielded 0.44 and 0.20, respectively. In a study conducted in Queensland, Australia, Matters (2019) [22] demonstrated that the random forest regression model achieved a correlation coefficient of 0.991 and was successfully implemented in an operational forecasting system with a 72 h prediction capability. Similarly, Tian et al. (2024) [17] highlighted that machine learning regression models offer considerable advantages over traditional numerical models in terms of flexibility, accuracy, and computational efficiency, while high-resolution numerical models require substantial computational resources and long simulation times. Regression models enable rapid predictions after a single training phase, as evidenced by their successful applications in operational systems. Moreover, Tian et al. (2024) [17] noted that these models can capture nonlinear interactions between typhoon characteristics and storm surge magnitudes, thereby overcoming oversimplifications of traditional linear methods. Furthermore, regression techniques should not be regarded as mere “black-box” models. Prior studies [16,17] demonstrated that regression approaches can provide predictor importance with physical interpretability. For example, sea-level pressure was identified as the most influential predictor at 65% of the stations, followed by meridional wind speed (12%) and sea surface temperature (10%), findings consistent with physical reasoning. Lastly, regression-based models demonstrated their applicability across a wide range of spatial scales, from regional to global. For instance, Ayyad et al. (2022) [18] focused on the New York metropolitan area, Sadler et al. (2018) [19] on Norfolk, Virginia, and Sun and Pan (2023) [20] on the coastal region of Hong Kong—all supporting the versatility of regression-based approaches for storm surge prediction.

Building on these advancements, Yang and Lee (2025) [21] developed a multiple linear regression (MLR)-based predictive model focusing on the southeastern coastline of Korea, employing typhoon characteristics such as location and intensity as predictors, and observed SSHs as the predictand. They enhanced model performance through threshold-based data classification according to both the distance between typhoon centers and observation sites and the magnitude of surge height, demonstrating substantial predictive skill for extreme events at Masan (SSH > 0.2 m, R2 = 0.82) and Gwangyang (within 500 km distance, R2 = 0.57). However, they also reported limitations in regional predictive performance, suggesting the need for future studies to explore different model-fitting approaches and data processing methods (e.g., data sampling techniques).

To improve the performance of a statistical model or to expand its applicability, three main approaches can be considered. The first approach involves modifying the input variables, which includes adding new predictors that may influence the dependent variable, assigning weights to input variables according to their relative importance on the dependent variable, or removing variables with low explanatory power. The second approach focuses on modifying the data, such as changing the data sources of the input or dependent variables or applying data preprocessing techniques, including data sampling methods. The third approach involves changing the model structure, which refers to adopting different statistical techniques for model development, such as linear regression, nonlinear regression, or deep learning-based methods. Each of these approaches has its own advantages and limitations. Since linear regression is the simplest modeling approach and also serves as the fundamental building block of deep learning models, this study focused not on improving model performance through advanced modeling techniques but rather on evaluating the potential for performance enhancement through data preprocessing, particularly the application of data sampling techniques.

Building upon our previous work (Yang and Lee (2025) [21]), the present study addresses a critical yet underexplored issue in data-driven hydrological modeling—specifically, the influence of data sampling strategies on model performance. In particular, we investigate how different combinations of over-sampling and under-sampling techniques affect the predictive accuracy of MLR-based SSH prediction models. The overall methodology of this study is depicted in Figure 1. We first identify an appropriate SSH threshold by evaluating regression accuracy under varying threshold values, and then we assess the performance of nine sampling combinations under the selected condition. These combinations are drawn from eight sampling techniques, classified into three over-sampling methods—(1) RandomOverSampler (ROS), (2) SMOTE (Synthetic Minority Over-sampling Technique), and (3) BorderlineSMOTE (Border)—and five under-sampling methods—(1) RandomUnderSampler (RUS), (2) NearMiss, (3) TomekLinks (Tomek), (4) EditedNearestNeighbours (ENN), and (5) ClusterCentroids (Centroids). This study highlights the central role of data-level balancing in advancing regression-based approaches for coastal hazard prediction.
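As a concrete illustration, the baseline Random–Random combination can be sketched in a few lines of NumPy. The function `balance_by_threshold` and the toy data below are illustrative only; the library implementations used in practice (e.g., RandomOverSampler and RandomUnderSampler from imbalanced-learn) follow the same idea with additional options:

```python
import numpy as np

rng = np.random.default_rng(42)

def balance_by_threshold(X, y, threshold, rng):
    """Random over-/under-sampling around an SSH threshold (Random-Random baseline).

    Rows with y > threshold (minority, high-SSH) are over-sampled with
    replacement; rows with y <= threshold (majority, low-SSH) are randomly
    under-sampled, so both groups end up the same size.
    """
    hi = np.flatnonzero(y > threshold)
    lo = np.flatnonzero(y <= threshold)
    target = (hi.size + lo.size) // 2                   # common size for both groups
    hi_idx = rng.choice(hi, size=target, replace=True)  # over-sample minority
    lo_idx = rng.choice(lo, size=target, replace=False) # under-sample majority
    idx = np.concatenate([hi_idx, lo_idx])
    return X[idx], y[idx]

# toy data: 500 samples, 3 typhoon predictors, skewed (mostly low) SSH values
X = rng.normal(size=(500, 3))
y = np.abs(rng.normal(0.15, 0.15, size=500))
Xb, yb = balance_by_threshold(X, y, threshold=0.35, rng=rng)
```

After balancing, exactly half of the training rows lie above the threshold, which is what removes the bias toward low-SSH conditions in the subsequent regression fit.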

2. Data and Methodology

The research area, data, and MLR approach employed in this study are identical to those described by Yang and Lee (2025) [21] so that the effect of data sampling on model performance can be properly evaluated. As detailed explanations are provided in that study, only a brief description is presented here.

2.1. Research Area

The area of interest in this study covers the southeastern coast of the Korean Peninsula (Figure 2a), a region frequently affected by typhoon-induced storm surges and characterized by complex coastal topography with relatively limited tidal and wave effects. Eleven tide-gauge stations—Geomundo, Goheung, Yeosu, Gwangyang, Tongyeong, Masan, Geojedo, Gadeokdo, Busan, Ulsan, and Pohang—were chosen as target sites for improving the model (Figure 2b and Table 1).

2.2. Data

2.2.1. Predictors

The IBTrACS dataset [23], developed by NOAA/NCEI to unify regional datasets from RSMCs and TCWCs, was used to obtain typhoon track information. A total of 155 typhoons that traversed the geographical region bounded by latitudes 32° N to 40° N and longitudes 122° E to 132° E during the period 1979 to 2020 were defined as affecting the Korean Peninsula (the blue solid rectangle in Figure 3 and Table A1). Independent variables for the multiple linear regression model for predicting SSH were selected from the “TOKYO” dataset (typhoon records for the Northwest Pacific provided by the Japan Meteorological Agency) within IBTrACS. It includes typhoon location information represented by latitude and longitude, as well as typhoon intensity expressed by wind speed and central pressure. From these data, the authors derived additional variables—such as typhoon translation speed, the distance between the typhoon center and the target site, and the approach angle of the typhoon relative to the target site—which were used as input variables for the model.

2.2.2. Predictand

The observed SSHs used in this study were calculated from eleven tidal observation stations on the southeastern shoreline of the Korean Peninsula, maintained by the Korea Hydrographic and Oceanographic Agency [24]. The data have an hourly temporal resolution and have undergone quality control procedures, including gap filling and outlier removal. SSHs were calculated from the raw water level records at each station by subtracting the astronomical tidal constituents and the yearly mean sea level. The astronomical tidal components were calculated using the default settings of the T_tide toolbox [25] in MATLAB R2025a.
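The surge-extraction step can be illustrated with a deliberately simplified harmonic fit: a single M2 constituent is estimated by least squares and subtracted, together with the mean level, from a synthetic hourly record. This is only a conceptual sketch; the actual analysis used T_tide, which fits a full set of constituents with nodal corrections:

```python
import numpy as np

# Toy illustration of surge extraction: fit one tidal constituent (M2,
# period ~12.42 h) to an hourly water-level record by least squares, then
# subtract the fitted tide and the mean level. All values are synthetic.
rng = np.random.default_rng(0)
t = np.arange(24 * 30, dtype=float)                # 30 days of hourly samples [h]
omega = 2 * np.pi / 12.42                          # M2 angular frequency [rad/h]
tide_true = 0.8 * np.cos(omega * t - 1.0)
surge_true = 0.3 * np.exp(-((t - 360) / 24) ** 2)  # synthetic surge event at t=360 h
level = 1.5 + tide_true + surge_true + 0.02 * rng.normal(size=t.size)

# design matrix: mean level + in-phase/quadrature M2 terms
A = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
coef, *_ = np.linalg.lstsq(A, level, rcond=None)
residual = level - A @ coef                        # estimated storm surge height
```

The residual recovers the surge pulse near t = 360 h; with real data, unresolved constituents and meteorological noise make this separation less clean.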

2.3. Multiple Linear Regression Technique

This study employed a multiple linear regression framework (Equation (1)) to model SSHs using typhoon characteristics as predictors. To ensure statistical validity, the dataset was subjected to preprocessing, in which missing values and negative SSHs were excluded, as the latter correspond to “negative” or “reverse” surges, which are phenomena generated by mechanisms opposite to those of typical storm surges [26,27,28]:

(1) $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$

where $Y$ is the predictand, $X_i$ ($i = 1, \ldots, n$) denotes the predictors, and $\beta_i$ denotes the coefficients derived using the least squares method.
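The least-squares estimation of Equation (1) reduces to a single linear solve once an intercept column is prepended to the predictor matrix. A minimal sketch on synthetic data (the predictor names and coefficient values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
X = rng.normal(size=(n, 3))                      # e.g., distance, pressure, wind terms
beta_true = np.array([0.2, -0.05, 0.12, 0.07])   # [b0, b1, b2, b3], chosen arbitrarily
y = beta_true[0] + X @ beta_true[1:] + 0.01 * rng.normal(size=n)

A = np.column_stack([np.ones(n), X])             # prepend intercept column
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None) # least-squares estimate of Eq. (1)
```

With low noise and 200 samples, `beta_hat` recovers the generating coefficients closely, which is the behavior the MLR model relies on.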

In this study, we adopted the event-based data splitting approach, which demonstrated the best predictive performance in our previous work [21]. This method groups the dataset for each typhoon case and randomly allocates the events into training and testing groups according to a 7:3 ratio.
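The event-based split described above keeps all records of a given typhoon in the same partition. A small sketch (the helper name `event_split` is ours; any grouped splitter, e.g. scikit-learn's GroupShuffleSplit, achieves the same effect):

```python
import numpy as np

def event_split(typhoon_ids, train_frac=0.7, seed=0):
    """Event-based split: whole typhoons go to train or test, never both."""
    rng = np.random.default_rng(seed)
    events = np.unique(typhoon_ids)
    rng.shuffle(events)
    n_train = int(round(train_frac * events.size))
    train_events = set(events[:n_train])
    # boolean row mask: True -> training row, False -> testing row
    return np.array([tid in train_events for tid in typhoon_ids])

ids = np.repeat(np.arange(10), 20)   # toy case: 10 typhoons, 20 hourly records each
train_mask = event_split(ids, 0.7)
```

Splitting by event rather than by row prevents hourly records from the same storm leaking across the train/test boundary, which would otherwise inflate test skill.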

Prior to model development, each explanatory variable was normalized by applying the z-score method to ensure numerical stability and comparability across predictors. Variance inflation factors (VIFs) were calculated for all candidate variables to assess potential multicollinearity. Only those with VIF values less than 5 were retained to minimize redundancy and enhance model interpretability. In addition, the statistical significance of each predictor was evaluated using two-sided p-values, and only variables with p-values below 0.05 were included in the final model to ensure the robustness of the estimated regression coefficients.
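The z-scoring and VIF screening can be sketched directly in NumPy; VIF for predictor $j$ is $1/(1-R_j^2)$, where $R_j^2$ comes from regressing that predictor on all the others. (The p-value screening would additionally need OLS standard errors, e.g. via statsmodels, and is omitted here; the data below are synthetic.)

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of a (z-scored) design matrix."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(X)), others])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ coef
        r2 = 1.0 - resid.var() / X[:, j].var()
        out.append(1.0 / max(1.0 - r2, 1e-12))   # guard against perfect collinearity
    return np.array(out)

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
x3 = x1 + 0.1 * rng.normal(size=300)        # nearly collinear with x1
X = np.column_stack([x1, x2, x3])
X = (X - X.mean(axis=0)) / X.std(axis=0)    # z-score normalization
keep = vif(X) < 5                           # retain only low-VIF predictors
```

Here the collinear pair (`x1`, `x3`) is flagged with large VIFs and removed by the VIF < 5 rule, while the independent predictor `x2` survives.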

Data preprocessing, model construction, and performance evaluation were all carried out using Python 3.9. The complete source files and associated datasets are provided in the Supplementary Materials to ensure transparency and reproducibility.

2.4. Data Sampling Technique—Over-Sampling

In imbalanced datasets, the minority class is often underrepresented compared to the majority class, leading to biased predictions that favor the majority class [29]. A representative approach to address this issue is over-sampling, which artificially increases the number of minority-class samples to balance the class distribution, either by simple replication or by generating synthetic samples [30]. This method has the advantage of mitigating imbalance without data loss; however, simple replication can increase the risk of overfitting, while synthetic generation may introduce noise or unrealistic samples. In this study, three over-sampling techniques are considered: (1) RandomOverSampler, (2) SMOTE, and (3) BorderlineSMOTE.

RandomOverSampler (ROS) [31] balances the dataset by randomly replicating minority-class samples. It is simple to implement, computationally inexpensive, and preserves the original data distribution with minimal distortion. However, since identical samples are repeatedly used, it can increase the likelihood of overfitting to specific instances.

The Synthetic Minority Over-Sampling Technique (SMOTE) [32,33,34] generates new synthetic samples by interpolating between minority-class samples and their k-nearest neighbors. Unlike simple replication, this method creates new data points, thereby reducing the risk of overfitting and improving classifier generalization around the decision boundary. As one of the most widely used synthetic over-sampling methods, SMOTE has shown significant effectiveness. Nonetheless, synthetic samples may extend beyond the true feature space or introduce noise, and distance-based calculations may become distorted in sparse datasets.

BorderlineSMOTE (Border) [34] is a variant of SMOTE that focuses on minority samples located near the boundary of the majority class (i.e., in “danger” regions). By generating synthetic samples around these boundary instances, it enhances the classifier’s ability to learn decision boundaries, improves prediction performance in regions prone to misclassification, and responds more sensitively to the actual distribution. However, when boundary identification is ambiguous or noise levels are high, it may lead to the excessive generation of misleading samples and increased computational complexity.
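The interpolation idea behind SMOTE can be reproduced in a few lines of NumPy. The function `smote_like` below is an illustrative stand-in for the library implementation (imbalanced-learn's SMOTE), not a replacement for it:

```python
import numpy as np

def smote_like(X_min, n_new, k=5, seed=0):
    """SMOTE-style synthesis: interpolate between a minority sample and one
    of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    # pairwise distances among minority samples; exclude self-matches
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]          # k nearest neighbours per sample
    base = rng.integers(0, len(X_min), n_new)    # random minority sample per new point
    pick = nbrs[base, rng.integers(0, k, n_new)] # random neighbour of that sample
    gap = rng.random((n_new, 1))                 # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[pick] - X_min[base])

X_min = np.random.default_rng(2).normal(size=(20, 3))  # toy minority class
X_syn = smote_like(X_min, n_new=50)
```

Because each synthetic point lies on a segment between two real minority samples, the synthesized set stays within the per-dimension bounds of the original minority data, which is also why SMOTE cannot extrapolate beyond observed extremes.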

2.5. Data Sampling Technique—Under-Sampling

Under-sampling is a technique used to mitigate the problem of class imbalance by removing or reducing majority-class samples to balance them with the minority class. This approach has the advantage of reducing the size of the training dataset, thereby improving computational efficiency and directly addressing imbalance. However, random or excessive removal of samples may result in the loss of important distributional information [35]. In this study, five representative under-sampling methods were applied: (1) RandomUnderSampler, (2) NearMiss, (3) TomekLinks, (4) Edited Nearest Neighbours, and (5) ClusterCentroids.

RandomUnderSampler (RUS) [31] balances the dataset by randomly eliminating the majority-class samples. It is simple, easy to implement, and computationally efficient but may cause the loss of critical information due to random removal.

NearMiss [36] is a distance-based sampling method that selects the majority-class samples closest to the minority class. NearMiss-1 selects the majority samples with the smallest average distance to minority instances, NearMiss-2 selects those with the largest average distance, and NearMiss-3 selects the nearest majority samples for each minority instance. This approach is effective for preserving decision boundaries, although it may increase data sparsity.

TomekLinks (Tomek) [37] identifies pairs of nearest neighbors from different classes (Tomek pairs) and removes the majority-class sample in each pair. This method refines class boundaries and helps eliminate noise and overlapping samples.

Edited Nearest Neighbours (ENN) [38] examines each sample’s k-nearest neighbours and removes majority-class samples that disagree with the majority of their neighbors. This approach effectively reduces noise near decision boundaries, enhances data consistency, and can improve classifier performance.

ClusterCentroids (Centroids) [35] applies K-means clustering to the majority-class samples and replaces them with the cluster centroids. This method preserves representative information rather than removing data at random, thereby maintaining the overall distribution while improving training efficiency.
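Of the five methods, TomekLinks has the simplest geometric definition and is easy to sketch: a Tomek pair is a pair of mutual nearest neighbours from different classes, and the majority member of each pair is removed. The helper below is an illustrative stand-in for imbalanced-learn's TomekLinks, on synthetic two-class data:

```python
import numpy as np

def tomek_majority_indices(X, y, majority=0):
    """Indices of majority-class samples in Tomek pairs (mutual nearest
    neighbours belonging to different classes)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)                     # 1-nearest neighbour of each sample
    drop = [i for i, j in enumerate(nn)
            if nn[j] == i and y[i] != y[j] and y[i] == majority]
    return np.array(drop, dtype=int)

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0.0, 1.0, (40, 2)),   # majority class (label 0)
               rng.normal(1.5, 1.0, (10, 2))])  # minority class (label 1)
y = np.array([0] * 40 + [1] * 10)
drop = tomek_majority_indices(X, y)
X_clean = np.delete(X, drop, axis=0)
y_clean = np.delete(y, drop)
```

Only majority points that sit directly against a minority point are removed, which is why Tomek cleaning refines the class boundary rather than rebalancing the classes outright.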

2.6. Objective Functions

The predictive performance of the MLR models was objectively evaluated using four widely employed statistical metrics: mean absolute error (MAE), mean squared error (MSE), root mean square error (RMSE), and the coefficient of determination (R2).

MAE quantifies the average absolute deviation between prediction and observation, irrespective of error direction. It is given by the following:

(2) $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - S_i\right|$

where $O_i$, $S_i$, and $n$ denote the observed values, the predicted values, and the number of samples, respectively. A smaller MAE reflects improved model performance. Owing to its simplicity, MAE is widely used, as it explicitly represents the average prediction error in the target variable’s original scale.

MSE quantifies the mean of the squared differences between observed and predicted values, and it is expressed as

(3) $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(O_i - S_i\right)^2$

Because residuals are squared, larger errors are penalized more heavily than smaller ones, making MSE particularly affected by outliers. It functions as a conventional regression analysis metric applied to evaluate model calibration and performance differences.

RMSE is expressed as the square root of the MSE:

(4) $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - S_i\right)^2}$

Unlike MSE, RMSE expresses prediction errors in the same units as the observed variable, which helps in understanding and practically assessing model performance. Smaller RMSE values correspond to higher predictive accuracy.

R2 measures the proportion of variance in the observed data that is explained by the model, and it is calculated as

(5) $R^2 = \left[\frac{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(S_i - \bar{S}\right)}{\sqrt{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2 \sum_{i=1}^{n}\left(S_i - \bar{S}\right)^2}}\right]^2$

where $\bar{O}$ refers to the average of the observed values, and $\bar{S}$ refers to that of the predicted values. R2 ranges from 0 to 1, where larger values indicate a better fit of the model. In most applications, R2 values above 0.5 are regarded as acceptable [27,28,39].
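The four metrics of Equations (2)–(5) map directly onto a few NumPy expressions; the toy observation/prediction vectors below are for illustration only:

```python
import numpy as np

def mae(o, s):
    return np.mean(np.abs(o - s))                 # Eq. (2)

def mse(o, s):
    return np.mean((o - s) ** 2)                  # Eq. (3)

def rmse(o, s):
    return np.sqrt(mse(o, s))                     # Eq. (4)

def r2(o, s):
    """Squared Pearson correlation of observations and predictions, Eq. (5)."""
    num = np.sum((o - o.mean()) * (s - s.mean()))
    den = np.sqrt(np.sum((o - o.mean()) ** 2) * np.sum((s - s.mean()) ** 2))
    return (num / den) ** 2

o = np.array([0.10, 0.22, 0.35, 0.48, 0.60])      # toy observed SSHs [m]
s = np.array([0.12, 0.20, 0.33, 0.50, 0.58])      # toy predicted SSHs [m]
```

Note that this R2 definition, as a squared correlation, lies in [0, 1] by construction and equals 1 for a perfect linear relationship, consistent with the range stated above.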

3. Results

3.1. Effect of SSH Threshold Selection on Model Performance Under Over- and Under-Sampling Schemes

Table 2 presents the test R2 values of the models under varying SSH thresholds with under-/over-sampling strategies. For each SSH threshold, under-sampling was applied to the observations within the low-SSH interval (SSH ≤ threshold), and over-sampling was applied to those within the high-SSH interval (SSH > threshold). The “Total” row represents the R2 value derived from regression using the combined dataset from all stations. To isolate the effect of threshold-based sampling alone, all models employed the random sampling methods for both over- and under-sampling (ROS and RUS, respectively).

As shown in Table 2, the model performance generally improved with increasing SSH thresholds. At threshold values of 0.35 and 0.4, the regression models exhibited relatively high predictive skill, achieving R2 values greater than 0.6 at nearly all stations except for Yeosu. Notably, the model achieved its highest overall R2 (0.6255) when the threshold was set to 0.4. However, a threshold of 0.35 was selected as the optimal value for further analysis, considering the balance between performance and data distribution. A higher threshold, such as 0.4, resulted in an insufficient number of high-SSH observations, which increased the risk of overfitting due to excessive over-sampling.

While the proportion of high-SSH samples above the 0.35 m threshold was relatively small (≈0.7% of the total dataset), this level provided the most stable sampling balance for model training. Lower thresholds (e.g., 0.3 m) included too many low-energy observations, weakening the regression’s sensitivity to surge-related variability, whereas higher thresholds (e.g., 0.4 m) further reduced the number of high-SSH samples (≈0.4%) and led to unstable model behavior due to excessive over-sampling. In particular, stations with very limited high-SSH data, such as Goheung (only 7 out of 1090 samples above 0.4 m), exhibited signs of overfitting, resulting in abnormally high R2 values (up to 0.9687). Therefore, the 0.35 m threshold was selected as a compromise that minimized sampling bias and ensured consistent predictive performance across stations.

The boxplots in Figure 4 further clarify the physical reason for the observed trend. At low SSH thresholds (≤0.25 m), the model performance is highly variable, as reflected by the wide interquartile range and occasional low-R2 outliers. This is primarily because these samples correspond to quiescent sea states where meteorological forcing is weak and the relationship between surge height, wind stress, and pressure gradients becomes less linear. As the threshold increases, the spread of R2 values narrows, and the median rises, indicating that the inclusion of more energetic events enhances the statistical representation of physically driven surges. When SSH exceeds approximately 0.3 m, the system response becomes more dominated by wind and pressure forcing, yielding more consistent predictive skill across stations. However, excessively high thresholds (e.g., 0.4 m) reduce the number of high-SSH samples and may introduce artificial variance through over-sampling, emphasizing the need for a balanced threshold selection.

Figure 5 displays the regression results at the selected threshold of 0.35 for each station. The scatter plots of observed versus predicted SSHs indicate generally strong agreement, though with some variations across stations. In the combined dataset (Figure 5i), a noticeable discontinuity is observed near the threshold value (0.35), reflecting the structural break between under- and over-sampled data. However, this discontinuity is less evident when examining individual stations (Figure 5a–k), suggesting that the effect is more prominent in the aggregated data.

In some stations (e.g., Geojedo, Gwangyang, Masan), the number of training samples appears visibly reduced due to the filtering of potentially uninformative or noisy observations during the under-sampling process. While this led to sparse scatter patterns in some plots, the retained data points contributed to more stable and interpretable regression outcomes by mitigating the influence of extreme or clustered low-SSH samples.

Notably, the application of threshold-based under-/over-sampling strategies alone led to substantial performance gains compared to the optimal results reported in the previous study [21] without applying data preprocessing (data sampling). Across most stations, the R2 values increased by 0.04 (Masan, Geomundo) to as high as 0.46 (Ulsan) solely due to data-level balancing without modifying the underlying regression structure. Significant improvements were also observed in Gadeokdo (+0.25), Geojedo (+0.19), Pohang (+0.22), and Goheung (+0.31), emphasizing the critical role of sample distribution in enhancing predictive accuracy for SSH modeling.

3.2. Effects of Over-/Under-Sampling Schemes at a Fixed SSH Threshold

In this section, the SSH threshold was fixed at 0.35 based on the finding in Section 3.1, where it was identified as the most appropriate value balancing model accuracy and data distribution. To further improve regression performance, various combinations of over- and under-sampling techniques were applied under this threshold condition, as summarized in Table 3.

Table 3 provides the test R2 values obtained using nine different sampling combinations across all stations. The over-sampling methods included Random, Border, and SMOTE, while under-sampling methods consisted of Centroids, ENN, NearMiss, Random, and Tomek. For each station, the performance of the multiple linear regression model trained with each sampling pair was evaluated, and the best-performing combination was identified by the highest R2 in each row.

Notably, some stations such as Geojedo (R2 = 0.9599) and Ulsan (R2 = 0.8712) exhibited remarkably high performance using Random over-sampling with Centroid and ENN under-sampling, respectively. Conversely, stations like Yeosu, which had relatively low baseline R2 values in Section 3.1, still showed modest improvement across all sampling schemes, with the best result at SMOTE-ENN (R2 = 0.5908).

The statistical comparison in Figure 6 summarizes the overall influence of sampling combinations on model performance. Combinations that preserve representative data structures, such as Random over-sampling with Centroids or ENN under-sampling, yield higher and more stable R2 values, whereas aggressive under-sampling methods like NearMiss reduce the diversity of extreme-event samples and degrade performance. SMOTE-based combinations moderately improve model skill by enriching rare high-SSH cases. These results quantitatively demonstrate that sampling strategies maintaining a balanced representation of both frequent and rare surge conditions achieve more physically consistent regression behavior, a pattern that is further visualized in Figure 7.

Figure 7 illustrates scatter plots comparing predicted and observed SSHs under nine different combinations of over- and under-sampling techniques. Each subplot (a–i) corresponds to a distinct pairing of sampling strategies applied at a fixed SSH threshold of 0.35. The combinations include the following: (a) Random–Centroids, (b) Random–ENN, (c) Random–NearMiss, (d) Random–Random, (e) Random–Tomek Links, (f) Border–ENN, (g) Border–Random, (h) SMOTE–ENN, and (i) SMOTE–Random.

Distinct data distribution patterns can be observed across the sampling configurations. In panels (a), (d), (g), and (i)—all of which use Centroids or Random under-sampling—a noticeable sparsity of points is evident around the threshold value (SSH ≈ 0.35). This reflects the nature of aggressive under-sampling techniques (Centroids or Random), which tend to remove borderline or less-representative samples near the decision boundary in order to balance class distributions.

In contrast, panels (b), (e), (f), and (h)—which involve ENN or Tomek Links under-sampling—exhibit smoother transitions across the threshold without a sharp data discontinuity. These methods are designed to retain boundary-adjacent samples or to refine class separation by eliminating ambiguous instances, resulting in a more continuous distribution. Furthermore, in (f) Border–ENN and (h) SMOTE–ENN, there is a sharp concentration of synthetic points in the higher SSH range. This tapering shape at extreme predicted values is a typical artifact of over-sampling techniques like Border and SMOTE, which generate new samples in sparsely populated minority regions. While such methods enhance model exposure to rare events, they may also introduce artificial patterns that alter regression characteristics near the extremes.

These results highlight that not only the choice of threshold but also the specific sampling combination can significantly impact model behavior—particularly around critical regions of the SSH distribution.

Figure 8 presents scatter plots of predicted versus observed SSHs for each tide gauge station using the optimal combination of over- and under-sampling techniques, selected based on the highest test R2 values from Table 3. The diversity of configurations ranging from Random-Centroids to SMOTE–ENN highlights the station-specific nature of SSH data distributions and the necessity of adapting sampling strategies to local characteristics. For instance, Geojedo (Figure 8c) and Pohang (Figure 8k) achieved their best performance using Random–Centroids, which effectively preserved informative low-SSH observations while enhancing sensitivity in higher ranges. In contrast, Yeosu (Figure 8h) and Tongyeong (Figure 8j) benefited from SMOTE–ENN, where synthetic minority sampling likely mitigated data sparsity in the high-SSH regime.

Compared to the baseline strategy in Section 3.1 (Random–Random with a fixed threshold of 0.35), the tailored configurations led to notable improvements in regression accuracy. For example, Ulsan’s R2 increased from 0.8600 to 0.8712, and Pohang’s increased from 0.7500 to 0.7984, even though the model structure remained unchanged. These results reaffirm that proper data-level balancing is a critical determinant of regression performance in SSH prediction. Although some panels (e.g., Figure 8h,j) still exhibit synthetic sampling artifacts near the threshold, the overall model fit is significantly enhanced through station-specific sampling optimization. This underscores that sampling configuration is not merely a preprocessing choice but a strategic modeling decision in threshold-sensitive hydrological prediction tasks.

4. Discussion

The results of this study underscore the critical role of data-level balancing in enhancing the accuracy of SSH prediction models. Across the eleven tide-gauge stations, the application of tailored over- and under-sampling strategies consistently improved model performance, particularly in terms of R2. Unlike the uniform baseline configuration (Random–Random) applied to all stations in Section 3.1, the station-specific optimization approach in Section 3.2 revealed that no single combination performed best across all cases. Instead, the optimal configuration varied depending on the local characteristics of SSH distributions and the relative frequency of extreme surge events.

This heterogeneity in sampling effectiveness can be attributed to spatial variations in SSH distributions. Some stations, such as Yeosu and Tongyeong, exhibited significant class imbalance due to the limited number of high-SSH events, for which synthetic minority over-sampling techniques like SMOTE were effective in filling sparsely populated upper ranges. In contrast, stations with well-distributed low-SSH events or relatively stable physical dynamics, including Geojedo and Pohang, benefited from centroid-based under-sampling, which helped preserve representative samples while reducing redundancy and noise.

The effectiveness of these sampling strategies is further evidenced by the substantial performance gains achieved without altering the underlying regression model structure. For instance, at Ulsan, the R2 increased from 0.8593 to 0.8712, and at Pohang, from 0.7469 to 0.7984, solely through optimized data balancing. These results suggest that data-level sampling is not merely a preparatory step but a strategic design element in regression-based hydrological modeling. Although certain sampling configurations, such as SMOTE–ENN, produced synthetic concentration patterns or discontinuities near the SSH threshold, the overall model fit was significantly improved through targeted design. Collectively, these findings indicate that appropriate sample composition can serve as a key determinant of predictive accuracy in threshold-sensitive regression models.

While the proposed sampling approach demonstrates clear performance improvements, several limitations should be acknowledged. First, the reliance on synthetic sample generation, such as SMOTE and BorderlineSMOTE, introduces a potential risk of overfitting, particularly in regions with very few high-SSH events. Although these synthetic samples are beneficial for model training, they may distort the true physical distribution of extreme surge events. In addition, certain aggressive under-sampling techniques, including RandomUnderSampler and ClusterCentroids, may inadvertently remove important borderline cases, thereby reducing model sensitivity near the threshold boundary. Second, the study focused on static, station-wise optimization using a fixed threshold (SSH = 0.35 m), which the experimental results identified as the most effective value. While this facilitated controlled analysis, real-world storm surge systems are often dynamic and spatially correlated. Therefore, extending the framework to account for spatiotemporal generalization, for instance by developing regionalized or adaptive data sampling strategies, would enhance its applicability in operational forecasting. An adaptive thresholding mechanism that responds to changes in baseline surge conditions or data density could further improve robustness and transferability.
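The adaptive thresholding idea can be sketched as a simple per-station selection rule. The function `select_threshold` and its quantile fallback are hypothetical illustrations; the candidate scores in the demonstration are the Goheung row of the threshold-sensitivity table in this paper, where performance peaks at 0.4 m.

```python
import numpy as np

def select_threshold(y, candidates=(0.20, 0.25, 0.30, 0.35, 0.40), score=None):
    """Pick an SSH threshold for one station.

    If a scoring callable is given (e.g., thr -> test R2 of a model trained
    with that threshold), return the best-scoring candidate. Otherwise fall
    back to a data-driven choice: an upper quantile of the observed SSHs,
    so the threshold adapts to local surge conditions and data density."""
    if score is not None:
        return max(candidates, key=score)
    return float(np.quantile(y, 0.90))  # illustrative fallback: 90th percentile

# Goheung test R2 by threshold, taken from the threshold-sensitivity table.
goheung = {0.20: 0.5559, 0.25: 0.6464, 0.30: 0.9435, 0.35: 0.6556, 0.40: 0.9687}
print(select_threshold(None, score=goheung.get))  # prints 0.4
```

In an operational setting, the quantile fallback would let the threshold drift with baseline surge conditions instead of remaining fixed at 0.35 m for every station.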

5. Conclusions

This study investigated the effects of data preprocessing—specifically storm surge height (SSH) threshold-based over- and under-sampling strategies—on the performance of multiple linear regression models for predicting SSH. By systematically varying the SSH threshold and sampling combinations, the analysis revealed that both the choice of the threshold and the configuration of sampling techniques significantly influenced model accuracy across stations.

The results demonstrated that applying a station-specific sampling strategy led to substantial improvements in predictive performance, with R2 values increasing by up to 0.46 compared with the baseline configuration. These enhancements were achieved without altering the underlying regression structure, underscoring the effectiveness of data-level balancing in threshold-sensitive prediction tasks. Moreover, the optimal sampling combination differed among stations, highlighting the importance of adapting data preprocessing methods to local surge conditions.

Overall, this study shows that appropriate sample composition—guided by informed threshold selection and tailored sampling design—can serve as a key determinant of predictive accuracy in SSH modeling. These findings provide practical insights for improving regression-based hydrological forecasting systems through strategic data-level interventions.

In future work, we plan to evaluate how alternative model structures (e.g., nonlinear regression, random forest) and modified input variables (e.g., parameters reflecting topographic characteristics) affect model performance. Through this, we aim to identify the most effective approach for further enhancing the predictive capability of the storm surge model.

Author Contributions

Conceptualization, J.-A.Y. and Y.L.; methodology, J.-A.Y. and Y.L.; software, Y.L.; validation, Y.L.; formal analysis, Y.L.; investigation, J.-A.Y. and Y.L.; resources, J.-A.Y. and Y.L.; data curation, J.-A.Y. and Y.L.; writing—original draft preparation, J.-A.Y. and Y.L.; writing—review and editing, J.-A.Y. and Y.L.; visualization, J.-A.Y. and Y.L.; supervision, J.-A.Y.; funding acquisition, J.-A.Y. and Y.L. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

All data and code used in the manuscript are openly available at the Zenodo repository: https://doi.org/10.5281/zenodo.17156709. The repository titled “Storm surge resampling framework code and data” contains all datasets and code necessary to reproduce the results presented in this manuscript.

Acknowledgments

Special thanks to Eunhyeok Hur for his support in compiling the References and Abbreviations.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

Border: Borderline SMOTE;
Centroids: Cluster Centroids;
ENN: Edited Nearest Neighbours;
GTSR: Global Tide and Surge Reanalysis;
IBTrACS: International Best Track Archive for Climate Stewardship;
MAE: Mean Absolute Error;
MLR: Multiple Linear Regression;
MSE: Mean Squared Error;
NCEI: National Centers for Environmental Information;
NOAA: National Oceanic and Atmospheric Administration;
R2: Coefficient of Determination;
RMSE: Root Mean Square Error;
ROS: Random Over Sampler;
RSMC: Regional Specialized Meteorological Centre;
RUS: Random Under Sampler;
SMOTE: Synthetic Minority Over-Sampling Technique;
SSH: Storm Surge Height;
TCWC: Tropical Cyclone Warning Center;
Tomek: Tomek Links.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Tables

Figure 1 Workflow of the present study. The blue text highlights the unique features of this study compared with the previous study [21].


Figure 2 (a) Study area; (b) location of tide gauge stations within the study area (from [21]).


Figure 3 Typhoon tracks that passed through the region bounded by latitudes 32°N to 40°N and longitudes 122°E to 132°E (the blue rectangular area) during the period from 1979 to 2020 were defined as those affecting the Korean Peninsula (from [21]).


Figure 4 Boxplots of the test R2 values for regression models under varying storm-surge-height (SSH) thresholds using random sampling methods (ROS and RUS). The boxes show the distribution of model performance across all stations, where the central line indicates the median, the “×” marks the mean value, and the whiskers represent the 1.5 × IQR range.


Figure 5 Scatter plots illustrating model performance for each tidal station. Panels represent the following stations: (a) Gadeokdo, (b) Geomundo, (c) Geojedo, (d) Goheung, (e) Gwangyang, (f) Masan, (g) Busan, (h) Yeosu, (i) Ulsan, (j) Tongyeong, (k) Pohang, and (l) Total (combined dataset from all stations).


Figure 6 Boxplots of the test R2 values for models trained using nine different combinations of over- and under-sampling techniques. Each box represents the distribution of model performance across all stations for a given sampling combination, where the central line denotes the median, the “×” indicates the mean value, and the whiskers correspond to the 1.5 × IQR range. The open circles represent outliers beyond the whisker range.


Figure 7 Scatter plots of predicted versus observed storm surge heights under different combinations of over- and under-sampling techniques: (a) Random/Centroids, (b) Random/ENN, (c) Random/NearMiss, (d) Random/Random, (e) Random/Tomek Links, (f) Border/ENN, (g) Border/Random, (h) SMOTE/ENN, and (i) SMOTE/Random.


Figure 8 Scatter plots of predicted versus observed storm surge heights using the best-performing combination of over- and under-sampling techniques for each station. The sampling combinations used are as follows: (a) Gadeokdo–Random/Random, (b) Geomundo–Random/Random, (c) Geojedo–Random/Centroids, (d) Goheung–Border/ENN, (e) Gwangyang–Random/NearMiss, (f) Masan–Border/Random, (g) Busan–Border/ENN, (h) Yeosu–SMOTE/ENN, (i) Ulsan–Random/Centroids, (j) Tongyeong–SMOTE/ENN, (k) Pohang–Random/Centroids, and (l) Total (all stations combined)–SMOTE/ENN.


Coordinates of the designated points.

Point Name Longitude [°] Latitude [°]
Geomundo 127.308889 34.02833
Goheung 127.342778 34.48111
Yeosu 127.765833 34.74722
Gwangyang 127.754722 34.90361
Tongyeong 128.434722 34.82778
Masan 128.588889 35.21
Geojedo 128.699167 34.80139
Gadeokdo 128.810833 35.02417
Busan 129.035278 35.09639
Ulsan 129.387222 35.50194
Pohang 129.383889 36.04722

Test R2 values of models under varying storm surge height (SSH) thresholds with under- and over-sampling strategies. The “Total” row represents the R2 value derived using the combined dataset from all stations.

Station Storm Surge Height Threshold [m]
0.2 0.25 0.3 0.35 0.4
Geomundo 0.3927 0.4741 0.5526 0.6246 0.6164
Goheung 0.5559 0.6464 0.9435 0.6556 0.9687
Yeosu 0.392 0.5164 0.561 0.4905 0.5494
Gwangyang 0.6442 0.8293 0.699 0.8083 0.7436
Tongyeong 0.4603 0.5424 0.5036 0.5985 0.6365
Masan 0.333 0.2695 0.8562 0.8597 0.8457
Geojedo 0.6967 0.7585 0.5712 0.7775 0.698
Gadeokdo 0.5502 0.5987 0.6716 0.7942 0.7854
Busan 0.5295 0.6307 0.713 0.6901 0.7799
Ulsan 0.6343 0.6693 0.5489 0.8593 0.7388
Pohang 0.5512 0.7137 0.7904 0.7469 0.9318
Total 0.4355 0.4965 0.5708 0.5808 0.6255

R2 values of models by combinations of over- and under-sampling techniques. The R2 value reported in the “Total” row corresponds to the model trained on the combined dataset across all stations.

Station Sampling Techniques (Over Sampling Technique–Under Sampling Technique)
ROS–Centroids ROS–ENN ROS–NearMiss ROS–RUS ROS–Tomek Border–ENN Border–RUS SMOTE–ENN SMOTE–RUS
Geomundo 0.5368 0.5956 0.2718 0.6246 0.5847 0.6202 0.5960 0.6078 0.5960
Goheung 0.6641 0.7692 0.4306 0.6556 0.7667 0.8235 0.6552 0.7892 0.6552
Yeosu 0.4611 0.5595 0.2125 0.4905 0.5489 0.5849 0.4977 0.5908 0.4977
Gwangyang 0.8960 0.8226 0.9059 0.8083 0.8199 0.8400 0.7360 0.8357 0.7360
Tongyeong 0.5234 0.5979 0.1649 0.5985 0.5879 0.6404 0.5813 0.6413 0.5813
Masan 0.6329 0.7817 0.4353 0.8597 0.7724 0.8217 0.8770 0.8027 0.8770
Geojedo 0.9599 0.8196 0.9510 0.7775 0.8167 0.8387 0.6082 0.8350 0.6082
Gadeokdo 0.6276 0.7076 0.2998 0.7942 0.6978 0.7388 0.7860 0.7188 0.7860
Busan 0.6513 0.6983 0.3990 0.6901 0.6910 0.7210 0.6974 0.7161 0.6974
Ulsan 0.8712 0.6288 0.3518 0.8593 0.6283 0.6647 0.8266 0.6571 0.8266
Pohang 0.7984 0.7534 0.3224 0.7469 0.7510 0.7790 0.7509 0.7801 0.7509
Total 0.4894 0.5614 0.0802 0.5808 0.5534 0.5691 0.5681 0.5870 0.5681
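Reading the best-performing combination for each station off this table reduces to a per-row arg-max; the sketch below uses plain Python dictionaries populated with three of the combinations for the Ulsan and Pohang rows above (the combination labels are shortened for code use).

```python
# Subset of the R2 table: station -> {over-/under-sampling combo -> test R2}.
table3 = {
    "Ulsan":  {"ROS-Centroids": 0.8712, "ROS-RUS": 0.8593, "SMOTE-ENN": 0.6571},
    "Pohang": {"ROS-Centroids": 0.7984, "ROS-RUS": 0.7469, "SMOTE-ENN": 0.7801},
}

# Pick, for every station, the combination with the highest test R2.
best = {station: max(combos, key=combos.get) for station, combos in table3.items()}
print(best)  # both stations select "ROS-Centroids"
```

This matches the station-specific configurations reported in Figure 8, where Ulsan and Pohang both use Random/Centroids.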

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jmse13112173/s1.

Appendix A

List of typhoons defined as affecting Korea based on their passage through the region bounded by 32° N to 40° N and 122° E to 132° E.

No. Typhoon Name Pmin [hPa] Umax [kt] Typhoon Lifetime
1 IRVING 958 75 1979 8 7 ~ 1979 8 20
2 JUDY 980 50 1979 8 15 ~ 1979 8 27
3 KEN 991 43 1979 8 30 ~ 1979 9 10
4 IDA 996 NaN 1980 7 5 ~ 1980 7 15
5 NORRIS 1002 NaN 1980 8 23 ~ 1980 8 31
6 ORCHID 967 70 1980 9 1 ~ 1980 9 16
7 IKE 1006 NaN 1981 6 7 ~ 1981 6 17
8 JUNE 990 45 1981 6 15 ~ 1981 6 26
9 OGDEN 983 NaN 1981 7 26 ~ 1981 8 1
10 AGNES 970 55 1981 8 25 ~ 1981 9 6
11 CLARA 1004 NaN 1981 9 13 ~ 1981 10 2
12 CECIL 975 55 1982 8 1 ~ 1982 8 19
13 ELLIS 955 70 1982 8 17 ~ 1982 9 4
14 FORREST 968 70 1983 9 16 ~ 1983 9 30
15 ALEX 1004 NaN 1984 6 28 ~ 1984 7 6
16 HOLLY 965 70 1984 8 12 ~ 1984 8 23
17 GERALD 1002 NaN 1984 8 14 ~ 1984 8 24
18 JUNE 1002 NaN 1984 8 25 ~ 1984 9 3
19 HAL 996 NaN 1985 6 11 ~ 1985 6 28
20 JEFF 992 45 1985 7 18 ~ 1985 8 3
21 KIT 970 70 1985 7 30 ~ 1985 8 17
22 LEE 980 60 1985 8 8 ~ 1985 8 16
23 ODESSA 985 55 1985 8 19 ~ 1985 9 2
24 PAT 965 70 1985 8 24 ~ 1985 9 2
25 BRENDAN 980 70 1985 9 25 ~ 1985 10 8
26 NANCY 994 45 1986 6 18 ~ 1986 6 27
27 VERA 960 70 1986 8 13 ~ 1986 9 2
28 ABBY 996 NaN 1986 9 9 ~ 1986 9 24
29 THELMA 960 78 1987 7 6 ~ 1987 7 18
30 ALEX 994 NaN 1987 7 21 ~ 1987 8 2
31 DINAH 940 85 1987 8 19 ~ 1987 9 3
32 ELLIS 990 40 1989 6 18 ~ 1989 6 25
33 JUDY 970 65 1989 7 20 ~ 1989 7 29
34 VERA 1002 NaN 1989 9 11 ~ 1989 9 19
35 OFELIA 996 NaN 1990 6 15 ~ 1990 6 26
36 ROBYN 992 40 1990 6 29 ~ 1990 7 14
37 ABE 996 NaN 1990 8 22 ~ 1990 9 3
38 CAITLIN 945 80 1991 7 18 ~ 1991 7 30
39 GLADYS 975 50 1991 8 13 ~ 1991 8 24
40 UNNAMED 994 35 1991 8 21 ~ 1991 8 31
41 KINNA 965 70 1991 9 8 ~ 1991 9 16
42 MIREILLE 935 95 1991 9 13 ~ 1991 10 1
43 JANIS 965 70 1992 7 30 ~ 1992 8 13
44 IRVING 994 40 1992 7 30 ~ 1992 8 5
45 KENT 980 50 1992 8 3 ~ 1992 8 20
46 POLLY 1000 NaN 1992 8 23 ~ 1992 9 4
47 TED 992 45 1992 9 14 ~ 1992 9 27
48 OFELIA 990 40 1993 7 24 ~ 1993 7 29
49 PERCY 980 55 1993 7 25 ~ 1993 8 1
50 ROBYN 945 85 1993 7 30 ~ 1993 8 14
51 YANCY 955 75 1993 8 27 ~ 1993 9 7
52 RUSS 1004 NaN 1994 6 2 ~ 1994 6 12
53 WALT 992 40 1994 7 11 ~ 1994 7 28
54 BRENDAN 992 45 1994 7 25 ~ 1994 8 3
55 DOUG 985 48 1994 7 30 ~ 1994 8 13
56 ELLIE 970 65 1994 8 3 ~ 1994 8 19
57 FRED 1004 NaN 1994 8 12 ~ 1994 8 26
58 SETH 975 55 1994 9 30 ~ 1994 10 16
59 FAYE 950 75 1995 7 12 ~ 1995 7 25
60 JANIS 990 NaN 1995 8 17 ~ 1995 8 30
61 RYAN 985 60 1995 9 14 ~ 1995 9 25
62 EVE 980 60 1996 7 10 ~ 1996 7 27
63 KIRK 960 75 1996 7 28 ~ 1996 8 18
64 PETER 975 60 1997 6 15 ~ 1997 7 4
65 TINA 975 60 1997 7 21 ~ 1997 8 10
66 OLIWA 970 65 1997 8 28 ~ 1997 9 19
67 YANNI 975 55 1998 9 24 ~ 1998 10 2
68 NEIL 980 50 1999 7 22 ~ 1999 7 28
69 OLGA 975 60 1999 7 26 ~ 1999 8 5
70 PAUL 992 35 1999 7 31 ~ 1999 8 9
71 RACHEL 1000 NaN 1999 8 5 ~ 1999 8 11
72 SAM 1004 NaN 1999 8 17 ~ 1999 8 27
73 WENDY 1006 NaN 1999 8 29 ~ 1999 9 7
74 ZIA 990 40 1999 9 11 ~ 1999 9 17
75 ANN 994 38 1999 9 14 ~ 1999 9 20
76 BART 940 85 1999 9 17 ~ 1999 9 29
77 DAN 1012 NaN 1999 10 1 ~ 1999 10 12
78 KAI-TAK 994 35 2000 7 2 ~ 2000 7 12
79 BOLAVEN 985 40 2000 7 19 ~ 2000 8 2
80 BILIS 1001 NaN 2000 8 17 ~ 2000 8 27
81 PRAPIROON 965 70 2000 8 24 ~ 2000 9 4
82 SAOMAI 970 60 2000 8 31 ~ 2000 9 19
83 XANGSANE 1003 NaN 2000 10 24 ~ 2000 11 2
84 CHEBI 1000 NaN 2001 6 19 ~ 2001 6 25
85 RAMMASUN 965 65 2002 6 26 ~ 2002 7 7
86 NAKRI 996 NaN 2002 7 7 ~ 2002 7 13
87 FENGSHEN 980 50 2002 7 13 ~ 2002 7 28
88 RUSA 960 70 2002 8 22 ~ 2002 9 3
89 KUJIRA 1000 NaN 2003 4 8 ~ 2003 4 25
90 SOUDELOR 975 60 2003 6 7 ~ 2003 6 24
91 MAEMI 935 90 2003 9 4 ~ 2003 9 16
92 MINDULLE 984 45 2004 6 21 ~ 2004 7 5
93 NAMTHEUN 996 40 2004 7 24 ~ 2004 8 3
94 MEGI 970 65 2004 8 13 ~ 2004 8 22
95 CHABA 955 80 2004 8 17 ~ 2004 9 5
96 SONGDA 945 75 2004 8 26 ~ 2004 9 10
97 MEARI 975 60 2004 9 18 ~ 2004 10 2
98 MATSA 998 NaN 2005 7 29 ~ 2005 8 9
99 NABI 955 75 2005 8 28 ~ 2005 9 9
100 KHANUN 1000 NaN 2005 9 5 ~ 2005 9 13
101 CHANCHU 996 NaN 2006 5 7 ~ 2006 5 19
102 EWINIAR 975 60 2006 6 29 ~ 2006 7 12
103 WUKONG 980 45 2006 8 12 ~ 2006 8 21
104 SHANSHAN 950 80 2006 9 9 ~ 2006 9 19
105 MAN-YI 955 70 2007 7 6 ~ 2007 7 23
106 USAGI 960 80 2007 7 27 ~ 2007 8 4
107 PABUK 995 NaN 2007 8 4 ~ 2007 8 15
108 NARI 960 75 2007 9 11 ~ 2007 9 18
109 WIPHA 1005 NaN 2007 9 14 ~ 2007 9 20
110 KROSA 1010 NaN 2007 10 1 ~ 2007 10 14
111 KALMAEGI 994 NaN 2008 7 11 ~ 2008 7 24
112 LINFA 998 NaN 2009 6 13 ~ 2009 6 30
113 MORAKOT 998 NaN 2009 8 2 ~ 2009 8 13
114 DIANMU 985 50 2010 8 6 ~ 2010 8 13
115 KOMPASU 970 70 2010 8 27 ~ 2010 9 6
116 MALOU 992 50 2010 8 31 ~ 2010 9 10
117 MERANTI 1003 NaN 2010 9 6 ~ 2010 9 14
118 MEARI 980 55 2011 6 20 ~ 2011 6 27
119 MUIFA 973 63 2011 7 26 ~ 2011 8 15
120 KULAP 1012 NaN 2011 9 5 ~ 2011 9 11
121 KHANUN 991 43 2012 7 13 ~ 2012 7 20
122 DAMREY 965 70 2012 7 27 ~ 2012 8 4
123 TEMBIN 980 55 2012 8 17 ~ 2012 9 1
124 BOLAVEN 960 65 2012 8 18 ~ 2012 9 1
125 SANBA 940 85 2012 9 10 ~ 2012 9 18
126 LEEPI 1002 NaN 2013 6 16 ~ 2013 6 23
127 DANAS 965 65 2013 10 1 ~ 2013 10 9
128 NEOGURI 975 50 2014 7 2 ~ 2014 7 13
129 MATMO 994 NaN 2014 7 16 ~ 2014 7 26
130 NAKRI 980 50 2014 7 27 ~ 2014 8 4
131 FUNG-WONG 998 35 2014 9 17 ~ 2014 9 25
132 VONGFONG 975 60 2014 10 1 ~ 2014 10 16
133 CHAN-HOM 973 58 2015 6 29 ~ 2015 7 13
134 HALOLA 994 45 2015 7 6 ~ 2015 7 26
135 SOUDELOR 998 35 2015 7 29 ~ 2015 8 12
136 GONI 945 85 2015 8 13 ~ 2015 8 30
137 NAMTHEUN 994 45 2016 8 30 ~ 2016 9 5
138 MERANTI 1004 NaN 2016 9 8 ~ 2016 9 17
139 CHABA 965 70 2016 9 24 ~ 2016 10 7
140 NANMADOL 985 55 2017 7 1 ~ 2017 7 8
141 PRAPIROON 965 60 2018 6 27 ~ 2018 7 5
142 JONGDARI 992 45 2018 7 23 ~ 2018 8 4
143 LEEPI 998 40 2018 8 10 ~ 2018 8 15
144 SOULIK 963 73 2018 8 15 ~ 2018 8 30
145 KONG-REY 975 65 2018 9 27 ~ 2018 10 7
146 DANAS 985 43 2019 7 14 ~ 2019 7 23
147 FRANCISCO 975 65 2019 8 1 ~ 2019 8 11
148 LINGLING 963 73 2019 8 30 ~ 2019 9 12
149 TAPAH 975 60 2019 9 17 ~ 2019 9 23
150 MITAG 988 50 2019 9 24 ~ 2019 10 5
151 HAGUPIT 996 NaN 2020 7 30 ~ 2020 8 12
152 JANGMI 996 40 2020 8 6 ~ 2020 8 14
153 BAVI 950 85 2020 8 20 ~ 2020 8 29
154 MAYSAK 950 80 2020 8 26 ~ 2020 9 7
155 HAISHEN 945 85 2020 8 30 ~ 2020 9 10

References

1. Muis, S.; Verlaan, M.; Winsemius, H.C.; Aerts, J.C.; Ward, P.J. A global reanalysis of storm surges and extreme sea levels. Nat. Commun.; 2016; 7, 11969. [DOI: https://dx.doi.org/10.1038/ncomms11969]

2. IPCC. Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change; Cambridge University Press: Cambridge, UK, New York, NY, USA, 2021; pp. 1-2391.

3. Papadopoulos, N.; Gikas, V. Combined Coastal Sea Level Estimation Considering Astronomical Tide and Storm Surge Effects: Model Development and Its Application in Thermaikos Gulf, Greece. J. Mar. Sci. Eng.; 2023; 11, 2033. [DOI: https://dx.doi.org/10.3390/jmse11112033]

4. Antunes, C.; Lemos, G. A probabilistic approach to combine sea level rise, tide and storm surge into representative return periods of extreme total water levels: Application to the Portuguese coastal areas. Estuar. Coast. Shelf Sci.; 2025; 313, 109060. [DOI: https://dx.doi.org/10.1016/j.ecss.2024.109060]

5. Palmer, K.; Watson, C.S.; Power, H.E.; Hunter, J.R. Quantifying the Mean Sea Level, Tide, and Surge Contributions to Changing Coastal High Water Levels. J. Geophys. Res. Ocean.; 2024; 129, e2023JC020737. [DOI: https://dx.doi.org/10.1029/2023JC020737]

6. Goring, D.G.; Stephens, S.A.; Bell, R.G.; Pearson, C.P. Estimation of Extreme Sea Levels in a Tide-Dominated Environment Using Short Data Records. J. Waterw. Port Coast. Ocean. Eng.; 2011; 137, pp. 150-159. [DOI: https://dx.doi.org/10.1061/(ASCE)WW.1943-5460.0000071]

7. Yang, J.-A.; Kim, S.; Mori, N.; Mase, H. Bias correction of simulated storm surge height considering coastline complexity. Hydrol. Res. Lett.; 2017; 11, pp. 121-127. [DOI: https://dx.doi.org/10.3178/hrl.11.121]

8. Yang, J.-A.; Kim, S.; Mori, N.; Mase, H. Assessment of long-term impact of storm surges around the Korean Peninsula based on a large ensemble of climate projections. Coast. Eng.; 2018; 142, pp. 1-8. [DOI: https://dx.doi.org/10.1016/j.coastaleng.2018.09.008]

9. Yang, J.-A.; Kim, S.; Son, S.; Mori, N.; Mase, H. Correction to: Assessment of uncertainties in projecting future changes to extreme storm surge height depending on future SST and greenhouse gas concentration scenarios. Clim. Chang.; 2020; 162, pp. 443-444. [DOI: https://dx.doi.org/10.1007/s10584-020-02864-6]

10. Kim, H.-S.; Lee, S.-W. Storm Surge Caused by the Typhoon “Maemi” in Kwangyang Bay in 2003. J. Korea. Soc. Ocean; 2004; 9, pp. 119-129.

11. National Disaster Information Center. Typhoon Maemi’s Damage. Available online: https://web.archive.org/web/20150924093447/http://www.safekorea.go.kr/dmtd/contents/room/ldstr/DmgReco.jsp?q_menuid=&q_largClmy=3 (accessed on 17 September 2025).

12. Seo, S.N.; Kim, S.I. Storm Surges in West Coast of Korea by Typhoon Bolaven (1215). J. Korean Soc. Coast. Ocean Eng.; 2014; 26, pp. 41-48. [DOI: https://dx.doi.org/10.9765/KSCOE.2014.26.1.41]

13. Munhwa Broadcasting Corporation. Available online: https://imnews.imbc.com/replay/2016/nw1500/article/4133688_30224.html#:~:text=%EB%8B%AB%EA%B8%B0 (accessed on 17 September 2025). (In Korean)

14. Yonhap News Agency. Available online: https://science.ytn.co.kr/program/view.php?mcd=0082&key=2020090711443611297#:~:text=%EB%B9%84%EA%B3%B5%EC%8B%9D%20%EA%B8%B0%EB%A1%9D%EC%9D%B4%EC%A7%80%EB%A7%8C%2C%20%EC%A0%9C%EC%A3%BC%20%EC%82%B0%EA%B0%84%EC%97%90%20%ED%95%98%EB%A3%A8,1%2C000mm%EC%9D%98%20%ED%8F%AD%EC%9A%B0%EA%B0%80%20%EC%B2%98%EC%9D%8C%20%EA%B4%80%EC%B8%A1%EB%90%90%EC%8A%B5%EB%8B%88%EB%8B%A4 (accessed on 17 September 2025). (In Korean)

15. National Fire Agency. Available online: https://www.nfa.go.kr/nfa/news/disasterNews/;jsessionid=nCcZd2oihNduR2POx2RrAiWG.nfa12?boardId=bbs_0000000000001896&mode=view&cntId=161424 (accessed on 17 September 2025).

16. Tadesse, M.; Wahl, T.; Cid, A. Data-Driven Modeling of Global Storm Surges. Front. Mar. Sci.; 2020; 7, 260. [DOI: https://dx.doi.org/10.3389/fmars.2020.00260]

17. Tian, Q.; Luo, W.; Tian, Y.; Gao, H.; Guo, L.; Jiang, Y. Prediction of storm surge in the Pearl River Estuary based on data-driven model. Front. Mar. Sci.; 2024; 11, 1390364. [DOI: https://dx.doi.org/10.3389/fmars.2024.1390364]

18. Ayyad, M.; Hajj, M.R.; Marsooli, R. Machine learning-based assessment of storm surge in the New York metropolitan area. Sci. Rep.; 2022; 12, 19215. [DOI: https://dx.doi.org/10.1038/s41598-022-23627-6] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/36357413]

19. Sadler, J.M.; Goodall, J.L.; Morsy, M.M.; Spencer, K. Modeling urban coastal flood severity from crowd-sourced flood reports using Poisson regression and Random Forest. J. Hydrol.; 2018; 559, pp. 43-55. [DOI: https://dx.doi.org/10.1016/j.jhydrol.2018.01.044]

20. Sun, K.; Pan, J. Model of Storm Surge Maximum Water Level Increase in a Coastal Area Using Ensemble Machine Learning and Explicable Algorithm. Earth Space Sci.; 2023; 10, e2023EA003243. [DOI: https://dx.doi.org/10.1029/2023EA003243]

21. Yang, J.-A.; Lee, Y. Development of a Storm Surge Prediction Model Using Typhoon Characteristics and Multiple Linear Regression. J. Mar. Sci. Eng.; 2025; 13, 1655. [DOI: https://dx.doi.org/10.3390/jmse13091655]

22. Metters, D. Machine Learning to Forecast Storm Surge. Forum of Operational Oceanography, Melbourne. Available online: https://www.researchgate.net/publication/336779021_Machine_learning_to_forecast_storm_surge (accessed on 12 October 2019).

23. Lee, Y.; Jung, C.; Kim, S. Spatial distribution of soil moisture estimates using a multiple linear regression model and Korean geostationary satellite (COMS) data. Agric. Water Manag.; 2019; 213, pp. 580-593. [DOI: https://dx.doi.org/10.1016/j.agwat.2018.09.004]

24. Korea Hydrographic and Oceanographic Agency. Available online: https://www.khoa.go.kr (accessed on 25 July 2025).

25. Pawlowicz, R.; Beardsley, B.; Lentz, S. Classical tidal harmonic analysis including error estimates in MATLAB using T-TIDE. Comput. Geosci.; 2002; 28, pp. 929-937. [DOI: https://dx.doi.org/10.1016/S0098-3004(02)00013-4]

26. Jensen, C.; Mahavadi, T.; Schade, N.H.; Hache, I.; Kruschke, T. Negative Storm Surges in the Elbe Estuary-Large-Scale Meteorological Conditions and Future Climate Change. Atmosphere; 2022; 13, 1634. [DOI: https://dx.doi.org/10.3390/atmos13101634]

27. Dinápoli, M.G.; Simionato, C.G.; Alonso, G.; Bodnariuk, N.; Saurral, R. Negative storm surges in the Río de la Plata Estuary: Mechanisms, variability, trends and linkage with the Continental Shelf dynamics. Estuar. Coast. Shelf Sci.; 2024; 305, 108844. [DOI: https://dx.doi.org/10.1016/j.ecss.2024.108844]

28. Kutner, M.H.; Nachtsheim, C.J.; Neter, J. Applied Linear Statistical Models; 5th ed. McGraw-Hill/Irwin: New York, NY, USA, 2004.

29. He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng.; 2009; 21, pp. 1263-1284.

30. Fernández, A.; García, S.; Galar, M.; Prati, R.C.; Krawczyk, B.; Herrera, F. Learning from Imbalanced Data Sets; Springer: Cham, Switzerland, 2018.

31. Batista, G.E.; Prati, R.C.; Monard, M.C. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor. Newsl.; 2004; 6, pp. 20-29. [DOI: https://dx.doi.org/10.1145/1007730.1007735]

32. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res.; 2002; 16, pp. 321-357. [DOI: https://dx.doi.org/10.1613/jair.953]

33. Douzas, G.; Bacao, F.; Last, F. Improving imbalanced learning through a heuristic oversampling method based on k-means and SMOTE. Inf. Sci.; 2018; 465, pp. 1-20. [DOI: https://dx.doi.org/10.1016/j.ins.2018.06.056]

34. Han, H.; Wang, W.-Y.; Mao, B.-H. Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. International Conference on Intelligent Computing; Springer: Berlin/Heidelberg, Germany, 2005.

35. Lemaître, G.; Nogueira, F.; Aridas, C.K. Imbalanced-learn: A python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res.; 2017; 18, pp. 1-5.

36. Mani, I.; Zhang, I. kNN approach to unbalanced data distributions: A case study involving information extraction. Proceedings of the Workshop on Learning from Imbalanced Datasets; ICML: Washington, DC, USA, 2003.

37. Tomek, I. Two modifications of CNN. IEEE Trans. Syst. Man Cybern.; 1976; SMC-6, pp. 769-772.

38. Wilson, D.L. Asymptotic properties of nearest neighbor rules using edited data. IEEE Trans. Syst. Man Cybern.; 1972; SMC-2, pp. 408-421. [DOI: https://dx.doi.org/10.1109/TSMC.1972.4309137]

39. Santhi, C.; Arnold, J.G.; Williams, J.R.; Dugas, W.A.; Srinivasan, R.; Hauck, L.M. Validation of the SWAT model on a large river basin with point and nonpoint sources. J. Am. Water Resour. Assoc.; 2001; 37, pp. 1169-1188. [DOI: https://dx.doi.org/10.1111/j.1752-1688.2001.tb03630.x]

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).