Full Text

Turn on search term navigation

1 Introduction

Warm-sector heavy rainfall often occurs in south China during the pre-flood season, primarily influenced by the East Asian summer monsoon . These rainfall events are characterized by intense and localized precipitation over a limited area. Despite their small scale, such unexpected and extreme warm-sector rainfall can cause significant damage, including flooding homes and vehicles, destroying crop fields, and endangering lives, leading to economic losses ranging from millions to even billions of dollars . Accurately predicting warm-sector heavy rainfall with numerical weather prediction (NWP) models is challenging due to the complex interaction of various factors, such as the low-level jet (LLJ), land–sea contrast, topography, and urban landscape . The complex terrain and heterogeneous land surface of the south China region are crucial in promoting active convection. Previous studies have demonstrated that a higher spatial resolution improves the performance of convective rainfall forecasts by more accurately resolving topographic features. Acknowledging the importance of resolution in forecasting severe convective weather, both the Chinese government and the community increasingly support the development of high-resolution operational forecast models specifically designed for warm-sector rainstorms and sudden local rainstorms. In early 2017, the China Meteorological Administration (CMA) launched an initiative to develop a comprehensive framework for evaluating the forecast performance of all available models, including high-resolution regional models, and advancing key technologies for forecasting high-impact weather.

The increased computational resources have facilitated a shift towards the implementation of regional NWP models with increasingly finer grid spacings, typically within the range of 1 to 10 $km$ . However, when the model grid spacing approaches the scale of convection, entering the so-called “gray zone” , cumulus convection transitions from being completely unresolved to partially resolved. Theoretically, the accurate representation of the smallest turbulent scales, achievable only through direct numerical simulation (DNS) at resolutions from millimeters to centimeters , still requires the use of the parameterization of turbulence or convection in weather modeling. There is ongoing debate regarding the efficacy of employing convection parameterization (CP) within the gray zone. Several studies have found that reducing horizontal grid spacing to below 4 $km$ while using the CP scheme does not enhance and may even degrade precipitation forecast performance. In contrast, other studies showed that forecasts with a horizontal grid spacing of 1 $km$ , where convection is explicitly resolved, yielded more accurate spatial representation of accumulated rainfall over 48 $h$ compared to forecasts using 12 and 4 $km$ grid spacings. This discrepancy in research findings, with some indicating no benefit from finer grid spacing and others suggesting improved forecast accuracy, seems to stem from the application of the CP at scales beyond its originally intended operational range. Therefore, it remains unclear if utilizing any CP schemes in the gray zone is effective for predicting localized warm-sector heavy rainfall.

To enhance prediction accuracy in the gray zone, researchers have developed scale-aware CP schemes. These schemes dynamically parameterize convective processes based on the horizontal grid spacing, thus facilitating seamless transitions between different spatial scales. A pivotal study by demonstrated that two specific scale-aware CP schemes, Grell–Freitas and multi-scale Kain–Fritsch (MSKF) , surpassed conventional CP schemes in predicting both the timing and intensity of precipitation over the Southern Great Plains of the United States. Additionally, showed that the MSKF scheme outperformed other CP schemes, including the Grell 3D ensemble and the new simplified Arakawa–Schubert , in precipitation simulation. This was evidenced by its lower root mean squared error (RMSE) values when compared against in situ observations and satellite data. Despite the increasing adoption of these scale-aware schemes due to their superior performance, it is crucial to acknowledge that their efficacy also relies on various empirical parameters . Therefore, developing specialized CP schemes for the gray zone in NWP models continues to be a significant challenge.

In recent years, an increasing number of studies have investigated the use of machine learning (ML) models as alternatives to conventional physics-based CP schemes. These ML-based schemes have demonstrated the potential for efficacy across various horizontal resolutions, benefiting from being trained on data from simulations that operate at varying grid resolutions . Unlike conventional CP schemes, which often rely on assumptions such as convective quasi-equilibrium , ML-based parameterization schemes do not require such assumptions. Notably, random forests (RFs) and fully connected (FC) neural networks (NNs) have become predominant ML models for CP schemes in previous studies. RFs offer the advantages of inherently enforcing physical constraints, such as energy conservation and non-negative surface precipitation, essential for maintaining stable simulations. demonstrated RFs' capability to emulate moist convection in an aquaplanet general circulation model (GCM), maintaining stability and effectively reproducing key climate statistics. Furthermore, employed the coarse-grained output from a high-resolution three-dimensional (3D) GCM model, simulated on an idealized equatorial beta plane, to train the RF parameterization. They showed that the RF parameterization is capable of reproducing the climate of the high-resolution simulation at coarser resolutions. However, FC NNs offer several advantages over RFs, such as the potential for higher accuracy and lower memory requirements. introduced a stochastic CP scheme using an ensemble of three-layer NNs, trained with data generated by a cloud-resolving model (CRM) during the TOGA COARE

TOGA COARE is an acronym for Tropical Ocean Global Atmospheres/Coupled Ocean Atmosphere Response Experiment. It is an international research program that investigates the interaction or coupling of the ocean and atmosphere in the western Pacific warm pool region from November 1992 to February 1993, encompassing 120 $d$ of field experiments involving the deployment of oceanographic ships, moorings, drifters, and Doppler radars (ship, land, and air).

experiment, demonstrating its capacity for generating reasonable decadal climate simulations across a broader tropical Pacific region when incorporated into the National Center of Atmospheric Research (NCAR) Community Atmospheric Model (CAM). Similarly, leveraged deep NN (DNN) trained on data from idealized and aquaplanet simulations performed using the Super-Parameterized Community Atmosphere Model (SPCAM). The DNN predicts temperature and moisture tendencies due to convection and clouds, as well as the cloud liquid and ice water contents. Additionally, successfully implemented an NN-based parameterization in a global GCM on an aquaplanet, conducting stable prognostic simulations over multiple years that accurately reproduced the climatology of SPCAM and capturing crucial aspects of variability, including extreme precipitation and realistic tropical waves. However, also found that minor changes to the configuration rapidly led to simulation instabilities, underscoring the need to address the robustness of NN parameterizations in GCMs. developed a FC NN that predicts subgrid fluxes instead of tendencies, incorporating physical constraints from coarse-grained high-resolution atmospheric simulation in an idealized domain. proposed a novel loss function designed to minimize accumulated prediction error over multiple time steps to enhance long-term stability and accuracy, by excluding upper atmospheric humidity and temperature from the input. Nonetheless, the approach of removing certain variables from the input is relatively rudimentary, demanding additional research to enhance the stability of NN-based parameterizations when integrated into the model.

Previous studies have predominantly used FC NNs to emulate convection, while more advanced NN structures have the potential to achieve higher accuracy. In a pioneering study, explored the use of a deep residual convolutional NN (ResNet) for the emulation of convection and cloud parameterization in the SPCAM model using a realistic configuration. They compared the performance of ResNet with various NN architectures, including a FC DNN, a DNN with skip connections, and a convolutional NN (CNN) without skip connections. The results revealed that ResNet and CNNs without skip connections outperformed FC NNs and DNNs with skip connections in accuracy, with ResNet and CNNs without skip connections showing comparable performance. This finding highlights the significant role of convolutions in enhancing accuracy. Furthermore, evaluated multiple ML model structures for simulating atmospheric radiative transfer processes, encompassing FC NNs, CNNs, bidirectional recurrent-based NNs (RNNs), transformer-based NNs , and Fourier neural operators (FNOs; ). Their results indicated that models capable of perceiving the global context of the entire atmospheric column significantly outperformed FC NNs and CNNs. Particularly, the bidirectional long short-term memory (Bi-LSTM) achieved the highest levels of accuracy. Similar to radiative transfer modeling, also emphasized the importance of ML having a global perspective of the entire atmospheric column for ML models in convection modeling. They demonstrated that increasing the depths of CNNs from 4 to 22 layers significantly improved model accuracy, a benefit primarily attributed to the expansion of the receptive field in deeper CNN layers. Therefore, ML models that integrate both global and local perception capabilities are better suited for developing ML-based CP schemes.

Previous research has mostly focused on replacing CP schemes in GCM models with ML models for climate forecasting. The complexity of CP schemes in weather forecasting models surpasses that in GCMs . Generally, CP schemes in GCMs, whether in explicit or implicit form, assume that both the horizontal grid size and the temporal intervals for physics implementation are significantly larger and longer compared to the grid size and duration of individual moist-convective elements. In contrast, CP schemes in high-resolution models must account for dependencies on both the model's resolution and the time interval for implementing the physics . The ultimate goal is to develop ML models, based on data from super-parameterization or cloud-resolving models, to replace conventional CP schemes in weather forecasting models. This replacement seeks to reduce uncertainties and improve the efficacy of ML parameterizations. This study represents an initial effort to employ a ML model as an alternative to conventional CP schemes in weather forecasting models. For our dataset, we used the Weather and Research Forecasting (WRF) model that covers the south China region, incorporating the scale-aware MSKF scheme employed as the CP scheme. The MSKF scheme, an improved version of the Kain–Fritsch (KF) scheme , aims to mitigate the overestimation of precipitation and address the premature convection trigger issue, particularly evident in overestimating precipitation during summer. To address these issues, the MSKF incorporates a scale-dependent capability, such as modifying the formulation of the convective adjustment timescale. This vital parameter, which determines the intensity and duration of convection, has been made dynamic and dependent on grid resolution . Furthermore, we utilize a Bi-LSTM model to emulate the convective processes and couple it with the WRF model through the WRF–ML coupler developed by . The performance of the ML-based CP scheme is evaluated in both offline and online settings.

The paper is structured as follows. Section provides a description of the WRF model for data generation, as well as the input and output data of the ML model. In Sect. , the original and the ML-based MSKF schemes are introduced. The results for both offline and online testing of the ML-based MSKF scheme are presented in Sect. . Finally, Sect. presents the summary and conclusion.

2 Data

2.1 Data generation

The dataset was generated by running the WRF model version 4.3 . The following subsections provide a comprehensive explanation of the WRF model configurations, as well as the input and output variables employed in the development of the ML-based CP scheme.

The WRF model is compiled using the GNU Fortran (gfortran version 7.5.0) compiler with the dmpar option. The WRF model is run using the domain configuration illustrated in Fig. . The WRF model is configured with a single domain consisting of 44 000 grid points, with a horizontal grid spacing of 5 $km$ and dimensions of 220 $\times$ 200 grid points in the west–east and north–south directions. The model consists of 45 vertical levels (i.e., 44 vertical layers), with a model top at 50 $hPa$ . Additionally, the WRF model is configured with physics schemes, including a WSM 6-class graupel scheme for microphysics, a Bougeault–Lacarrère (BouLac) scheme for planetary boundary layer (PBL) mixing, the Monin–Obukhov (Janjic) surface layer scheme , the Unified Noah model for land surface, RRTMG for both shortwave and longwave radiation , and MSKF for cumulus. The time step used for all WRF simulations is set to 15 $s$ .

Figure 1

Digital evaluation data of the single WRF domain with a horizontal resolution of 5°. The red lines are the province borderlines, and the black lines are the city borderlines.

[Figure omitted. See PDF]

The initial and boundary conditions for this work were derived from the ERA5 reanalysis dataset, which was provided by the European Centre for Medium-range Weather Forecast (ECMWF) . The ERA5 reanalysis dataset used in this study has a horizontal resolution of 0.25° and consists of 29 pressure levels below 50 $hPa$ . To create a dataset for developing the ML model, the WRF simulations were initialized at 12:00 UTC and conducted nine times every 2 $d$ , specifically from 20 May 2022 to 5 June 2022. Throughout the simulations, the MSKF scheme was called every 5 model minutes, generating outputs at each call. The simulations ran for 36 $h$ each time, with the first 24 $h$ used for training and the last 12 $h$ for validation. Therefore, the total number of training samples is 114 444 000 (114 444 000 $=$ $44 000 \times 9 \times (24 \times 60 / 5 + 1)$ ), while the offline validation set contains 57 024 000 (57 024 000 $=$ $44 000 \times 9 \times 12 \times 60 / 5$ ) samples.

Furthermore, given the possible discrepancy between offline performance, we conducted experiments that coupled the ML-based MSKF scheme with the WRF model. This coupling aims at evaluating the online efficacy of the ML-based MSKF scheme by comparing it with the original WRF simulations. These simulations were performed four times every 2 $d$ , with each simulation extending over a period of 168 $h$ (7 $d$ ). The initialization days spanned from 12 to 18 June 2022.

Table 1

Definition of all the input and output variables, whether they are surface or 3D variables, and their corresponding units. There are 44 model layers.

Type	Variable name	Definition	Type	Unit
Input	$u$	meridional wind component	3D	$m s^{- 1}$
	$v$	zonal wind component	3D	$m s^{- 1}$
	$w$	vertical wind component	3D	$m s^{- 1}$
	$t$	temperature	3D	$K$
	qv	water vapor mixing ratio	3D	$kg {kg}^{- 1}$
	$p$	pressure	3D	$Pa$
	th	potential temperature	3D	$K$
	dz8w	layer thickness	3D	$m$
	rho	air density	3D	$kg m^{- 3}$
	pi	Exner function, which is dimensionless pressure and can be defined as $(\frac{p}{p_{0}})^{R_{d} / c_{p}}$
	hfx	upward heat flux at surface	surface	$W m^{- 2}$
	ust	$u^{*}$ in similarity theory	surface	$W m^{- 2}$
	pblh	planetary boundary layer height	surface	$m$
Derived input	$p_{diff}$	pressure difference between adjacent levels	3D	$Pa$
	${qv}_{sat}$	saturated water vapor mixing ratio	3D	$kg {kg}^{- 1}$
	rh	relative humidity	3D	–
	trigger	boolean trigger indicating convection triggering	surface	–
Input and output	rthcuten	potential temperature tendency due to cumulus parameterization	3D	$K s^{- 1}$
	rqvcuten	water vapor mixing ratio tendency due to cumulus parameterization	3D	$kg {kg}^{- 1} s^{- 1}$
	rqccuten	cloud water mixing ratio tendency due to cumulus parameterization	3D	$kg {kg}^{- 1} s^{- 1}$
	rqrcuten	rainwater mixing ratio tendency due to cumulus parameterization	3D	$kg {kg}^{- 1} s^{- 1}$
	rqicuten	cloud ice mixing ratio tendency due to cumulus parameterization	3D	$kg {kg}^{- 1} s^{- 1}$
	rqscuten	snow mixing ratio tendency due to cumulus parameterization	3D	$kg {kg}^{- 1} s^{- 1}$
	w0avg	average vertical velocity	3D	$m s^{- 1}$
	nca	counter of the cloud relaxation time	3D	$s$
	pratec	precipitation rate due to cumulus parameterization	surface	$mm s^{- 1}$
Output	raincv	precipitation due to cumulus parameterization	surface	$mm$

2.2 Input and output data

Table presents a comprehensive list of the input and output variables used in this study, consistent with those utilized in the original MSKF scheme. There are 17 variables exclusively used as input, while 9 variables serve as both input and output. Specifically, the output variable raincv, representing the time step precipitation due to convection, is calculated through multiplying the precipitation rate by the model's time step. Among all the variables, five are two-dimensional (2D) surface variables, while the remaining ones are 3D variables characterized by 44-layer vertical profiles. Moreover, the ML model used in this study incorporates four derived variables as input. These variables consist of a 2D Boolean variable indicating convection triggering based on nca values, the pressure difference across adjacent vertical levels, the saturated water vapor mixing ratio, and the relative humidity. Furthermore, the output w0avg, which depends on the vertical wind component ( $w$ ) and input w0avg, is also included as an input to the model. In total, the ML model utilizes 27 input variables.

The variable nca represents the cloud relaxation time and must be an integer multiple of the model time step. For all WRF model simulations conducted in this study, a fixed time step of 15 $s$ is used. Thus, nca is expected to be exactly divisible by 15. To eliminate dependence on the specific model time step, nca is divided by the model time step before normalization is applied during model training. Moreover, within the MSKF scheme, nca plays a crucial role in determining the triggering of convection. Convection is triggered when nca is equal to or exceeds half of the model time step.

To ensure consistency with the dimensions of the 3D variables, the surface variables are padded by duplicating the values of the surface layer for all layers before feeding them into the model. Prior to utilizing the variables in the Bi-LSTM model for training and validation, normalization is applied to ensure uniformity in the magnitudes of all the variables. Each variable is divided by the maximum absolute value in the atmospheric column (for 3D variables) or at the surface (for surface variables).

3 Method

This section describes the flowchart of the original MSKF scheme for determining the convection trigger, ML model structures and training, and evaluation methods.

Figure 2

A flowchart outlining the convection trigger process in the original MSKF scheme.

[Figure omitted. See PDF]

3.1 Description of the original MSKF module

The MSKF scheme is a scale-aware adaptation of the KF CP scheme, initially developed by and further refined by . Figure illustrates the convection trigger process within the MSKF scheme. At the beginning of each simulation step, the scheme evaluates the variable nca to ascertain whether it equals or surpasses a threshold, defined as half of the model's time step (dt). Should nca equal or exceed half of dt, there would be no need to update convective tendencies or precipitation rates due to ongoing convection. In contrast, an nca value below this threshold triggers the MSKF scheme to employ a one-dimensional cloud model. This model calculates a set of variables related to cloud characteristics to evaluate the potential of convection triggering. Essential variables include the lifting condensation level (LCL), convective available potential energy (CAPE), cloud top and base heights, and entrainment rates. The LCL is crucial for determining the emergence of potential convective activities, with a lower LCL favoring more intense convection. CAPE quantifies the buoyant energy available to an air parcel for the formation of deep convective clouds upon reaching its level of free convection (LFC) above the LCL, with higher CAPE values signifying greater potential for intense convection. The cloud base is generally at the LCL, whereas the cloud top is defined at the altitude where buoyancy becomes negligible. Meanwhile, the vertical extent between the cloud base and top affect the cloud's growth and precipitation potential. The MSKF scheme requires surpassing a specific CAPE threshold to trigger convection. Furthermore, it assesses entrainment rates to measure the impact of ambient air on the evolution of the convective system. At grid points where convection is triggered, the MSKF scheme calculates both convective tendencies and precipitation rates; otherwise these values are set to zero. However, the variable w0avg is consistently updated, regardless of convection status. Active convection leads to a decrement in nca by one model time step for each iteration within the WRF model cycle.

Figure 3

The architecture of the multi-output Bi-LSTM model for combined classification and regression predictions.

[Figure omitted. See PDF]

3.2 Description of the ML-based MSKF scheme

In the original MSKF scheme, atmospheric columns are processed sequentially, one at a time, until all horizontal grid points within the domain have been processed. In contrast, the ML-based MSKF scheme processes data in batches, as indicated by B in Fig. , consisting of 27 features across 44 vertical layers. As a result, the input data have a dimension of B $\times$ 27 $\times$ 44. Before being fed into the ML model, the input data undergo pre-processing through a module incorporating a one-dimensional (1D) convolutional layer. This module expands the feature dimension from 27 to 64. The following sections provide a comprehensive description of the structures of the ML model.

3.2.1 ML model structure

Predicting whether convection is triggered as well as modeling convective tendencies and precipitation rates are two primary objectives of conventional CP schemes. Previous studies have applied ML models to address these objectives, with some dedicated solely to the classification task of the convection trigger , while others have independently pursued the regression of convective tendencies . However, regression-based models alone may result in inconsistent convective tendencies, leading to conflicting signals for triggering convection at specific grid points (see Figs. and in Appendix ). In contrast, models that focus exclusively on classification lack the capability to generate essential tendencies for an effective CP scheme. Therefore, the development of a ML-based CP scheme necessitates the integration of both a binary classification model for the prediction of the convection trigger and a regression model for convective tendencies. To address this, we propose a multi-output Bi-LSTM model capable of concurrently conducting regression and classification predictions (Fig. ). Our proposed model consists of a shared Bi-LSTM layer for learning features, a classification subnetwork, and a regression subnetwork. The shared Bi-LSTM layer includes three repeated Bi-LSTM blocks, with each block containing a forward and a backward layer that have a feature dimension of 32. The classification subnetwork is composed of a 1 $\times$ 1 1D convolutional layer, a FC layer, and a sigmoid activation layer. The output of the sigmoid layer represents the probability distribution of the convection trigger. The binary cross-entropy loss function is employed as the cost function for this classification task. Meanwhile, the regression subnetwork incorporates a FC layer to output precipitation rate, nca, and convective tendencies. Finally, outputs from both subnetworks are processed through a post-processing module to ensure their physical consistency (see Figs. and in Appendix ), with further details provided in the subsequent subsection.

3.2.2 Post-processing module

The post-processing module is designed to ensure physical consistency of all variables. To achieve this, the following rules are applied: (1) at grid points where the input nca is equal to or greater than half the value of dt, all other variables remain unchanged as they are still within the convection lifetime. (2) The output nca must be an integer. (3) At grid points where convection is predicted to be inactive, all corresponding output variables are set to zero by default. In addition, the calculation of time step convective precipitation (raincv) follows the methodology outlined in a previous section, Sect. .

3.2.3 Model training

As our model incorporates both classification and regression tasks, we optimize its performance by minimizing a multi-task loss function . The loss function is defined as the sum of the binary cross-entropy loss for the convection trigger and a weighted combination of the $L 1$ loss for all output variables from the regression subnetwork. The specific formulation of the loss function is as follows:

1 $L = \frac{1}{N_{cls}} \sum_{i, j} L_{cls} (p_{i, j}, p_{i, j}^{gt}) + \sum_{c} λ_{c} \frac{1}{N_{reg}} \sum_{i, j} p_{i, j}^{gt} L 1_{c} .$ Here, $i$ and $j$ denote the grid points in the domain and $p_{i, j}$ represents the probability of convection being triggered. The ground-truth label $p_{i, j}^{gt}$ takes a value of 1 if convection is triggered and 0 otherwise. The classification loss, $L_{cls}$ , is calculated using the binary cross-entropy loss. For the regression loss of different values of variable $c$ , $λ_{c}$ functions as a weight that balances the output variables by considering their respective magnitudes. The term $p_{i, j}^{gt} L 1_{c}$ indicates that the $L 1$ regression loss is activated only for triggered grid points ( $p_{i, j}^{gt} = 1$ ) and is disabled otherwise ( $p_{i, j}^{gt} = 0$ ). Both loss terms are normalized by $N_{cls}$ and $N_{reg}$ , which correspond to the total number of grid points and the number of triggered grid points, respectively.

The Adam optimizer is used with an initial learning rate of 0.003 to update the parameters of the model. Furthermore, the plateau scheduler is implemented to decrease the learning rate by a factor of 0.5 when the loss fails to decrease for 5 epochs. The model is trained for 150 epochs using a batch size of 44 000.

3.3 Evaluation methods

The ML-based MSKF scheme is evaluated in both offline and online settings. The offline performance of the ML-based MSKF scheme is evaluated by comparing it against the outputs of the original MSKF scheme using the validation dataset, including rthcuten, rqvcuten, rqccuten, rqrcuten, nca, and pratec. The overall model performance metrics include the RMSE and correlation coefficient. The mean absolute error (MAE) and mean bias error (MBE) per vertical layer are calculated using the equations below: $\begin{matrix} 2 & {MAE}_{l} = \frac{1}{N} \sum_{i = 1}^{N} | Y_{ML} (i, l) - Y (i, l) |, \\ 3 & {MBE}_{l} = \frac{1}{N} \sum_{i = 1}^{N} Y_{ML} (i, l) - Y (i, l), \end{matrix}$ where $Y (i, l)$ and $Y_{ML} (i, l)$ represent the outputs from the original MSKF scheme and the ML-based MSKF scheme, respectively. Here, $i$ denotes the horizontal grid point of a vertical profile, $N$ is the number of the horizontal grid points in the domain, and $l$ represents the vertical layer index.

4 Results

4.1 Offline validation of the ML-based MSKF scheme

The offline validation was conducted using data that were not used during the training process. Figure compares the cloud relaxation time (nca), precipitation rate (pratec), and convective tendencies predicted by the original MSKF scheme and the ML-based MSKF scheme, respectively. To facilitate the comparison, the units of precipitation rate and temperature tendencies were converted to $mm d^{- 1}$ and $K d^{- 1}$ from $mm s^{- 1}$ and $K s^{- 1}$ , respectively, by applying a conversion factor of 86 400 (24 $\times$ 3600). Similarly, the water vapor mixing ratio (rqvcuten), cloud water mixing ratio (rqccuten), and rainwater mixing ratio (rqrcuten), due to convection, were multiplied by 86 400 000 (24 $\times$ 3600 $\times$ 1000) to convert the output variables listed in Table from $kg {kg}^{- 1} s^{- 1}$ to $g {kg}^{- 1} d^{- 1}$ ; the variable w0avg is excluded as it is calculated using an equation with the ground truth as input in this offline validation. Hence, evaluating w0avg in the offline evaluation is unnecessary.

Figure 4

Comparison of the predicted ( $y$ axis) and true ( $x$ axis) nca (a), pratec (b), rthcuten (c), rqvcuten (d), rqccuten (e), and rqrcuten (f) when using validation data in the offline setting. Colors indicate the proportion of samples across the entire testing dataset, with values on the color bar normalized through the application of a base 10 logarithm.

[Figure omitted. See PDF]

Figure 5

Vertical profiles of the statistics in rthcuten (a), rqvcuten (b), rqccuten (c), and rqrcuten (d) using validation data in the offline setting data using ML-based emulators. The solid and dashed lines show the MAE and MBE profiles, respectively, and the shaded area indicates the 5th and 95th percentile of differences (prediction–target) at each layer.

[Figure omitted. See PDF]

Among all variables illustrated in Fig. , the variable nca exhibits a significantly higher RMSE of 4.32, with data points widely dispersed across a wide range of values. This suggests that accurately predicting convection poses a considerable challenge. To eliminate the dependency on time steps, nca is divided by the model's time step of 15 $s$ before proceeding with plotting and statistical evaluations. The precipitation rate demonstrates the highest correlation coefficient and minimal variability, as most data points cluster closely around the 1 : 1 line. While temperature and the four moisture tendencies exhibit some degree of variability, the majority of data points align closely with the 1 : 1 line. The correlation coefficient of convection trigger is 0.91 (not shown in Fig. ). Overall, the ML-based MSKF scheme shows a strong correlation with the original MSKF scheme for all examined variables, with correlation coefficients consistently higher than 0.91. This indicates that the ML-based MSKF scheme has the potential to replace the original scheme.

To obtain a comprehensive understanding of the vertical distribution of errors, Fig. presents the vertical profiles of error statistics associated with convective tendencies. The solid and dashed lines in the figure represent the MAE and MBE of tendencies at each vertical layer, respectively. Additionally, the shaded area corresponds to the 5th and 95th percentiles of differences between tendencies predicted by the ML-based MSKF-predicted scheme and the original MSKF scheme, respectively. The distribution of vertical errors in all tendencies exhibits a notable uniformity, with higher variance observed within the pressure layers between 800 and 1000 $hPa$ . These pressure layers correspond to the atmospheric layer where convection occurs most frequently. Due to the significantly lower cloud and rain content compared to water vapor in the atmosphere, the error magnitudes for rqccuten and rqrcuten are considerably lower than those observed for rqvcuten.

4.2 Prognostic validation

This subsection presents the performance of the ML-based MSKF scheme in the online setting.

The ML-based MSKF scheme was integrated into the WRF model as a substitute for the original MSKF scheme to simulate convective processes. Utilizing the WRF–ML coupler , this novel ML-based MSKF scheme was seamlessly incorporated into the WRF framework. A comparative analysis was conducted by initializing both the modified WRF model, which incorporates the ML-based scheme, and the original WRF model on 12, 14, 16, and 18 June 2022 for simulations extending over 168 $h$ . It is worth mentioning that these simulations were performed independently of the training dataset, ensuring the evaluation of the scheme's generalization capability.

Figure 6

Spatial map of the average WRF simulations using the original MSKF scheme (in the first, third, and fifth rows) along with the average MAD between WRF simulations coupled with the ML-based MSKF scheme and WRF simulation with the original MSKF scheme (in the second, fourth, and sixth rows). The simulations are shown for the 12 $h$ accumulated convective precipitation (RAINC) in the first and second rows, the 12 $h$ accumulated non-convective precipitation (RAINNC) in the third and fourth rows, and the 2 $m$ temperature (T2M) at forecast lead times of 24 $h$ (first column), 72 $h$ (second column), 120 $h$ (third column), and 168 $h$ (fourth column).

[Figure omitted. See PDF]

Figure presents the averaged spatial forecasts for predictions generated by the original WRF model. These forecast results include the accumulations of both convective precipitation (RAINC) and non-convective precipitation (RAINNC) over a 12 $h$ period, along with the 2 $m$ temperature (T2M) at 24, 72, 120, and 168 $h$ . The figure also demonstrates the mean absolute difference (MAD) between WRF simulations coupled with the ML-based MSKF scheme and those utilizing the original MSKF scheme. Within the spatial forecasts, red and blue patterns signify the magnitudes of the forecasted values, whereas in the spatial differences, these colors denote the positive and negative biases in the ML-based simulations, respectively. Green patterns suggest minimal deviation from the original WRF simulations. Furthermore, we calculate a domain-averaged MAD to evaluate the overall performance of the ML-based scheme in prognostic simulations. Generally, the differences are small, indicating good agreement between WRF simulations coupled with the ML-based MSKF scheme and the original WRF simulations. Notably, the differences do not increase with the progression of simulation time, as evidenced by a comparable domain-averaged MAD at 168 forecast hours compared to that at 24 forecast hours. These findings suggest that the ML-based MSKF scheme achieves stable prognostic simulations.

Figure provides a comparative analysis of domain-averaged time series forecasts from both the original WRF simulations and WRF simulations coupled with the ML-based MSKF scheme. This comparison includes 6 $h$ accumulations of RAINC and RAINNC, as well as T2M forecasts. The results demonstrate that WRF simulations coupled with the ML-based MSKF schemes are in close alignment with the original WRF simulations, particularly when capturing the diurnal variations in RAINC, RAINNC, and T2M. Notably, the T2M forecasts demonstrate remarkable consistency, underscoring the efficacy of the ML-based MSKF scheme in maintaining the predictive accuracy of the original scheme.

Figure 7

Comparison of domain-averaged forecasts derived from the original WRF simulations (black lines) and WRF simulations coupled with the ML-based MSKF scheme (light-green lines) of 6 $h$ accumulated RAINC (a) and RAINNC (b), along with T2M (c).

[Figure omitted. See PDF]

5 Conclusions

In this paper, we proposed a multi-output Bi-LSTM model to develop a ML-based MSKF scheme for predicting the convection trigger and reproducing the convective process in the gray zone. The model is trained on data generated by the WRF simulations at a spatial resolution of 5 $km$ , covering the south China region. The output variables of the ML-based MSKF scheme are identical to those of the original MSKF scheme, encompassing the cloud relaxation time (nca), precipitation rate (pratec), time step convective precipitation (raincv), and convective tendencies. This ML-based scheme ensures physical consistency among all output variables by incorporating a post-processing module to refine the output from the Bi-LSTM model. Offline validation demonstrates the excellent performance of the ML-based MSKF scheme. Furthermore, the ML-based MSKF scheme is coupled with the WRF model using the WRF–ML coupler. The WRF simulations coupled with the ML-based MSKF scheme are compared against the WRF simulation with the original MSKF scheme. Results shows that the ML-based scheme can generate forecasts similar to the original ML scheme in online settings, showing the potential substitution of the MSKF scheme by ML models in the gray zone.

This study demonstrates the feasibility of employing ML models as substitutes for the conventional CP scheme within the high-resolution weather forecasting model. Future efforts will focus on the development of ML models based on data generated by super-parameterization or cloud-resolving models to replace conventional CP schemes in weather forecasting models (see Appendix B). The objective of this substitution is to reduce uncertainties and improve the performance of weather forecast models.

Appendix A Comparison against classification only and regression only models

Two separate Bi-LSTM models were trained with slight modifications to the multi-output Bi-LSTM model illustrated in Fig. . The first model aimed at predicting convection triggers alone, termed the Bi-LSTM trigger, while the second model aimed at predicting convective tendencies, termed the Bi-LSTM tendency. In predicting the convection trigger, both the Bi-LSTM trigger model and the multi-output Bi-LSTM model demonstrated comparable accuracy, as observed in Figs. and . However, while the convection triggers predicted by the Bi-LSTM trigger model were indistinguishable from those of the multi-output Bi-LSTM model, the former failed to accurately predict the corresponding convective tendencies. Consequently, it cannot serve as a replacement for convection schemes within NWP models.

Figures and present snapshots of rthcuten and rqvcuten predicted by the Bi-LSTM tendency model. These figures reveal that the Bi-LSTM tendency model predicts non-zero values across nearly the entire domain. Since the Bi-LSTM tendency model exclusively focuses on predicting convective tendencies, convection triggers are derived using certain threshold values. The spatial distribution of these triggers is notably influenced by the choice of threshold values, and the patterns of convection triggers derived from rthcuten and rqvcuten exhibit considerable discrepancies. This confirms that models based solely on regression yield inconsistent tendencies. In contrast, the multi-output Bi-LSTM model does not encounter the aforementioned issues of the Bi-LSTM tendency model and generates a more consistent spatial pattern of rthcuten and rqvcuten (see Figs. and ).

Figure A1

Snapshot example of convection trigger, with panel (a) showing the ground truth (GT) and panel (b) showing the difference between convection trigger as predicted by the Bi-LSTM trigger model and ground truth values for the 25 $h$ WRF simulation initialized at 12:00 UTC on 20 May 2021.

[Figure omitted. See PDF]

Figure A2

Snapshot example of convection trigger, with panel (a) showing the ground truth (GT) and panel (b) showing the difference between convection trigger as predicted by the multi-output Bi-LSTM model and ground truth values for the 25 $h$ WRF simulation initialized at 12:00 UTC on 20 May 2021.

[Figure omitted. See PDF]

Figure A3

Snapshot examples of rthcuten summed along the vertical direction, with panel (a) showing the GT values and panel (b) showing the rthcuten predicted by the Bi-LSTM tendency model for the 25 $h$ WRF simulation initialized at 12:00 UTC on 20 May 2021. Similarly, snapshot examples of a trigger, with the GT shown in panel (c) and the predictions from the Bi-LSTM tendency model using varying threshold values of rthcuten shown in panels (d), (e), and (f), respectively.

[Figure omitted. See PDF]

Figure A4

Snapshot examples of rqvcuten summed along the vertical direction, with panel (a) showing the GT values and panel (b) showing the rqvcuten predicted by the Bi-LSTM tendency model for the 25 $h$ WRF simulation initialized at 12:00 UTC on 20 May 2021. Similarly, snapshot examples of a trigger, with the GT shown in panel (c) and the predictions from the Bi-LSTM tendency model using varying threshold values of rqvcuten shown in panels (d), (e), and (f), respectively.

[Figure omitted. See PDF]

Figure A5

Snapshot examples of rthcuten summed along the vertical direction, with panel (a) showing the GT values and panel (b) showing the rthcuten predicted by the multi-output Bi-LSTM model for the 25 $h$ WRF simulation initialized at 12:00 UTC on 20 May 2021. Similarly, snapshot examples of a trigger, with the GT shown in panel (c) and the predictions from the multi-output Bi-LSTM model using a threshold value of 0 shown in panel (d).

[Figure omitted. See PDF]

Figure A6

Snapshot examples of rqvcuten summed along the vertical direction, with panel (a) showing the GT values and panel (b) showing the rqvcuten predicted by the multi-output Bi-LSTM model for the 25 $h$ WRF simulation initialized at 12:00 UTC on 20 May 2021. Similarly, snapshot examples of a trigger, with the GT shown in panel (c) and the predictions from the multi-output Bi-LSTM model using a threshold value of 0 shown in panel (d).

[Figure omitted. See PDF]

Appendix B Significance of using ML-based parameterization to replace super-parameterization and cloud-resolving model

employed large-eddy simulation (LES) to model deep tropical convection over an area of approximately 205 $km$ $\times$ 205 $km$ , focusing particularly on maritime regions. They conducted a benchmark simulation spanning 24 $h$ , with a spatial resolution of 100 $m$ and 256 vertical levels. This benchmark simulation utilized 2048 processors and took approximately 6 $d$ of wall-clock time to complete. Additionally, we attempted a cloud-resolving simulation, covering a 600 $km$ $\times$ 500 $km$ domain with a grid spacing of 500 $m$ (resulting in a grid of 1200 $\times$ 1000 points) and employing 45 vertical levels. The wall-clock time for this simulation was approximately 40 times the forecast time (dt $=$ 2 $s$ ). For a single 36 $h$ simulation, the computational time is around 60 $d$ , which far exceeds our current computational resources. Therefore, implementing machine-learning-based parameterization would offer a significant advantage in reducing computational costs when replacing the super-parameterization scheme or cloud-resolving model.

Code and data availability

The source code for the WRF model version 4.3 used in this study is available at 10.5281/zenodo.11112400 . The source code and data used in this paper are available at https://doi.org/10.5281/zenodo.10032404 .

Author contributions

XZ trained the deep learning models and calculated the statistics of model performance. XY and XZ conducted the WRF simulations to provide the dataset for training and evaluation and offered valuable suggestions on the model training and paper revision. XZ and XY wrote, reviewed, and edited the original draft. XZ, XY, and HL supervised and supported this research and gave important opinions. All of the authors have contributed to and agreed to the published version of the paper.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

This work was supported by the Basic and Applied Basic Research Foundation of Guangdong Province (grant no. 2021A1515012582). We are thankful to the Mesoscale and Microscale Meteorology (MMM) Laboratory at NCAR for developing and sharing the WRF source codes.

Financial support

This research has been supported by the Basic and Applied Basic Research Foundation of Guangdong Province (grant no. 2021A1515012582).

Review statement

This paper was edited by Po-Lun Ma and reviewed by two anonymous referees.

Word count: 6592

Show less

© 2024. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

Warm-sector heavy rainfall along the south China coast poses significant forecasting challenges due to its localized nature and prolonged duration. To improve the prediction of such high-impact weather events, high-resolution numerical weather prediction (NWP) models are increasingly used to more accurately represent topographic effects. However, as these models' grid spacing approaches the scale of convective processes, they enter a “gray zone”, where the models struggle to fully resolve the turbulent eddies within the atmospheric boundary layer, necessitating partial parameterization. The appropriateness of applying convection parameterization (CP) schemes within this gray zone remains controversial. To address this, scale-aware CP schemes have been developed to improve the representation of convective transport. Among these, the multi-scale Kain–Fritsch (MSKF) scheme enhances the traditional Kain–Fritsch (KF) scheme, incorporating modifications that facilitate its effective application at spatial resolutions as high as 2 $km$ . In recent years, there has been an increase in the application of machine learning (ML) models across various domains of atmospheric sciences, including efforts to replace conventional physical parameterizations with ML models. This work introduces a multi-output bidirectional long short-term memory (Bi-LSTM) model intended to replace the scale-aware MSKF CP scheme. This multi-output Bi-LSTM model is capable of simultaneously predicting the convection trigger while also modeling the associated convective tendencies and precipitation rates with a high performance. Data for training and testing the model are generated using the Weather Research and Forecast (WRF) model over south China at a horizontal resolution of 5 $km$ . Furthermore, this work evaluates the performance of the WRF model coupled with the ML-based CP scheme against simulations with the traditional MSKF scheme. The results demonstrate that the Bi-LSTM model can achieve high accuracy, indicating the promising potential of ML models to substitute the MSKF scheme in the gray zone.

Details

Title

Machine learning parameterization of the multi-scale Kain–Fritsch (MSKF) convection scheme and stable simulation coupled in the Weather Research and Forecasting (WRF) model using WRF–ML v1.0

Author

Zhong, Xiaohui¹

; Yu, Xing²; Li, Hao¹

¹ Artificial Intelligence Innovation and Incubation Institute, Fudan University, Shanghai 200433, China
² Center for Special Robots, Institute of Artificial Intelligence and Robotics for Society, Guangdong, 518000 Shenzhen, China

Pages

3667-3685

Publication year

2024

Publication date

2024

Publisher

Copernicus GmbH

ISSN

1991962X

e-ISSN

19919603

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5194/gmd-17-3667-2024

ProQuest document ID

3051307374

Machine learning parameterization of the multi-scale Kain–Fritsch (MSKF) convection scheme and stable simulation coupled in the Weather Research and Forecasting (WRF) model using WRF–ML v1.0

Jump to:

Full Text

Abstract

Details

Suggested sources