Introduction
The ocean surface boundary layer (OSBL) is a thin layer below the ocean surface, typically extending tens to a hundred meters in thickness, and is strongly affected by external forcing such as wind, waves, and net heat fluxes. Ocean currents within the OSBL are highly turbulent, with the scale of these turbulent currents ranging from centimeters to several hundred meters. These turbulent currents have a profound impact on ocean dynamics, both within and beyond the OSBL, playing a significant role in sustaining marine ecosystems and shaping global climates. However, despite advances in oceanography, accurately simulating these turbulent processes remains a formidable challenge, particularly in regional and global ocean models, where directly resolving these dynamics is computationally infeasible in the foreseeable future (Fox-Kemper et al., 2014, 2019).
In realistic ocean and climate models, the turbulent flux of a variable x, that is, , is calculated as:
Turbulent mixing parameterization schemes typically fall into two categories. The first category is the first-order closure scheme, in which parameters directly relate to the forcing conditions and water property profiles. A well-known example is the K-profile parametrization (KPP) scheme. The KPP scheme was initially proposed for turbulence in atmospheric boundary layers (Troen & Mahrt, 1986) and later adapted for the OSBL (Large et al., 1994). Due to its computational efficiency and stability, the KPP scheme is widely used in realistic simulations for regional and global oceans (e.g., Belcher et al., 2012; Q. Li & Fox-Kemper, 2017; McWilliams & Sullivan, 2000; Van Roekel et al., 2018; Vertenstein et al., 2012). Another well-known first-order closure scheme is the energetics-based planetary boundary layer scheme (ePBL, Reichl & Hallberg, 2018). The second category is the second-moment closure (SMC) scheme, where turbulent diffusivity and turbulent viscosity are derived from turbulence statistics (kinetic energy, length scale, and dissipation rate) and empirically calculated in the scheme (Kantha & Clayson, 1994; Umlauf & Burchard, 2003). The SMC scheme, being computationally more expensive than the KPP scheme, is more commonly used in simulations of coastal oceans, where the current environment is more complicated (e.g., Sane et al., 2021; Warner et al., 2005) than in global and regional oceans. Recent studies have revised both the KPP and SMC schemes to include enhanced turbulent mixing effect due to wave-driven Langmuir turbulence, that is, KPPLT and SMCLT. Studies have shown that the use of KPPLT and SMCLT generally improves the simulations of sea surface temperature (SST) and the mixed layer depth (MLD) for global (Q. Li et al., 2016) and regional oceans (Ali et al., 2019). However, a recent study (Q. Li et al., 2019) examining 11 mixing parameterization schemes, including KPP, SMC, KPPLT, and SMCLT, found substantial differences in the solutions provided by these methods, indicating persistent biases across all schemes.
Further refining traditional turbulent mixing parameterizations is challenging. In the upper ocean, turbulent mixing is driven by diverse combinations of wind, wave, and buoyancy conditions. However, traditional deterministic parameterization schemes were developed based on a small subset of the realistic conditions across the global ocean (e.g., Figure 1 in Q. Li et al., 2019). Furthermore, although the scaling law is well-established for a specific turbulent regime, it is not well-defined for common scenarios over the world's oceans, where the three types of turbulence are equally significant. It is challenging to identify optimal functional forms and coefficients to adequately encompass the vast turbulent regimes due to diverse combinations of wind, wave, and buoyancy conditions.
[IMAGE OMITTED. SEE PDF]
In light of these challenges, recent efforts have begun exploring alternative approaches to take advantage of the recent development of machine learning techniques, especially deep neural networks (DNNs), to enhance the representation of the mixing effects of OSBL turbulence. Deep neural networks utilize extensive data as truth to establish non-linear relationships between the inputs and predicted outcomes. Early attempts aimed to replace traditional mixing parameterization by directly predicting turbulent fluxes using DNNs (e.g., Gentine et al., 2018; Liang et al., 2022; Rasp et al., 2018). While these DNNs have shown promising results in predicting turbulent flux profiles, ensuring numerical stability when integrating them with realistic climate models for long-term use poses challenges (e.g., Noah D Brenowitz et al., 2020; Chattopadhyay & Hassanzadeh, 2023; Rasp, 2020).
An alternative approach is to retain the physics-based framework in traditional parameterizations and use DNN to predict parameters that are uncertain in those parameterizations. Sane et al. (2023) trained DNNs to predict profiles of eddy diffusivity in the OSBL under the framework of the ePBL (Reichl & Hallberg, 2018) using simulations based on a SMC scheme as the truth. The authors further coupled the ePBL-DNN model into the Modular Ocean Model (MOM, e.g., Adcroft et al., 2019), and demonstrated its stability for long-term integration. Zhu et al. (2022) trained DNNs to predict mixing coefficients in the interior ocean (below the OSBL) based on in-situ microstructure observations at the equatorial Pacific Ocean. By implementing it into the MOM, they demonstrated that incorporating a DNN into the model reduces cold biases in the equatorial Pacific. The study, however, focused only on the KPP below the OSBL, but did not attempt to improve the KPP for the effects of OSBL turbulence.
However, the DNN models in those two studies are not based on LES solutions and have not targeted the popular KPP model. In this study, we aim to bridge the gap by using high-resolution LES simulations to develop DNN models capable of predicting key turbulent mixing parameters in the widely used KPP model. The models in this study are designed to enhance the realism of OSBL simulations within the framework of the KPP scheme, without altering fundamental equations or time-stepping mechanisms, thus facilitating straightforward integration into existing ocean models. The rest of the paper is organized as follows: Section 2 presents the framework of the DNN-augmented KPP (KPP_DNN) and outlines the data used to train the DNNs. Section 3 provides details on the implementation of the DNN-augmented KPP schemes into the General Ocean Turbulence Model (GOTM). Section 4 describes how the GOTM is configured. Section 5 evaluates the performance of the KPP_DNN and compares the KPP_DNN with traditional parameterizations. Section 6 explores the use of the KPP_DNN to understand OSBL turbulence and deficiency in physics-based parameterizations. Section 7 summarizes the major findings of the study and discusses future research directions.
The K-Profile Parameterization Augmented by Deep Neural Networks (KPP_DNN)
Model Description
In the KPP framework (Large et al., 1994), the expression for viscosity or diffusivity Kx is given by:
In mixing parameterizations, the OSBL depth h is typically diagnosed by identifying the depth at which the bulk Richardson number (Rib), a measure of the relative importance between shear and stable stratification, exceeds a critical value Ric. This criterion is based on linear stability analysis, which shows that stably stratified shear flow is unstable and turbulent mixing quenches when the gradient Ri exceeds a critical value of 0.3, that is, Rib > Ric = 0.3. Below the OSBL depth, Ri is larger than Ric and the flow is stable. Above the depth, Rib is smaller than Ric and the flow is turbulent. In the KPP scheme, the bulk Richardson number Rib(z) is used and the critical bulk Richardson number is set to be 0.3 (Large et al., 1994). Rib(z) is related to ocean current and stratification as,
Recent studies have shown that the effects of non-breaking waves greatly modulate turbulent fluxes in the OSBL, either enhancing or suppressing turbulent fluxes depending on the alignment between wind and waves (McWilliams et al., 2014; Van Roekel et al., 2012). When wind and waves are largely aligned, as is common across the global ocean, turbulence is enhanced by wave-driven Langmuir turbulence. When waves are significantly misaligned with the wind, as occurs when the swell is strong, turbulence is suppressed. Several recent studies (e.g., Q. Li & Fox-Kemper, 2017; Q. Li et al., 2019; McWilliams & Sullivan, 2000; Van Roekel et al., 2012) have been devoted to including wave effects into the KPP framework. In those parameterizations, referred to as KPPLT hereafter, the turbulent velocity scale (wx), and the unresolved shear velocity scale, , are modified as,
In this study, these two coefficients will be determined by Deep feedforward Neural Networks (DNNs), as opposed to traditional deterministic functions in existing studies. The DNN augmented parameterization will be called KPP_DNN hereafter.
A DNN is made up of multiple densely connected layers, including one input layer, one output layer, and multiple hidden layers (Figure 1). Each layer includes multiple neurons. Neurons between layers are connected by the following relationship:
The DNN's input layer consists of water-column variables, including potential temperature profiles (θ), salinity profiles (s), and key OSBL turbulence drivers, including wind stress (τx,τy), shortwave radiation at the ocean surface (Sw), net heat flux excluding short wave radiation (Qf), the rate of evaporation minus precipitation (Qs), vertical profiles of Stokes drift associated with ocean surface waves (ust,vst) and the OSBL depth from the previous time step. The output layer consists of a single neuron in each DNN model, predicting a specific parameter. Specifically, we have two different DNN models based on the output: model Dϵ to predict the turbulent velocity scale coefficient (ϵ), and model Dη to predict the unresolved shear coefficient (η). To prevent the DNN models from predicting unphysical values, ϵ and η were scaled to a range of 0–1, and a sigmoid activation was added to the output layer to ensure the predictions by the DNN models always fall within this range.
The DNN model utilizes a vast array of computations characterized by nonlinear activation functions with distinct weights and biases. Integrating a well-tuned DNN model into a traditional physics-based parameterization scheme not only preserves the computational stability and efficiency of a traditional physics-based model but also enables a more flexible and effective non-linear mapping from input variables to output parameters than what traditional deterministic formulas could achieve.
Compared to other machine learning model architectures, such as Long-Short Term Memory (LSTM, Hochreiter & Schmidhuber, 1997) and Fourier Neural Operator (e.g., Z. Li et al., 2020), the neural network using basic densely connected layers is much more efficient and equally accurate for the purposes of this study.
Data Generation and Curation
The data used to develop and test the KPP_DNN schemes are turbulence-resolving simulations for Ocean Station Papa (OSP) using the NCAR-LES model for the OSBL (e.g., Sullivan & McWilliams, 2010). OSP (50°N, 145°W, see Figure 2a) is located within the Northern Pacific subpolar gyre. With a long history of continuous atmospheric and oceanographic in situ observations (Cronin et al., 2023; Whitney & Tortell, 2006), OSP has been served as a pivotal site for monitoring ocean climate (e.g., Bond et al., 2015; R. E. Thomson & Tabata, 1987), understanding ocean physical and biogeochemical processes, and developing parameterization schemes extensively employed in diverse ocean models (e.g., Chalikov, 2005; Craig & Banner, 1994; Gaspar et al., 1990; Kantha & Clayson, 1994; Large et al., 1994). Figures 2b and 2c present the probability of OSBL turbulence regime at OSP based on the observed forcing conditions. The most common turbulence regime at OSP is a mix of the three types of turbulence. There are periods when Langmuir turbulence dominates, while convection or shear-driven turbulence seldom dominates. Different from the global ocean (compare the blue and black contours), the OSBL at OSP is seldom strongly convective or strongly stabilizing. LES models are currently the state-of-the-art tool to study OSBL and submsoscale turbulence (e.g., Bodner et al., 2020; Skyllingstad & Denbo, 1995; Yuan & Liang, 2021), and to develop parameterizations for those processes (e.g., Bodner et al., 2023; Liang et al., 2013; Liang et al., 2018; Liu et al., 2022).
[IMAGE OMITTED. SEE PDF]
The use of the NCAR-LES model to generate data is similar to that reported in Liang et al. (2017, 2022): The domain of the LES model is configured with 160 uniformly distributed horizontal grid points, spanning 300 m in each horizontal direction, Vertically, the LES model features 128 stretched grid points across a 200 m depth, with the finest grid equal 0.2 m at the ocean surface. The LES model was driven by a combination of observed hourly meteorological (Cronin et al., 2015), wave conditions (J. Thomson et al., 2013) and the derived surface flux products at OSP from September 2010 to December 2022. These inputs include wind stresses, wave conditions, shortwave radiation, net surface heat flux (excluding shortwave radiation), and the rate of evaporation minus precipitation, at OSP from September 2010 to December 2022. Periods when the observational wave data were not available were excluded from LES simulations. The LES simulations were restarted every 10 days, and initial conditions of each restart were derived from observed water column temperature and salinity profiles, linearly interpolated to LES vertical grids. The restart procedure is to ensure that the LES solutions do not deviate from the true state of the ocean, as large- and mesoscale processes that also modulate the physical states of the upper ocean at the station (Cronin et al., 2015) are not resolved by the LES model. Comparisons with observations show that the LES simulations closely align with observed upper-ocean states with this approach (see Figure 3). In total, 367 LES simulations were conducted.
[IMAGE OMITTED. SEE PDF]
The turbulence-resolving LES solution data set differs from that used by Liang et al. (2022) in two ways: Firstly, the simulation period is longer, spanning from 2010 to 2022 in the current study, as opposed to 2010 to 2019 in Liang et al. (2022), thereby offering more data for model training and testing. Secondly, shortwave radiation penetrates the OSBL in the current study while shortwave radiation was applied only at the ocean surface in Liang et al. (2022). The shortwave radiation at depth z, Qsw(z), is calculated as
Ensemble-averaged profiles of temperature, salinity, velocities, turbulent kinetic energy (TKE), and their turbulent fluxes were calculated online and output every 30 min. The depth of the OSBL h was diagnosed as the depth at which the vertical gradient of momentum flux decreases to 2 × 10−7 m/s2. η was then diagnosed using Equations 4 and 6 with a Ric = 0.3. wx and G(σ) in Equation 5 were first calculated using the LES solutions and formulas detailed in Large et al. (1994). ϵ was then obtained by minimizing the difference between the momentum fluxes using Equation 1 and the output momentum flux from LES solutions.
Model Training
The turbulence-resolving LES data were separated into three data sets: the training, the validation and the testing, at a ratio of approximately 6:2:2. The validation data set was used to monitor the neural network training and prevent overfitting, while the testing data set was used to evaluate the model after the training process was completed. This partition followed the random block sampling strategy (Schultz et al., 2021), dividing the data by LES runs, each covering a 10-day period. This strategy efficiently avoids spurious correlations among the data sets and ensures that all three data sets represent different climatological conditions over the entire study period.
The predicted coefficients ϵ in model Dϵ and η in model Dη were compared with ϵ and η diagnosed from LES solutions as detailed in Section 2.2. Mean square errors served as the loss to update trainable parameters in the DNNs. The DNNs were trained using TensorFlow and Keras within the R programming environment. The architecture of these DNNs varied significantly, encompassing a range of different layers (1, 2, 4, 6, 8, and 12) and neurons per layer (2, 4, 8, 16, and 32), to explore the optimal structure for our specific application. The Adam optimizer was employed across all models. Each model was trained for 1,000 epochs. The learning rate was reduced by a factor of 0.1 whenever a plateau in validation loss, quantified as Mean Squared Error (MSE), was detected during the training process. The criterion for selecting the best model was based on the smallest validation loss, a standard measure of model accuracy on unseen data, ensuring that the chosen model has the highest generalization capability.
Implementation of KPP_DNN in the General Ocean Turbulence Model (GOTM)
The GOTM (Burchard et al., 1999) is a single-column model designed to examine the behavior of various turbulent mixing parameterization schemes in the OSBL. It provides a versatile framework, allowing for the straightforward compilation and execution of different OSBL turbulent mixing parameterization schemes, making it the ideal testbed for developing and testing mixing parameterizations. The current GOTM model includes a variety of first-order and second-moment closure schemes, allowing for the comparison of different schemes within the same framework.
Adding to the capability of the GOTM, this study implements the trained DNNs, their structure and trainable parameters, into the model. The GOTM, like most earth system models, is coded exclusively in Fortran, while DNN models are typically written in high-level programming languages like Python and R, utilizing deep learning libraries such as Keras (Gulli & Pal, 2017; Ketkar & Ketkar, 2017). There are two approaches that a DNN model could be implemented in a Fortran code: The first is to hard-code the entire DNN structure and trainable parameters directly into Fortran (e.g., N. D. Brenowitz & Bretherton, 2018; Gagne et al., 2020). The other approach, adopted in this study, is to overcome the computer language interoperability by incorporating a software library that connects Fortran and Python environments, such as the Fortran-Keras Bridge (FKB, Ott et al., 2020) used in this study.
The process involves converting a trained DNN using Keras, saved in HDF format, into an ASCII file offline. This ASCII file is specifically structured for easy interpretation by the FKB. In a FKB informed Fortran program, the DNN model, including its structure and weights, is reconstructed by loading this ASCII file. During each timestep of integration in the GOTM, the necessary input array, composed of outputs from the GOTM model and forcing conditions, as detailed in Section 2.3, was normalized and fed into the loaded DNN model. Subsequently, the DNN's predictions were then denormalized and integrated back into GOTM to compute the enhancement factors in Equation 6.
Model Configurations
Three different KPP_DNNs were compared against seven existing physics-based parameterizations (Table 1) using the GOTM. The three KPP_DNNs vary in complexity. In KPP_DNN1, only the coefficients for the velocity scale coefficient ϵ predicted by Dϵ were utilized. In KPP_DNN2a, both the velocity scale coefficient ϵ predicted by Dϵ and the unresolved shear coefficient η predicted by Dη were used. Wave-induced stokes profiles were not included as inputs of the KPP_DNN2a. KPP_DNN2b was the same as KPP_DNN2a but additionally incorporated Stokes profiles as inputs. For ocean models not yet coupled with wave models, the KPP_DNN2b could be employed by reading in pre-calculated Stokes profiles calculated offline from wave spectrum products (e.g., Ali et al., 2019; Q. Li et al., 2016), or the KPP_DNN2a can be utilized without the Stokes drift as inputs.
Table 1 List of Parameterization Names and the References for the Traditional Deterministic Parameterization Schemes Compared in This Study
Parameterization name | References |
KPP_LMD | Large et al. (1994) |
KPPLT_VR12 | Van Roekel et al. (2012) |
KPPLT_RW16 | Reichl et al. (2016) |
KPPLT_LF17 | Q. Li and Fox-Kemper (2017) |
SMC_KC94 | Kantha and Clayson (1994) |
SMCLT_H15 | Harcourt (2015) |
The best-trained DNN models, identified by the smallest validation loss during training, are configured with 2 hidden layers of 16 neurons each for Dϵ and 4 hidden layers of 16 neurons each for Dη in KPP_DNN2a, and 4 hidden layers of 8 neurons each for Dϵ and 8 hidden layers of 8 neurons each for Dη in KPP_DNN2b. The performance of the training process and the model is in the (see Figure S1 in Supporting Information S1 for training and validation loss curves and Figure S2 in Supporting Information S1 for density distribution curves for DNN predictions and LES truths in the Supporting Information S1).
Seven well-known traditional deterministic parameterizations were also selected for comparison (Table 1). The KPP_LMD is the basis of KPP schemes and does not incorporate the enhancement of non-breaking waves. KPPLT_VR12 adds the enhancement of non-breaking wave effects only to the turbulent velocity scale but leaves the unresolved shear component unchanged. KPPLT_LF17 builds on KPPLT_VR12 and includes modification on both the velocity scale and the unresolved shear components. KPPLT_RW16 is similar to KPPLT_LF17, but formulas and coefficients that modify velocity scale and the unresolved shear were tuned using LES solutions under hurricane conditions, thus has a stronger enhancement than KPPLT_LF17. It should be noted that all three KPPLT schemes have considered the effects of wind-wave misalignment. Across the global oceans, wind and waves are often misaligned (e.g., Abolfazli et al., 2020; Hanley et al., 2010). When waves align with the wind, Langmuir turbulence enhances OSBL turbulence. When waves oppose the wind, OSBL turbulence is suppressed (e.g., McWilliams et al., 2014). All KPPLT schemes were tuned using LES solutions.
SMC_KC94 is the second closure model tuned using data over at a few different locations across the global oceans. This scheme does not include the non-breaking wave effects. SMCLT_H15 generalizes SMC_KC94 to incorporate the impact of non-breaking waves by including the Stokes profiles in the governing equations. Coefficients in the SMCLT_H15 scheme were tuned using LES solutions.
The performance of the seven traditional parameterizations and the three variants of KPP_DNN schemes is compared using the GOTM for the year 2011–2016, when observed meteorological conditions and directional wave spectra were continuously available. The GOTM simulations are divided into two sets. Both sets of simulations were driven by observed meteorological and wave conditions. They differ by the surface buoyancy fluxes used to drive the model. In the first set of simulation (set 1), surface buoyancy flux products at OSP provided by Pacific Marine Environmental Laboratory (PMEL), were used as input. Those fluxes were calculated using the Coupled Ocean-Atmosphere Response Experiment (COARE) algorithm with the observed ocean and atmosphere conditions. In the second set of simulations (set 2), surface buoyancy fluxes were calculated using the same COARE algorithms online during the GOTM simulations. The online flux calculation is based on observed meteorological condition and the simulated SST and sea surface salinity (SSS). The approach in the pre-calculated surface buoyancy flux has been commonly used in studies aiming at improving or comparing mixing parameterization schemes (e.g., Q. Li et al., 2019). In this model configuration, forcing conditions are identical among different simulations and the difference in solutions are purely due to mixing parameterizations. The online calculation of surface buoyancy flux in the second set of simulation is consistent with that in most realistic ocean simulations using regional and global models (e.g., Chassignet et al., 2020). In simulations driven by pre-calculated buoyancy fluxes, corrected fluxes to nudge the simulated SST and SSS to their climatological states are usually imposed to prevent the long-term drift in the solutions (e.g., Barnier et al., 1995). With this approach, however, the surface buoyancy flux is different among simulations using different mixing parameterizations. The GOTM simulations were restarted at the beginning of each year using observed temperature and salinity profiles as initial conditions. In each simulation, outputs were recorded at every 30 min. It should be noted that the GOTM with all parameterizations could be integrated for a full 6-year period without any stability issue. However, restarting at the beginning of each year mitigates the long-term drift in the solution due to the exclusion of larger-scale processes in the 1-D vertical column model (see Figure S3 in Supporting Information S1).
Performance of the KPP_DNN Models
Model Evaluation Using LES Solutions
To directly evaluate the performance of GOTM models using different turbulent mixing schemes, we compared them with LES solutions. We conducted several 9-day simulations coinciding with all the testing periods, initializing from the beginning of day 2 in each simulation, and ran for 9 days for each testing period. We then calculated three statistics for SSTs—mean standard deviation, root mean square errors, and correlations—for all the testing periods from both the LES solutions and GOTM simulations. Comparisons of four variants of KPP schemes, including KPP_DNN2b, KPPLT_LF17, KPPLT_RW16, and KPP_LMD, using the Taylor diagram (Figure 4) show that KPP_DNN2a and KPP_DNN2b share similar evaluation statistics, and both are superior to all other schemes in terms of root mean square error and correlation. Among the traditional deterministic schemes, the performance of KPPLT_LF17 is close to that of KPP_DNN2a and KPP_DNN2b, as simulations used to tune KPPLT_LF17 is also typical to mid-latitude stratified ocean. In agreement with Q. Li et al. (2019), the performance of KPPLT_RW16 and KPP_LMD is not as good for OSP, as KPPLT_RW16 was tuned using LES simulations of hurricane-forced OSBL, while KPP_LMD was tuned using observation over the global ocean.
[IMAGE OMITTED. SEE PDF]
Model Comparison Using Long-Term Integration
Figure 5 shows the evolution of surface forcing and ocean temperature profiles calculated using various mixing parameterizations for the year 2013 using pre-calculated surface buoyancy fluxes. Forcing conditions are identical for these solutions. Both wind and buoyancy fluxes exhibit distinct seasonal variability. Winds are weaker and stabilizing surface buoyancy fluxes prevail from March to early September. During winter, there were multiple storms characterized by short-term and significant strengthening in both wind and destabilizing surface buoyancy fluxes. For example, during the cold front in late September, the daily average wind speed doubled within a single day and remained above 12 m/s for approximately 1 week.
[IMAGE OMITTED. SEE PDF]
Figure 5b displays the temperature profiles calculated using KPP_LMD. From January to March, there is minimal variability in the simulated MLD and temperature. The simulated mixed layer was relatively deep, close to 100 m, and the mixed layer temperature was around 5°C. The upper ocean re-stratified quickly in April. The MLD shallowed from −100 to −20 m during April. However, the warming of the mixed layer during the month is relatively modest, about 2°C. The mixed layer continued to warm, reaching a maximum temperature of 17.6°C in early September. Since then, the mixed layer cooled and deepened. It should be noted that a marine heatwave, famously known as “the Blob,” started in the winter of 2013/2014 (Bond et al., 2015; Di Lorenzo & Mantua, 2016) resulting in a shallower and warmer mixed layer at the end of 2013 than the beginning of the year. In addition to the seasonal cycle, rapid mixed layer cooling and deepening associated with storms are also evident, leading to short-term variability in both mixed layer temperature and depth. For example, storms in early June led to notable mixed layer cooling and deepening during the longer-term seasonal warming of the mixed layer during summer, while a storm in late September accelerated mixed deepening and cooling during fall.
Figures 5c–5j show the differences between different parameterizations and KPP_LMD. In KPPLT_VR12 and KPP_DNN1, wave effects are incorporated only into the velocity scale coefficient ϵ, but not in unresolved shear coefficient η. The results (Figures 5c and 5h) demonstrate only a slight impact on the simulated temperature profiles. The deviation in temperature from the baseline KPP_LMD remained relatively minor, less than 1°C throughout the year. The mixed layer was only slightly deeper after September.
In the KPP schemes that include wave effects in both velocity scale coefficient ϵ and unresolved shear coefficient η, that is, KPPLT_LF17, KPPLT_RW16, KPP_DNN2a and KPP_DNN2b, as shown in Figures 5d, 5e, 5i and 5j, the simulated mixed layer using those schemes was evidently cooler and deeper throughout the year than that using KPP_LMD. A warm anomaly was observed at a depth of approximately 120 m throughout the year. That is the greatest depth that the mixed layer reached in March and well below the mixed layer after April when the water column re-stratified, thus highlighting the significance of OSBL mixing in shaping upper-ocean thermal profiles and heat transfer between the surface and the interior ocean. Among these solutions, the one using KPPLT_RW16 displays the most rapid mixed layer cooling and deepening in Fall, implying the strongest mixing during that period, consistent with the finding by Q. Li et al. (2019). The stronger mixing by KPPLT_RW16 is attributed to the use of hurricane conditions to tune the coefficients. The simulated short-term mixed layer cooling and deepening due to storms, and the subsequent short-term warming and restratification by these four parameterizations were also more dramatic than those by KPP_LMD, KPPLT_VR12 and KPP_DNN1. These results highlight the importance of accounting for the unresolved shear coefficient η in modeling wave effects in parameterizations under the KPP framework.
For SMC_KC94 (Figure 5f), which did not incorporate wave effects, the simulated mixed layer tends to be shallower and warmer throughout the year compared to that in KPP_LMD, indicating that the parameterized mixing in SMC_KC94 is weaker than that in KPP_LMD. This is particularly evident during the first half of the year when the mixed layer warms and re-stratifies. With the inclusion of wave effects, the simulation using SMC_H15 yields a mixed layer that is cooler and deeper compared to the one using SMC_KC94. Between January and March, the simulated mixed layer using SMC_H15 exhibits higher temperatures than those generated by the KPPLT and KPP_DNN2 schemes. The re-stratification predicted by SMC_H15 occurs more rapidly than that by KPP_LMD, evidenced by a sharper increase in mixed layer temperature during April. The simulated mixed layer cooling and deepening rates by SMC_H15 in fall is close to those using KPPLT_LF17, KPP_DNN2a and KPP_DNN2b.
The time series of SST for the years 2011–2016 are presented in Figure 6. Observed SST is plotted as reference and is not a metric to evaluate the 1D simulations, as observed SST includes contributions from processes other than OSBL turbulence. The simulated SST is mostly warmer than observation at the end of the year for all years. At the OSP, large- and meso-scale processes also contribute to the annual cycle of SST (Cronin et al., 2015). Across the six years simulated, SST was the highest using the SMC_KC94 and the lowest using KPPLT_RW16, respectively, implying that mixing is the weakest in SMC_KC94 and is the strongest in KPPLT_RW16. When using KPPLT_VR12 and KPP_DNN1, the simulated SST is close to that in KPP_LMD throughout the 6 years, reaffirming that KPP parameterizations without counting on wave effects on unresolved shear coefficient η has only limited impact on the evolution of MLD and temperature within the mixed layer.
[IMAGE OMITTED. SEE PDF]
The simulated SSTs using KPPLT_LF17, SMCLT_H15, KPP_DNN2a, and KPP_DNN2b were lower than that using KPP_LMD, KPPLT_VR12, KPP_DNN1, and SMC_KC94, but higher than that using KPPLT_RW16. There is a considerable difference between the solutions of the two KPP_DNN2 schemes: KPP_DNN2a and KPP_DNN2b. The simulated SSTs by the two schemes were close to each other for the year 2013. In other simulated years, the simulated SSTs when using KPP_DNN2b were noticeably cooler than that using KPP_DNN2a. The difference between KPP_DNN2a and KPP_DNN2b highlights the different roles that waves played in different years.
The simulated SSTs using KPPLT_LF17, SMCLT_H15, KPP_DNN2a, and KPP_DNN2b were more closely aligned with both the magnitude and the tendency of the observed SSTs in OSP than using KPP_LMD, KPPLT_VR12, KPPLT_RW16 and SMC_KC94. However, it is important to note that the one-dimensional column models like the GOTM do not account for processes at a scale larger than boundary layer turbulence, such as submesoscale, mesoscale, and large-scale circulations. Therefore, differences between GOTM solutions and observations should be interpreted with caution as they could be due to contributions by those larger-scale processes. As pointed out by Large et al. (1994), OSP is often impacted by heat advection between September and February, a factor that can significantly modulate SSTs but is not included in the 1D GOTM simulation, thus often causing larger discrepancies between simulated and observed SSTs during these months. For example, observed cooling is stronger than the simulated cooling by all schemes during November 2016 and warmer than the simulated cooling by all schemes with wave effects during December 2013.
The simulated SSTs, derived using online flux calculation (set 2), are presented in Figure 7. With online flux calculation, the buoyancy fluxes vary across different simulations. A lower simulated SST results in smaller surface heat loss, as both the outgoing long wave and the sensible heat loss calculated from the COARE algorithm are both smaller. Different from the solutions using pre-calculated fluxes (set 1) shown in Figure 6, the differences in SST among simulations employing different turbulent mixing schemes in Figure 7 were much smaller, mostly less than 0.5°C. The simulated SSTs using different parameterization schemes were also more closely aligned with observations. However, starting from November, consistent biases from the observed SSTs were found in each simulated year, with SSTs generally being higher except for the year 2013. The deviated SSTs in winter are due to the advection effects which are not considered in the 1D GOTM model, while the unique SST biases in winter 2013, is likely due to the heatwave “Blob.” These biases underscore the influence of advection on SSTs, an impact that could not be completely mitigated by online flux calculation using bulk formulas.
[IMAGE OMITTED. SEE PDF]
Figure 8 shows the differences in the simulated MLDs between simulations driven by pre-calculated buoyancy fluxes (set 1) and those driven by fluxes calculated online using bulk formulas (set 2). During the summer months, the simulated MLDs in set 1, using pre-calculated flux, were mostly slightly shallower with KPP_LMD, and slightly thicker with KPPLT_RW16 in comparison with the observed MLDs. Simulated MLDs in set 2, which used online flux calculation, were shallower and better aligned to observations. During this period, online flux calculation reduces biases in both simulated SSTs and MLDs. However, during the colder months from January to April and after November, when the simulated SST is higher than the observed SST (Figure 7), the simulated MLDs were deeper when driven by fluxes calculated online using bulk formula. Note that during these periods, the MLDs in simulations using parameterization schemes with wave enhancements (KPPLT_LF17, KPPLT_RW16 and KPP_DNN2b) were deeper than the observed mixed layer. Results of simulations over a 6-year period from 2011 to 2016 (see Figure S3 in Supporting Information S1) confirms that all the simulations using online flux calculation efficiently eliminates the warming drift of SSTs, but the deviations of MLDs in colder months amplified over years, even for KPP_LMD. While the online flux calculation has the potential to reduce biases in the simulated SST, it could conversely increase biases in the simulated MLDs.
[IMAGE OMITTED. SEE PDF]
The Efficiency of KPP_DNNs
A parameterization must be efficient so that it can be used in realistic ocean models for long-term integrations. The efficiency of the KPP_DNNs is evaluated by comparing them with the traditional KPP and KPPLT schemes (refer to Table 2) within the GOTM framework.
Table 2 List of Parameterization Names and Their Run Time
Simulation name | Run time (seconds) |
KPP_LMD | 3.06 |
KPPLT_VR12 | 3.07 |
KPPLT_LF17 | 3.48 |
KPPLT_RW16 | 4.12 |
KPP_DNN1 | 3.09 |
KPP_DNN2a | 3.16 |
KPP_DNN2b | 3.20 |
Simulations were conducted on a dedicated single core of Intel Cascade Lake (Intel® Xeon® Platinum 8,260 Processor) CPUs on the Louisiana Optical Network Initiative's high-performance computing server (LONI-HPC). The year 2013 served as the benchmark period for the GOTM model runs to evaluate efficiency. In all simulations, the forcing and configuration were identical. To ensure accuracy in measuring computational efficiency, we disabled output.
The results showed that the run times for KPP_DNNs are comparable with those of traditional KPP and KPPLT schemes. Specifically, the run time for KPP_DNNs exceeds less than 4% that of KPP_LMD and KPPLT_VR12 but is 8% faster than that for KPPLT_LF17% and 22% faster than that for KPPLT_RW16. This comparison suggests that KPP_DNN schemes are suitable for implementation in realistic ocean and climate models.
Neural Network Enabled Understanding of OSBL Turbulence and Parameterization
Neural networks excel at discovering complex relationships and can be employed to help understand the complicated dynamics in OSBL turbulence and potential missing relationships in traditional physics-based KPP schemes.
Figure 9 presents the dependence of the velocity scale coefficient (ϵ in Equation 5) on turbulent Langmuir number and MLD and compares it across KPPLT_LF17, KPPLT_RW16, and KPP_DNN2b. In all three schemes, the magnitude of ϵ shows a clear dependence on the non-dimensional turbulent Langmuir number (Lat). Specifically, a smaller Lat is associated with a larger ϵ, indicating wave-induced turbulence has a larger effect on mixing. ϵ by KPPLT_RW16 (Figures 9c and 9d) is the largest among the three schemes. ϵ by KPP_DNN2b displays a dependence on the MLD as well. The deeper the MLD, the larger the ϵ.
[IMAGE OMITTED. SEE PDF]
Figure 10 shows the unresolved shear coefficient (η in Equation 6) for KPPLT_LF17, KPPLT_RW16, and KPP_DNN2b. As demonstrated in the simulated temperature profiles and SST (Figures 5 and 6), the magnitude of the unresolved shear coefficient η is more impactful than the magnitude of the velocity scale ϵ coefficient in the simulation of upper-ocean temperature and stratification.
[IMAGE OMITTED. SEE PDF]
For KPPLT_LF17 (Figures 10a and 10b), η only varies with forcing conditions when surface buoyancy forcing is destabilizing. Under stabilizing buoyancy forcing conditions, the velocity scale of unresolved shear by KPPLT_LF17 is the same as that by KPP_LMD, thus η = 1.0 regardless of the wind-wave-buoyancy condition or MLD. Under destabilizing buoyancy forcing conditions, the average value of η ranges from 1.0 to 2.5, but there is no apparent correlation between η and either Lat or MLD. For KPPLT_RW16 (Figures 10c and 10d), there is an apparent relationship between η and Lat. The more dominant the wave effect over the wind effect, the smaller the Lat and the larger the η. However, there are no apparent differences of η in magnitudes under different buoyancy forcing conditions or MLD if Lat is the same. The more dominant the wave effect over the wind effect, the smaller the Lat and the larger the η.
In KPP_DNN2b (Figures 10e and 10f), η is impacted not only by Lat, but also by MLD and surface buoyancy forcing. Similar to KPPLT_LF17 and KPPLT_RW16, η increases with decreasing turbulence Langmuir number for all MLDs. However, different from the two KPPLT schemes, there is also an evident relationship between η and MLD: η decreases with increasing MLD, implying a weaker influence of Langmuir circulations on mixed layer deepening when the mixed layer is deeper. Langmuir circulation arises from wave-current interaction close to the surface, where it exhibits the greatest intensity (e.g., McWilliams et al., 1997). Q. Li and Fox-Kemper (2020); Weller and Price (1988) found no significant wave effect at the base of the mixed layer if the MLD exceeds −40 m deep.
Furthermore, η also depends on whether the surface buoyancy forcing is stabilizing or destabilizing. For the same Lat and MLD, η is larger when surface buoyancy forcing is stabilizing, indicating that the traditional KPP scheme (KPP_LMD) more significantly and more consistently underestimates the entrainment effect. As discussed in Ali et al. (2019), due to an underestimation of entrainment effect, KPP schemes tend to underestimate the summertime MLD everywhere, during which season stabilizing buoyancy forcing is predominant. The KPPLT schemes, such as the KPPLT_LF17 slightly improve the simulated summertime MLD, but still underestimate it. The KPP_DNN2b scheme identifies the limitation of KPP_LMD and predicts larger η under stabilizing buoyancy forcing.
Conclusions
In this study, feedforward DNNs tuned using 11-year solutions of turbulence-resolving large eddy simulations (LES) driven by realistic forcing conditions at OSP, are used to improve one of the most popular parameterizations for mixing in the OSBL, the KPP. Specifically, the DNNs are used to parameterize two coefficients: the turbulent velocity scale coefficient ϵ and the unresolved shear coefficient η in Equations 5 and 6 respectively. These two coefficients revise the turbulent velocity scale and the unresolved shear, two key parameters in the KPP. The KPP_DNNs are implemented into the GOTM, a one-dimensional column model and commonly used testbed of turbulence parameterization. The KPP_DNNs are compared with seven popular traditional deterministic schemes, including variants of the first-order KPP and SMC schemes within the GOTM using simulations for upper-ocean conditions at OSP between 2011 and 2016. Key conclusions from this study are summarized as follows:
-
The KPP_DNNs are stable, accurate and efficient for integration over several years. The KPP_DNN scheme including wave effects, that is, the KPP_DNN2b, is 8% faster than KPPLT_LF17% and 22% faster than KPPLT_RW16.
-
When using the pre-calculated flux products, the simulated mixed layer is the warmest and the shallowest using the schemes without wave effects, that is, KPP_LMD and SMC_KC94. The simulated re-stratification in spring is faster when SMC compared to KPP.
-
Biases in the simulated SST are smaller when using online buoyancy flux calculation using bulk formulas (Simulation Set 2) compared to using pre-calculated flux (Simulation Set 1). However, biases in the simulated MLD are larger with the online buoyancy flux calculation.
-
In KPP_DNN2b, the value of the turbulent velocity scale coefficient ϵ and the unresolved shear coefficient η not only increases with decreasing Lat, but also changes with the thickness of the mixed layer. As the mixed layer deepens, ϵ increases while η decreases. When MLD and Lat are identical, η is smaller when surface buoyancy forcing is destabilizing compared to stabilizing.
The KPP_DNN2 schemes not only reproduce the dependence of turbulent mixing on Langmuir number, but also uncover the dependence on the MLD and whether the surface buoyancy forcing is stabilizing or destabilizing. This study highlights the ability of deep learning to discover relationships and physics not easily identified in traditional deterministic KPP schemes, and to incorporate complex, multifaceted influences on turbulent mixing in the OSBL.
There are two main directions for further studies.
-
The first is to implement and evaluate the KPP_DNNs in a realistic ocean model for a regional ocean. We are currently conducting LES simulations for the Gulf of Mexico and will include those new simulations in the training data set of the KPP_DNNs. Both the KPP_DNN2b scheme, with Stokes drift profiles included as model inputs, and the KPP_DNN2a will be implemented in the HYCOM model configured for the Gulf of Mexico (Dukhovskoy et al., 2015; Laxenaire et al., 2023).
-
The capability of the KPP_DNN scheme will be expanded to include other turbulent and geographic regimes by including training data for those regimes and geographic locations as training data. The current KPP_DNN is trained using LES solutions at OSP, where turbulence is typical of the mid-latitude oceans, and is accurate for the mid-latitude oceans with similar turbulent regimes. To have a KPP_DNN suitable for other regions, existing and new LES simulations for strongly convective high-latitude oceans (e.g., Skyllingstad & Denbo, 1995), configuration typical of the equatorial regions (e.g., Schmitt et al., 2024; Whitt et al., 2022), and strongly forced hurricane conditions (e.g., Liang et al., 2020) can be added to the training data set. Finally, some other factors and conditions, such as the horizontal component of the Earth's rotation (Liu et al., 2018), and a background front (e.g., Fan et al., 2018; Taylor & Thompson, 2023; Yuan & Liang, 2021) also modulates OSBL turbulence. Adding those LES simulations will further expand the capability of KPP_DNN. The advantage of neural networks is their flexibility to accurately map any input to any output. This advantage will be evident when the KPP_DNN is used for multiple regimes.
Acknowledgments
The authors thank Dr. Aakash Sane and Dr. Luke Van Roekel for many comments and suggestions that improve the manuscript. JY and JHL were supported by the Office and Naval Research through Grant N00014-23-1-2553. EC and OZ were supported by the Office of Naval Research through Grant N00014-23-1-2547. JHL was also supported by the National Science Foundation through Grant OCE1945502. Computation was performed at supercomputers provided by the high-performance computing at Louisiana State University and Louisiana Optical Network Initiative (LONI). This is PMEL publication #5622.
Data Availability Statement
The observed temperature and salinity profiles, the forcing data, the derived surface fluxes at OSP can be downloaded from the PMEL website (). The GOTM codes with KPP_DNN model and COARE bulk flux algorithm implemented are available at Liang (2024).
Abolfazli, E., Liang, J. H., Fan, Y., Chen, Q., Walker, N. D., & Liu, J. (2020). Surface gravity waves and their role in oceanOcean‐Atmosphere coupling in the Gulf of Mexico. Journal of Geophysical Research‐Oceans, 125(7), [eLocator: e2018JC014820]. [DOI: https://dx.doi.org/10.1029/2018JC014820]
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
This study utilizes Deep Neural Networks (DNN) to improve the K‐Profile Parameterization (KPP) for the vertical mixing effects in the ocean's surface boundary layer turbulence. The deep neural networks were trained using 11‐year turbulence‐resolving solutions, obtained by running a large eddy simulation model for Ocean Station Papa, to predict the turbulence velocity scale coefficient and unresolved shear coefficient in the KPP. The DNN‐augmented KPP schemes (KPP_DNN) have been implemented in the General Ocean Turbulence Model (GOTM). The KPP_DNN is stable for long‐term integration and more efficient than existing variants of KPP schemes with wave effects. Three different KPP_DNN schemes, each differing in their input and output variables, have been developed and trained. The performance of models utilizing the KPP_DNN schemes is compared to those employing traditional deterministic first‐order and second‐moment closure turbulent mixing parameterizations. Solution comparisons indicate that the simulated mixed layer becomes cooler and deeper when wave effects are included in parameterizations, aligning closer with observations. In the KPP framework, the velocity scale of unresolved shear, which is used to calculate ocean surface boundary layer depth, has a greater impact on the simulated mixed layer than the magnitude of diffusivity does. In the KPP_DNN, unresolved shear depends not only on wave forcing, but also on the mixed layer depth and buoyancy forcing.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details



1 Department of Oceanography & Coastal Sciences, Louisiana State University, Baton Rouge, LA, USA
2 Department of Oceanography & Coastal Sciences, Louisiana State University, Baton Rouge, LA, USA, Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
3 Center for Ocean‐Atmospheric Prediction Studies, Florida State University, Tallahassee, FL, USA
4 Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA, Department of Mathematics, Louisiana State University, Baton Rouge, LA, USA
5 NOAA Pacific Marine Environmental Laboratory (PMEL), Seattle, WA, USA