Full Text

Turn on search term navigation

Introduction

Computational simulations of the coupled atmosphere and ocean are complex, multidisciplinary and multiscale. They are subject to many numerical and physical parameters that influence the model dynamics. Many of the required parameters are globally defined, not completely understood, and relatively uncertain. The persistent model error (or model bias) can be highly sensitive to the specification of such parameters, and is a significant barrier to predictability. Typically, ocean only general circulation models (GCM) forced by accurately observed atmospheric winds as a boundary condition produce realistic large scale dynamics, such as the El Niño Southern Oscillation. This is also the case for atmosphere only GCMs forced by sea surface temperatures as a boundary condition. However, when this boundary condition is removed in coupled GCMs simulating the oceans and atmosphere, it is much more difficult to achieve realistic dynamics at the interface (Boucharel et al., 2011). The typical, and important, approach to reducing model error, is to run various simulations with manually determined parameter values. A decision is then made on which configuration is the most fit for purpose. Here, we develop a systematic and objective solution to model calibration and parameter tuning using an ensemble Kalman filter (EnKF). With the aid of data assimilation (DA), we focus on understanding the influence that changes in the optical ocean properties have on the forecast skill of a coupled general circulation model, and the related physical changes to the air‐sea heat transfer and ocean mixed layer depth structure. We are specifically interested in the shortwave penetration depth and ocean albedo, for which we have observations at the scales of interest (e.g., global ocean surface and in situ temperatures).

In general, DA provides the ability to modify imperfect simulations of reality with a series of incomplete and possibly noisy measurements, ideally resulting in a better representation of the true system state and also model parameters. The original Kalman filter was developed to do so for linear time invariant systems with approximately Gaussian model and observational error (Kalman, 1960). Under these conditions, estimates of both the state and state error covariance are evolved in time using the associated linear operators. For nonlinear systems, one can employ the extended Kalman filter (EKF), which is essentially the standard Kalman filter with the linear operators replaced by linearized versions of the full nonlinear system (Miller et al., 1994). For strongly nonlinear systems, however, this tangent linear approach does not always adequately represent the behavior of the system. In this instance the EnKF of Evensen (1996) is a more robust approach, where the state estimate and error covariances are sampled directly from an ensemble of many nonlinear simulations, each initiated from different initial conditions. For the combined state and parameter estimation problem, the Kalman filter state vector is extended to also include the parameters, with each simulation additionally initiated with different parameter values (Evensen, 2009). In all of the above variants, the Kalman filter effectively pushes the model strongly (weakly) to the observations when the model uncertainty is large (small) compared to the prescribed observational uncertainty.

Coupled DA is the term given to undertaking DA specifically in coupled atmospheric and oceanic GCMs, be it using Kalman filtering or some other form of estimation. The earliest coupled DA studies include those of Zhang et al. (2007) and Sugiura et al. (2008). It has the potential to make better use of observations from different realms to constrain coupled models, create more dynamically balanced analyses, and hence minimize initialization inconsistencies in the coupled model (Chang et al., 2013). Strongly coupled DA is when observations within a particular realm (e.g., ocean) contribute to the corrections (or increments) applied to other realms (e.g., atmosphere). In contrast, weakly coupled DA is when observations within a given realm only contribute to the increments of that realm, with these observations influencing the other realms via the fluxes at the realm interfaces. The interested reader is referred to the introduction of Lu et al. (2015) for a detailed overview of the distinction between strongly and weakly coupled DA. Varying combinations of strong and weak coupling, have been shown to improve air‐sea heat fluxes in oceanic and atmospheric boundary layers (Laloyaux et al., 2016; Penny et al., 2019; Storto et al., 2018). In fact, in the present study we will be estimating parameters influencing the flux of heat into the ocean, and also the heat distribution. Sandery et al. (2020) studied the relative differences between all variants of strongly and weakly coupled DA in a complex climate model assimilating many of the available ocean, sea‐ice, and atmospheric observations. In that system, the combination of strongly coupled oceanic observations and weakly coupled atmospheric observations, was found to produce the most well constrained analyses. We adopt this particular coupling configuration here, as detailed in Section 2.6.

Within a DA framework, multiparameter estimation has been demonstrated to improve climate models of varying complexity across a range of spatiotemporal scales. The following is a cross section of the research undertaken in this area, however, a more detailed review can be found in Zhang et al. (2020). Jackson et al. (2004) adopted Bayesian stochastic inversion methods to estimate the optimal parameters and associated uncertainties in a surrogate climate model, based on fields generated from an atmospheric GCM coupled to a 50 m slab ocean. Annan et al. (2005b) used an EnKF to simultaneously estimate the state and five parameters within a spectral atmospheric model, resulting in an improved fit to an atmospheric reanalysis. With application to a coupled atmosphere ocean intermediate complexity model, Annan et al. (2005a) used an EnKF to estimate 12 parameters and in doing so successfully tuned the model climatology to better agree with the observed one. In a numerical weather prediction context, Aksoy et al. (2006) used an EnKF to estimate the state and vertical mixing parameter in twin experiments with a regional area atmospheric model. The EnKF system was able to recover this nominal parameter, within acceptable error bounds. Within a coupled DA framework, Sugiura et al. (2008) used a four‐dimensional variational method to assimilate ocean surface and interior observations into a low resolution coupled GCM, estimating the ocean state and exchange coefficients governing the bulk air‐sea fluxes. With application to an intermediate complexity coupled atmosphere ocean model, Kondrashov et al. (2008) showed that an EKF undertaking both state and parameter estimation, as opposed to only state estimation, improved the fit to both synthetic model generated data and also real world observations. In Zhang et al. (2012), the ensemble adjustment Kalman filter (EAKF) was used to estimate the state and parameters of an idealized system comprising of a three‐variable Lorenz model coupled to a slab ocean. The learned parameters were shown to extend the lead time of predictability for both the atmospheric and oceanic components. For an atmospheric GCM, experiments in Tett et al. (2017) estimated 7 and 14 parameters using flavors of the Gauss‐Newton algorithm, to minimize the difference between simulated and observed large scale multiyear averaged net radiative fluxes, which were then applied in a coupled GCM. The stability of the coupled GCM was dependent upon the selection of the estimated parameters. The EAKF was also adopted in Zhao et al. (2019) to estimate the state and multiple parameters in a variety of models the most complex of which was a coupled intermediate complexity model. Reduced error was achieved when the most sensitive parameters were estimated, however, error was increased when insensitive parameters were additionally included. The joint estimation of the model state and multiple parameters is clearly a fertile area of research across a hierarchy of model complexities.

The intention of this study is to develop a means by which one can simultaneously estimate prognostic variables and spatially dependent model parameters in a complex coupled GCM, and in doing so derive parameter maps that may improve the underlying model in a multiyear climate forecasting context. We adopt the Climate re‐Analysis and Forecast Ensemble (CAFE) system, described in Sandery et al. (2020), which employs the ensemble transform Kalman filter (ETKF) variant first proposed by Bishop et al. (2001), and specifically the implementation of Sakov (2014). Using this coupled DA system we estimate the model state and spatiotemporally varying parameters that minimize the error between short term forecasts of the coupled climate model and a network of real world atmospheric, oceanic, and sea‐ice observations. In all of the DA experiments presented within, the parameters are trained during a recent period (January 2010 to December 2011) in which the Earth system was relatively well observed by virtue of the enhanced coverage of in situ ocean observations by the Argo array. The parameters of interest are the ocean albedo and shortwave ocean penetration e‐folding length scale. They are related to the uncertainty in optical water properties, and directly influence the uptake and distribution of heat in the ocean. To our knowledge, this is the first study attempting geographically dependent parameter estimation using strongly coupled DA on GCMs of this complexity and resolution.

The manuscript is organized as follows. The DA framework is outlined in Section 2 including: the ETKF algorithm and implementation; details pertaining to the adopted coupled GCM; and the ingested observations. In Section 3 we determine the sensitivity of the parameter values to estimating them individually and simultaneously over a 28‐day DA forecast window, and determine the influence the estimation of these parameters has on the DA skill metrics. Relative to baseline (or control) forecasts, we also assess the skill of multiyear climate forecasts adopting the final converged parameter maps, which are initiated during an out‐of‐sample period that was not used to learn the model parameters. The remaining experiments are all undertaken estimating only the shortwave e‐folding length scale. In Section 4 the sensitivity to DA cycle length (3–28 days) is determined for the parameter values, and the multiyear forecast performance using the final converged parameter maps. For the case that produces the least biased out‐of‐sample multiyear forecasts, in Section 5 we compare its ocean mixed layer depth and surface heat fluxes to those of the baseline simulation. Concluding remarks are then made in Section 6.

State and Parameter Estimation Theory and Implementation

This section presents the Kalman filter theory and its extension to parameter estimation. We also provide details on the coupled GCM, parameter forward model, observational network, and the structure of the DA coupling.

ETKF State Estimation

In ensemble Kalman filtering one first defines an ensemble of k forecasts ( $x_{e}^{f}$ ) and associated analyzed states ( $x_{e}^{a}$ ), where e is the ensemble member number. Each ensemble member state vector is of length n, where n is the number of state elements. In the present application $x_{e}^{f}$ and $x_{e}^{a}$ contain the state variables in the atmosphere, ocean and sea‐ice at all climate model grid points. The ensemble members are decomposed according to [Image Omitted. See PDF] [Image Omitted. See PDF]where 〈·〉 denotes the ensemble average, and $\hat{}$ the ensemble anomaly. To facilitate the following discussion, we define the anomaly matrices A^f and A^a of size n × k, respectively containing ${\hat{x}}_{e}^{f}$ and ${\hat{x}}_{e}^{a}$ in their columns for each ensemble member. In the ETKF, the ensemble average and ensemble anomalies are updated separately.

The ensemble averaged forecast, 〈x^f〉, is updated by the ensemble averaged increment, 〈xⁱ〉, to return the ensemble averaged analysis, 〈x^a〉, according to [Image Omitted. See PDF] [Image Omitted. See PDF]

In the current study y is a vector of daily mean observations of length p, where p is the number of observations, $〈 \bar{x^{f}} 〉$ is the ensemble average of daily mean forecasts, and $y - 〈 H (\bar{x_{e}^{f}}) 〉$ is the ensemble average innovation vector. The operator $H$ is designed to observe the model state as the instrument observes the real world. The Kalman gain, K, is a matrix of size n × p, given by [Image Omitted. See PDF]where R is a p × p observational error covariance matrix prescribed to each observation type. H is a p × n matrix, and is a linearized version of $H$ . The superscript ⁻¹ denotes the matrix inverse, and ^T the matrix transpose operation. The forecast error covariance is an n × n matrix defined as [Image Omitted. See PDF]where the diagonals of P^f are the error variance of a given state variable. Low spread across the ensemble indicates a degree of consensus among the ensemble members, while a large spread indicates higher uncertainty in the forecast. The analysis anomalies, ${\hat{x}}_{e}^{a}$ , contained in the columns of A^a, are given by [Image Omitted. See PDF] [Image Omitted. See PDF]with T_R being the k × k transform matrix. The moderating parameter, ζ ∈ (0, 1], is specific to the implementation of Sakov (2014). It controls the extent of contraction in the ensemble anomalies, while leaving the ensemble mean increment unchanged. In the present experiments ζ = 0.8. Finally Equation 2 is applied to calculate the ensemble analysis fields, $x_{e}^{a}$ , for each ensemble member from the sum of 〈x^a〉 and ${\hat{x}}_{e}^{a}$ . Analogous to Equation 3, increments for each ensemble member are then defined as $x_{e}^{i} = x_{e}^{a} - x_{e}^{f}$ .

Extension to Parameter Estimation

The above formulation of the ETKF can also be extended to undertake both state and parameter estimation, by redefining the forecast state vector for ensemble member e as [Image Omitted. See PDF]where ${[x_{S}^{f}]}_{e}$ contains the model prognostic state variables of length n_S, and ${[x_{ψ}^{f}]}_{e}$ contains any estimated model parameters of length n_ψ, with analogous expressions for the analysis state vector $x_{e}^{a}$ . The total number of elements for a given ensemble member is n = n_S + n_ψ. The associated forecast anomaly ensemble is then redefined as [Image Omitted. See PDF]where $A_{S}^{f}$ contains the state anomalies of size n_S × k, and $A_{ψ}^{f}$ contains the model parameter anomalies of size n_ψ × k, with analogous expressions for the analysis ensemble A^a. Since the parameters are not required to observe the state, the linearized observational operator is redefined as [Image Omitted. See PDF]where 0 is p × n_ψ matrix of zeros, and H_S is the operator applied to the model state fields and of size p × n_S. It follows from Equation 5, that the Kalman gain, required to update of the ensemble mean, expands out to [Image Omitted. See PDF]where $P_{S S}^{f} = A_{S}^{f} {(A_{S}^{f})}^{T} / (k - 1)$ captures the covariances between the model prognostic variables, and $P_{ψ S}^{f} = A_{ψ}^{f} {(A_{S}^{f})}^{T} / (k - 1)$ captures the covariances between the ensemble of forecast prognostic variables and ensemble of forecast model parameters. Note, when expanded out the transform matrix in Equation 8 required to update the anomalies is unchanged. Unlike the Kalman gain, the construction of the transform matrix does not make use of the forecast parameter ensemble, $A_{ψ}^{f}$ . If both n_S and n_ψ are nonzero, then one is undertaking both state and parameter estimation. If n_ψ = 0, then one is only estimating the model prognostic variables, as is done for the baseline case within the present study. As mentioned in the introduction, for the parameters to be estimated appropriately, there must be useful sampled covariances between the parameter and forecast state ensembles ( $P_{ψ S}^{f}$ ). These covariances are used in the calculation of the Kalman gain in Equation 12. In nonlinear systems, the covariances between the specified parameters and final evolved prognostic variables may grow (decay) with increasing DA cycle length, if the underlying model biases targeted by said parameters, are large (small) with the respect to the propensity for nonlinear interactions with other aspects of the system. However, the increment to the parameter field is by definition proportional to the distance the model is from the observations—see Equation 4. So the forecast period must be sufficiently large for the systematic biases to emerge. Hence, there are potentially conflicting factors effecting the selection of the optimal DA cycle length. A key part of our study is, therefore, to determine the sensitivity of our results to forecast window length.

State Forward Model—The General Circulation Model

In the present set of experiments we adopt the Geophysical Fluid Dynamics Laboratory Coupled Model version 2.1 described in Delworth et al. (2006), with an upgraded oceanic component (MOM4.1–MOM5.1). The climate model also comprises of atmospheric (AM2.1), land (LM2), and sea‐ice (SIS1) components, with fluxes exchanged via a coupler (FMS). AM2.1 has a grid resolution of 2° in latitude (ϕ), 2.5° in longitude (λ) with 24 vertical levels, driven by realistic incoming solar radiation, aerosols, radiative gases and land cover. The MOM5.1 grid is tripolar over the Arctic north of 65°N. It has 1° resolution in longitude, higher latitudinal resolution in specific regions and 50 vertical levels (Bi et al., 2013). SIS1 has the same horizontal resolution as the sea‐ice grid, with five ice thickness categories. Time step sizes of the ocean and atmospheric models are 1 and 0.5 h, respectively. The ocean, atmosphere, land, and sea ice exchange heat, mass and momentum fluxes every hour. Laplacian and biharmonic lateral ocean mixing operators are applied to the horizontal momentum equations (Griffies & Hallberg, 2000). We adopt the K‐profile parameterization (KPP) of Large et al. (1994) for vertical ocean mixing of the scalar variables, comprising of two terms: one proportional to local scalar gradients; and another proportional to nonlocal properties of the oceanic boundary layer. The neutral physics implementation of Griffies (1998) is used to represent the iso‐pycnal diffusivity of Redi (1982) and di‐pycnal diffusivity of Gent and McWilliams (1990).

All of the DA experiments undertaken within are initialized on January 1, 2010 and end in December 2012. The required initial 96 member ensemble of model prognostic variables is taken from the ensemble of a previous coupled DA experiment in Sandery et al. (2020).

Parameter Forward Model

Using the ETKF we estimate: the ocean albedo (α), which parameterizes the reflection of incoming radiation at the ocean surface; and the shortwave ocean penetration e‐folding length scale (L_SW), which parameterizes the distribution of shortwave radiation within the ocean interior. To ensure against collapse of the parameter ensemble after repeated updates, we adopt a simplified white noise version of the approach outlined in Evensen (2003) for the forward model of these parameter fields. Both L_SW and α have two horizontal spatial dimensions of λ and ϕ. At DA cycle index i, and for each ensemble member e, a given parameter (ψ_e (λ, μ, i)) is evolved according to [Image Omitted. See PDF]where Δt is the DA cycle length in days, and w_e (λ, ϕ, i) is drawn from a Gaussian distribution of zero mean and standard deviation σ. Unless otherwise stated, the standard deviations of the noise component are σ = 0.02 m (days)^−1/2 for L_SW, and σ = 5 × 10⁻⁴ (days)^−1/2 for α. The initial ensemble of parameters used in the DA experiments are spatially homogeneous, with their amplitude drawn from a uniform distribution to span a physically valid range of values. For L_SW, the initial ensemble and all subsequent estimated fields are bounded between 0.02 and 60 m. For α the initial ensemble is bounded between 0.12 and 0.18, with all subsequently estimated fields enforced to be bounded between 0 and 1. We have assessed the influence of σ for the case in which only L_SW is estimated. In comparison to the selection of estimated parameters and the DA cycle length, the perturbation amplitude is found here to have minimal influence. The impact of σ would presumably be larger when adopting initial parameter ensembles of reduced range. We also find a tendency of smaller amplitudes of σ to produce more spatially coherent parameter maps and a less biased multiyear climate forecast. However, this may not necessarily be the case when using a spatially correlated noise component as adopted in Rucksthul and Janjíc (2020). The influence of the correlation length scale in such a forward model, is out of the scope of the present study, but will be subject of future research.

The above mentioned parameters enter into the climate model as follows. The ocean albedo controls the extent of incoming radiation that is, reflected from the surface, and in the baseline case is prescribed using the scheme of Taylor et al. (1996), where at model time t [Image Omitted. See PDF]with θ (λ, ϕ, t) the zenith angle of the Sun. The absorbed total incoming radiation is then decomposed into two parts, 43% shortwave (Jerlov, 1976; Paulson & Simpson, 1977), and the remainder longwave which does not penetrate past the first level of the ocean model. At a given depth z, the incoming shortwave component is given by [Image Omitted. See PDF]where F_SW is the associated per unit area radiant flux. In the baseline case L_SW (λ, ϕ, t) is a two‐dimensional monthly defined climatological map of the e‐folding length scale applied at time t, derived from ocean color measurements (Xiaobing et al., 2015). Climatological maps used in the baseline case of L_SW and monthly averaged α are illustrated in Figures S1 and S2 of the supporting information.

Observational Network

The observations of the climate system are obtained from various sources, which are listed in Table 1 along with their assigned observational error. Satellite sea surface temperature (SST) is obtained from a wide range of infrared and microwave satellites, including NAVOCEANO (May et al., 1998), AMSR‐E, and WindSat (Gaiser et al., 2004). Only night time SST is assimilated. Subsurface in situ observations of the ocean are obtained from the CORA5.0 data set (Cabanes et al., 2013), including Argo profiles (Roemmich et al., 2009), TAO, PIRATA, and RAMA moored arrays, conductivity, temperature, and depth (CTD) and eXpendable BathyThermograph (XBT) profiles. Altimetric sea level anomaly (SLA) is provided by the Radar Altimeter Database System (RADS) (Schrama et al., 2000), which adopts tide, mean dynamic topography and inverse barometer corrections. Atmospheric observational data is taken from daily averages of the Japanese atmospheric reanalysis, JRA55 (Kobayashi et al., 2015). Satellite sea‐ice concentrations (SIC) are sourced from the 25 km resolution climate data record, EUMETSAT Ocean and Sea Ice Satellite Application Facility (OSISAF) global sea ice concentration reprocessing data set 1978–2015 (v1.2, 2015), by the Norwegian and Danish Meteorological Institutes. Accompanying the SIC observations, we have created a synthetic under sea‐ice sea surface temperature (UISST) data set, where in the presence of sea‐ice the surface temperature is set to the freezing point of salt water (−1.8°C). This is included to further constrain the warm SST bias which strongly affects the sea‐ice formation and retention.

1 TableObservations Assimilated Asynchronously From 2010 to 2012

Observations	Data set	Number	SO	Spatial	Error estimate	R‐factor
Sea surface temperature	NAVO‐AVHRR, WindSat, AMSR‐E	2.80 × 10⁷	2.13 × 10⁶	Point	^a, 0.5 K, 0.5 K	16
Sea level anomaly	RADS	3.72 × 10⁶	4.32 × 10⁵	Track	^a	8
In situ ocean temperature	Argo, XBT, CTD, TAO, PIRATA	2.44 × 10⁶	2.91 × 10⁵	Profile	0.5 K	8
In situ ocean salinity	Argo, CTD, TAO, PIRATA	1.92 × 10⁶	2.59 × 10⁵	Profile	0.075 psu	8
Sea ice concentration	OSISAF	2.43 × 10⁷	7.29 × 10⁵	Gridded	^b	1
Under sea‐ice surface temperature	Synthetic	5.67 × 10⁶	3.80 × 10⁵	Gridded	0.5 K	1
Zonal wind	JRA55	1.01 × 10⁷	3.17 × 10⁵	Gridded	1 ms⁻¹	64
Meridional wind	JRA55	1.01 × 10⁷	3.17 × 10⁵	Gridded	1 ms⁻¹	64
Air temperature	JRA55	1.01 × 10⁷	3.17 × 10⁵	Gridded	1 K	64
Specific humidity	JRA55	1.01 × 10⁷	3.17 × 10⁵	Gridded	0.05 kg·kg⁻¹	64

Note. Number represents the typical amount of observations for a 28‐day cycle. SO denotes number of superobservations. The effective observational error in the data assimilation (DA) system is the product of the error estimate and R‐factor. The number of JRA55 observations and superobservations represent those on only the assimilation day. A localization radius of 1,000 km is used for all observations, other than for the sea‐ice observations which use a radius of 250 km.

^adenotes error estimates provided by product vendor for each observation.

^bdenotes that adoption of the error function approach in (Sakov et al., 2012).

Measurements from given observational products that are located within the same model cell and occur on the same day, are amalgamated into one superobservation. The total number of individual observations, and also amalgamated superobservations for each product within a typical 28‐day DA cycle are provided in Table 1. The values, coordinates, and times of these spatiotemporally close observations are averaged with the weights inversely proportional to the observation error variance (Sakov, 2014). All DA error statistics presented within are performed in observational space with respect to these superobservations.

Structure of DA Coupling

Via the covariance matrix P^f in Equation 6, observations within a particular realm (e.g., ocean) can in general contribute to the calculation of the increments in other realms (e.g., atmosphere). As mentioned in the introduction, this is referred to as strongly coupled DA. Weakly coupled DA is when observations of a given realm are only allowed to effect that realm, with the influence of these observations propagating through to the other realms via the evolution of the coupled climate model. In this instance the off‐diagonal block elements of P^f pertaining to covariances between the realms in question are explicitly set to zero. Through a series of targeted experiments undertaken in Sandery et al. (2020), the following combination of weakly and strongly coupled DA produced the lowest forecast errors across all realms, for the system used in this study. Ocean observations contribute to the increments of both the ocean and atmosphere. Atmospheric observations are used to calculate the DA increments for only the atmospheric state variables. Sea‐ice observations contribute to the DA increments of the sea‐ice, oceanic and atmospheric state variables. We adopt precisely this coupling in the present study. Increments associated with the parameter fields at the air‐sea interface (i.e., α, L_SW) are determined via the cross‐covariances from all realms, that is, the sea‐ice, oceanic, and atmospheric observations.

At the beginning of each DA cycle, the model is initialized to the analyzed states and adopts the analyzed parameters. Ocean and sea‐ice observations are assimilated asynchronously, that is, they are compared to the model forecasts on the day the observations occurred (Sakov et al., 2010). Daily averaged atmospheric JRA55 data is assimilated on the analysis day. To remove spurious long range correlations, a local analysis is performed using a Gaspari and Cohn (1999) horizontal localization function. The localization radii are dependent upon the observation type, and applied consistently across all state variables and parameters. A radius of 1,000 km is used for all observation types, other than the sea‐ice observations which use a radius of 250 km. We also adopt a capped covariance inflation of 5%, while also ensuring that the inflated amount is not more than the ratio of forecast and analysis ensemble standard deviations (Sakov, 2014). The capped covariance inflation is applied across the entire state vector, and as such acts upon the state ensemble, and also the parameter ensemble for the parameter estimation cases. The R covariance matrix is diagonal, under the common assumption that the superobservations are uncorrelated. The diagonal elements of the matrix R for a particular observation type, are given by the product of the specified error estimate and associated R‐factor, both of which are listed in Table 1. Uniquely for the sea‐ice concentration observations, the error estimate is in fact a function of the observational value itself following Sakov et al. (2012). This addresses the fact that observations of either 0% or 100% concentration are typically reliable, with values in between less so. Additionally, the system adaptively moderates observational impact using the method of Sakov and Sandery (2017) with a K‐factor of unity. This method was designed to limit the impact of individual observations with anomalously large innovations. The above configuration was found to produce the best forecasts of the baseline configuration over a 7‐day DA cycle, and for consistency is used for all DA experiments in the present study.

Comparison of Individual and Joint Parameter Estimation

As discussed in the introduction, the motivation of this study is to reduce model bias in multiyear climate forecasts of a coupled GCM by better estimating certain key model parameters. The two model parameters of interest in this study are the ocean optical properties of L_SW and α. We first assess the impact of estimating the parameters individually as opposed to simultaneously. Specifically, the assessed DA variants for joint estimation of the model state and parameters are: L_SW only; α only; and both of these parameters simultaneously. These experiments are compared to a baseline case in which only model prognostic variables are estimated. The benchmark case adopts a best practice selection of GCM parameterizations and associated parameter values. To some extent the DA error metrics distinguish between each of the cases, but they are incapable of isolating the influence that the parameters alone have on the model. This is because the model state and parameter estimation are intimately linked. The purpose of Section 3.1 is to demonstrate that each of the DA experiments are adequately constrained. The spatiotemporal properties of the estimated parameters are presented in Section 3.2. The most appropriate way to assess any improvement or otherwise in the model is to compare multiyear out‐of‐sample climate forecasts adopting these learned parameter maps, which is undertaken in Section 3.3. Note the influence of the DA cycle length will be addressed in Section 4.

DA Cycle Forecast Error Statistics

Within each DA experiment, we assess the skill of the short term forecasts by comparing the forecast state to the observations prior to the analysis being made. The error measures of interest are the mean absolute deviation (MAD) and bias, which for a given observation type are respectively defined as [Image Omitted. See PDF] [Image Omitted. See PDF]where {·} denotes averaging over observations of specified type in the spatial domain of interest, which in this section is the entire globe. Bias is an indication of the persistent systematic model error, while MAD also includes phase error associated with physical structures not being located at the same point in space.

For each of the parameter estimation variants, Table 2 lists the MAD and bias with respect to each of the assimilated observational products. We have included only the real world observations in this table, and hence excluded the synthetic UISST. These error metrics are also temporally averaged during the 2011 calendar year, in order to exclude the initial spinup time of 1 year. Observation types are highlighted in bold type face where there is a greater than 1% reduction in both MAD and absolute bias from the baseline case. For the present coupled climate model the forecast error of the daily atmospheric fields reach their maximum prior to the end of the 28‐day DA cycle as demonstrated by the daily error growth rates approaching zero in Figure 9 of Sandery et al. (2020). As such each of the parameter estimation cases demonstrate exceedingly similar global MAD for the atmospheric state variables. Of the remaining assimilated data, only SIC, SST, and in situ ocean temperature have a greater than 1% reduction in both MAD and absolute bias for at least one of the DA parameter estimation experiments. For the in situ ocean temperature, all cases exhibit a bias reduction, with the joint estimation of α and L_SW demonstrating the greatest reduction in both bias and MAD. For SST, the case estimating L_SW individually has the lowest MAD and bias, with the other cases exhibiting a change in sign and increase in absolute bias. For SIC, none of the experiments have a deterioration in the skill measures upon the baseline, with the L_SW case having the lowest MAD, and the joint estimation case the lowest bias. As will become evident in the multiyear forecasts to follow, the baseline model configuration is systematically biased to having too much sea‐ice in the Northern Hemisphere, and too little in the southern. Note, globally averaged bias can at times be misleading when there are regions of both positive and negative bias that cancel each other out. This is why the sign of the globally averaged SST bias changed for cases involving α. We will account for this effect with application to the multiyear forecast error measures in Section 3.3.

2 TableDA Error Metrics of Forecast Mean Absolute Deviation (MAD) and Bias With Respect to the Assimilated Observations, Spatially Averaged Over the Entire Globe, and Temporally Averaged From January 2010 to December 2011, Inclusive

		Baseline	L_SW only	α only	(L_SW, α)
Sea‐ice concentration (m²/m²)	MAD	0.077	0.071	0.075	0.072
	Bias	0.024	0.021	0.022	0.017
Sea surface temperature (°K)	MAD	0.643	0.634	0.634	0.634
	Bias	−0.022	−0.014	0.083	0.110
In situ temperature (°K)	MAD	0.708	0.711	0.702	0.694
	Bias	−0.106	−0.098	−0.070	−0.041
In situ salinity (psu)	MAD	0.123	0.122	0.122	0.121
	Bias	−0.011	−0.014	−0.014	−0.014
Sea level anomaly (m)	MAD	0.084	0.084	0.083	0.084
	Bias	−0.003	−0.004	−0.005	−0.005
Air temperature (°K)	MAD	3.219	3.224	3.221	3.228
	Bias	−0.220	−0.232	−0.136	−0.128
Zonal air speed (m/s)	MAD	5.951	5.971	5.912	5.961
	Bias	−0.215	−0.272	−0.235	−0.220
Meridional air speed (m/s)	MAD	5.083	5.096	5.057	5.082
	Bias	−0.003	−0.004	−0.003	−0.002

Note. The listed DA experiments have 28‐day cycle lengths and include: a baseline case (no parameter estimation); estimation of LSW; estimation of α; and joint parameter estimation of both LSW and α. Error measures for specific observations types are highlighted in bold type face where there is greater than a 1% reduction in both MAD and absolute bias from the baseline case.

Evolution of the Parameter Fields

We now discuss the temporal evolution and spatial structure of the estimated parameter maps. The following two plots include the full assimilation time series, including the spin up period to highlight the convergence of these parameters with time. Figure 1a represents the ensemble averaged and globally averaged values of L_SW as a function of assimilation time. The red line is when only L_SW is estimated, and the green line is when L_SW is estimated in conjunction with α. As a point of comparison, the black line is the globally averaged climatological reference L_SW used in the baseline experiment. As compared to the climatological reference L_SW, the L_SW only case converges to a slightly deeper globally averaged depth, and the joint estimation case converges to a shallower one. The globally averaged values of α are illustrated in Figure 1b. The blue line is for the case in which only α is estimated, and again the green line is when both α and L_SW are estimated together. The black line represents the daily α from the baseline experiment, however, it is averaged from 50°S‐50°N to avoid the influence of the sea‐ice zones. Both estimated cases converge to less reflective values than that of the baseline experiment, with the individually estimated α case the least reflective. In addition to the global average of the parameters converging after one year, the overall pattern of the maps is also unchanged, with the distinctions between low and high parameter values being further refined. Over this two year period, none of the estimated parameters appear to exhibit any cyclical behavior as present in the baseline L_SW and α. The lack of temporal periodicity in the estimated parameters, could be due to the mean model bias being significantly larger than any periodic component. It is also possible, however, that the noise component in the parameter forward model is weakening spatial correlations, which inhibits the influence of the observations and potentially the ability to elucidate any temporal periodicity. The latter possibility will be tested in future research with application of a parameter forward model with spatially correlated noise.

View Image - 1 Figure. Ensemble averaged estimated parameters in DA experiments of 28‐day cycle lengths. As a function of DA analysis time, globally averaged estimated values of (a) LSW in meters; and (b) α. For the baseline case, January averaged maps for: (c) LSW; and (d) α, excluding the sea‐ice zones. Final ensemble averaged estimated parameter maps of: (e) LSW, when only LSW is estimated; (f) α, when only α is estimated; (g) LSW, when both LSW and α are jointly estimated; and (h) α, when both LSW and α are jointly estimated.

1 Figure. Ensemble averaged estimated parameters in DA experiments of 28‐day cycle lengths. As a function of DA analysis time, globally averaged estimated values of (a) LSW in meters; and (b) α. For the baseline case, January averaged maps for: (c) LSW; and (d) α, excluding the sea‐ice zones. Final ensemble averaged estimated parameter maps of: (e) LSW, when only LSW is estimated; (f) α, when only α is estimated; (g) LSW, when both LSW and α are jointly estimated; and (h) α, when both LSW and α are jointly estimated.

The spatial structure of the converged parameter maps on December 30, 2012 are now discussed. As a point of comparison, the January average L_SW and α used in the baseline case are illustrated in Figures 1c and 1d, respectively. The full climatology of these parameters can be found in Figures S1 and S2 of the supplementary material. Note, the color bars in Figures 1e–1h are set such that the center contour represents the ensemble average of the initial parameter field, with yellow (dark blue) contours indicating that the ensemble average of the estimated parameter has reduced (increased). A higher L_SW means that more heat is taken away from the surface layer and distributed further into the water column, and hence a lower SST but a higher in situ temperature.

We first consider the case in which only α is estimated, illustrated in Figure 1f. The α map exhibits large scale coherent structures in the equatorial region, resembling the widely observed cold tongue bias. A higher α means more heat is reflected away from the oceans, and to a first order a lower SST and also lower temperature within the water column. Interestingly it appears that the estimated α field is attempting to undo the influence of the prescribed standard L_SW map in Figure 1c, where within at least the region 40°S‐40°N the two fields appear to be anticorrelated. The standard L_SW field is derived to be inversely proportional to the opacity of the ocean, which among other things is dependent upon the chlorophyll concentration. This result suggests that the L_SW parameter field that the coupled model requires to produce the best short term forecast, has more to do with the biases present in the model, than the actual optical properties of the ocean.

The case in which only L_SW is estimated is illustrated in Figure 1e, and indicates that L_SW is large in boundary currents and the Antarctic Circumpolar Current, that is, regions of high variability resolved in the model. This is the opposite to what one would expect from the typical optical properties of the ocean illustrated in Figure 1c. Note the intense yellow structure in the tropical Pacific. The SST of the baseline system is biased hot in this equatorial region, as will become evident from the multiyear climate forecasts in the following section. The only way for this DA experiment to address this bias is to increase L_SW, and hence distribute the heat deeper into the water column. The southern Ocean exhibits a strong near zonal pattern, again consistent with a warm temperature bias at the surface.

We finally consider the case in which both parameters are estimated simultaneously. In terms of the α map, the dark blue regions between 40°S and 40°N in the joint estimation case in Figure 1h is less reflective as compared to the individually estimated case in Figure 1f. This is because L_SW in Figure 1g is effectively retaining more heat near the surface by not distributing as much to the ocean depths in these zones. Figure 1g illustrates L_SW in the jointly estimated case, and has a much reduced yellow structure in the tropical Pacific, in comparison to the individually estimated one in Figure 1e. This is due to the additional reflection from the estimated α in this region.

These estimated maps have highlighted, that the parameter values required to improve an incomplete and imperfect model, do not necessarily resemble the parameter values observed in nature. In all of the cases presented within, the estimated α and L_SW resemble more the model biases than maps of these parameters inferred from direct measurements. This notion of adopting non‐physical parameters is not a new one, and is widely accepted in the field of turbulence modeling exemplified by the theoretical proposition (Leith, 1971) and data driven quantification of negative eddy viscosities in oceanic (Kitsios et al., 2013) and atmospheric (Kitsios & Frederiksen, 2019) flows.

Multiyear Forecast Error Statistics

We now look to address the key question of whether parameters trained on the basis of a series of short term (28‐day) forecasts, can reduce the onset of model bias in longer term multiyear simulations. In particular, in this section we will determine whether joint or individual estimation of albedo and/or shortwave e‐folding length produce less biased multiyear forecasts. All of the forecasts are initiated from the final ensemble of the baseline DA experiment run out to the first of February 2012 to be outside of the in‐sample period used to estimate the parameters. For a given configuration, the full ensemble of 96 members are forecast, with each ensemble member adopting the same ensemble averaged parameter map. The bias fields and time series presented within, are calculated using the ensemble mean.

Starting with SST, the globally averaged monthly MAD with respect to HadiSST as a function of time is illustrated in Figure 2a. The forecasts initially have low values of MAD, as a result of the DA system constraining the state. This MAD grows in a secular fashion until it plateaus at the beginning of 2016. Only the case in which L_SW is estimated individually, has persistently less MAD than the baseline. The differences between each of the configurations become clearer when one considers the spatially averaged positive and negative biases separately, as illustrated in Figure 2b. Here, a positive (negative) bias is one in which the observations are warmer (cooler) than the model forecasts. There is a persistent reduction in positive bias (solid line) between the L_SW case and the baseline for all time. The negative bias (dashed line) is reduced for the L_SW case in the Southern Hemisphere summer, and similar to the remaining cases for the rest of the year.

View Image - 2 Figure. Out‐of‐sample multiyear climate forecast error in sea surface temperature. As function of forecast date: (a) globally averaged monthly averaged MAD, with the legend also applicable to (b); and (b) averaged positive (solid line) monthly averaged bias and averaged negative (dashed line) monthly averaged bias. Fields averaged from January 2016 to December 2019 for: (c) HadISST observations minus the baseline case (i.e. bias); (d) baseline case minus case when only LSW is estimated; (e) baseline case minus case when only α is estimated; and (f) baseline case minus case when both LSW and α are jointly estimated. All maps are derived from DA experiments with 28‐day DA cycles. For plots (d–f) the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

2 Figure. Out‐of‐sample multiyear climate forecast error in sea surface temperature. As function of forecast date: (a) globally averaged monthly averaged MAD, with the legend also applicable to (b); and (b) averaged positive (solid line) monthly averaged bias and averaged negative (dashed line) monthly averaged bias. Fields averaged from January 2016 to December 2019 for: (c) HadISST observations minus the baseline case (i.e. bias); (d) baseline case minus case when only LSW is estimated; (e) baseline case minus case when only α is estimated; and (f) baseline case minus case when both LSW and α are jointly estimated. All maps are derived from DA experiments with 28‐day DA cycles. For plots (d–f) the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

Potential regions of improvement in the model become evident when considering spatial maps of the bias fields. Figure 2c illustrates the average SST bias of the baseline configuration with respect to HadiSST over the period of saturated bias from January 2016 to December 2019 inclusive. This map indicates a near zonal warm model bias in the southern Ocean, and warm biases south of Greenland, off the west coasts of Africa, North America and South America, and off the east coast of Japan. Everywhere else has a persistent cold bias. Figure 2d illustrates the difference in average SST between the baseline configuration and the L_SW case over the same period. The same differences with respect to the baseline configuration, are illustrated for the joint estimation case in Figure 2e, and the case in which only α is estimated in Figure 2f. In comparison to Figure 2c, it is only the case in which L_SW is individually estimated, that the oceans are warmed where the baseline configuration is biased cold, and cooled where the baseline case is biased warm. Note, in Figures 2d–2f, and related figures to follow, the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

Given a less biased representation of the SST, one would expect that the SIC would also exhibit some improvement. The monthly SIC MAD with respect to OSISAF averaged over the globe as a function of time is illustrated in Figure 3a, with the biases of the Northern (dashed) and Southern (solid) Hemispheres in Figure 3b. Note, we have been more stringent in our assessment of SIC forecast error measures for these timeseries. Since observational products treat any values of SIC less than 0.15 as essentially no sea‐ice, we do not include any cells in the calculation of the error in which both the model and observations have values of SIC that are less than 0.15. As an example, the system does not get credit for forecasting no sea‐ice in the tropics. For symmetry, nor do we include any cells in which both the model and observations have values of SIC that are greater than 0.85. All of the cases presented within have too much sea‐ice in the Northern Hemisphere and too little in the southern. The MAD again grows secularly until the yearly average plateaus at the beginning of 2016. After the second year, the L_SW case is persistently the least biased in the Northern Hemisphere, with the other estimated case more biased than the baseline. All cases out to perform the baseline in the Southern Hemisphere, with the joint estimation case the least biased.

View Image - 3 Figure. Out‐of‐sample multiyear climate forecast error in sea‐ice concentration. As function of forecast date: (a) globally averaged monthly averaged MAD, with the legend also applicable to (b); and (b) Northern (solid) and Southern (dashed) Hemisphere averaged monthly averaged bias. Fields averaged from January 2016 to December 2019 for: (c) HadISST observations minus the baseline case (i.e., bias); (d) baseline case minus case when only LSW is estimated; (e) baseline case minus case when only α is estimated; and (f) baseline case minus case when both LSW and α are jointly estimated. All maps are derived from DA experiments with 28‐day DA cycles. For plots (d–f) the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

3 Figure. Out‐of‐sample multiyear climate forecast error in sea‐ice concentration. As function of forecast date: (a) globally averaged monthly averaged MAD, with the legend also applicable to (b); and (b) Northern (solid) and Southern (dashed) Hemisphere averaged monthly averaged bias. Fields averaged from January 2016 to December 2019 for: (c) HadISST observations minus the baseline case (i.e., bias); (d) baseline case minus case when only LSW is estimated; (e) baseline case minus case when only α is estimated; and (f) baseline case minus case when both LSW and α are jointly estimated. All maps are derived from DA experiments with 28‐day DA cycles. For plots (d–f) the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

Maps of the SIC bias indicate regions of improvement in the model. Figure 3c illustrates the average SIC bias of the baseline configuration with respect to OSISAF over the same period as was done for the SST bias. This map clearly indicates excessive sea‐ice in the Northern Hemisphere, and insufficient sea‐ice in the south. Figures 3d–3f illustrates the difference in average SIC between the baseline configuration and the L_SW case, joint estimation case, and α case, respectively. The L_SW case is moving the model biases in the right direction in most places, albeit by a small amount. The cases involving the estimation of α have increased sea‐ice in almost all regions, which is consistent with the global cooling of SST illustrated in Figures 2e and 2f.

Again, given a less biased SST, one would also expect a less biased atmosphere in the coupled GCM. If the sea‐surface meridional temperature gradient is better represented, then via thermal wind balance, one would expect the vertical shear and hence jet position in the atmosphere to also be better represented. We use the height of 500 hPa isobar (H500) in the atmosphere as an indicator of jet position. Note, this is not an assimilated variable, and is diagnosed from the model. The globally averaged monthly MAD with respect to JRA55 as a function of time is illustrated in Figure 4a. There are many parallels in these results with the SST field, with only the L_SW case to exhibit a reduction. In Figure 4b, the bias is again decomposed into its positive (solid) and negative (dashed) components. Like in SST, most of the improvement is coming from a reduction in the positive bias.

View Image - 4 Figure. Out‐of‐sample multiyear climate forecast error in the height of the 500 hPa isobar of the atmosphere. As function of forecast date: (a) globally averaged monthly averaged MAD, with the legend also applicable to (b); and (b) averaged positive (solid line) monthly averaged bias and averaged negative (dashed line) monthly averaged bias. Fields averaged from January 2016 to December 2019 for: (c) HadISST observations minus the baseline case (i.e. bias); (d) baseline case minus case when only LSW is estimated; (e) baseline case minus case when only α is estimated; and (f) baseline case minus case when both LSW and α are jointly estimated. All maps are derived from DA experiments with 28‐day DA cycles. For plots (d–f) the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

4 Figure. Out‐of‐sample multiyear climate forecast error in the height of the 500 hPa isobar of the atmosphere. As function of forecast date: (a) globally averaged monthly averaged MAD, with the legend also applicable to (b); and (b) averaged positive (solid line) monthly averaged bias and averaged negative (dashed line) monthly averaged bias. Fields averaged from January 2016 to December 2019 for: (c) HadISST observations minus the baseline case (i.e. bias); (d) baseline case minus case when only LSW is estimated; (e) baseline case minus case when only α is estimated; and (f) baseline case minus case when both LSW and α are jointly estimated. All maps are derived from DA experiments with 28‐day DA cycles. For plots (d–f) the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

The H500 bias patterns of the baseline case with respect to JRA55 are illustrated in Figure 4c, again averaged over the period from January 2016 to December 2019 inclusive. These bias patterns are reminiscent of the SST bias in Figure 2c, with a near zonal negative bias over the southern Ocean, strong negative bias over Greenland, and positive biases elsewhere. The changes to H500 as a result of the L_SW, joint, and α estimation cases are illustrated in Figures 2d–2f, respectively. Again it is only the L_SW case that is, moving the model biases in the right direction in almost all regions.

The observed forecast error measures from each of the configurations are summarized in Table 3, with the smallest absolute value in each row highlighted in bold type face. For all of the variables assessed in this section, the case in which only L_SW is estimated, has the lowest global MAD. Cases involving the estimation of α, be it individually or in a joint fashion, have a global MAD that is, larger than the baseline for all variables. The L_SW only case also produces the least biased multiyear climate forecasts for all variables, other than the Southern Hemisphere SIC bias. Recall, however, in the in‐sample DA experiments the additional flexibility of estimating α along with L_SW, produced the lowest global MAD for in situ temperature, the lowest bias for SIC, and very similar values to those of the L_SW only case for all of the other error measures. This discrepancy between the in‐sample DA statistics and out‐of‐sample multiyear climate forecasting statistics indicate how inherently intertwined the state and parameter estimation is in the DA. For a given DA experiment, within each cycle the model parameters are estimated such that the evolution of the prognostic variables better fit the observed states, while the prognostic states themselves are also consistently being pushed toward the observations. In general, the best parameter set in which the state is also consistently being corrected (i.e., DA), is not necessarily the best parameters set in which the state is not consistently being corrected (i.e., multiyear forecast). Our results suggest that for the case in which only L_SW is estimated, there is some consistency between parameters optimized for a 28‐day and multiyear forecasts. This is not the case for any experiments including the estimation of α.

3 TableSensitivity of Multiyear Out‐of‐Sample Forecast Error Measures (Averaged From January 2016 to December 2019) to Estimated Parameters

Variable	Bias	Baseline	L_SW only	α only	(L_SW, α)
SST	Global MAD	1.16	0.985	2.06	2.29
	+ve bias	1.19	0.996	2.21	2.45
	−ve bias	−1.07	−0.954	−1.13	−1.10
SIC	Global MAD	0.499	0.454	0.543	0.537
	NH bias	−0.482	−0.453	−0.592	−0.602
	SH bias	0.438	0.351	0.368	0.305
H500	Global MAD	34.0	29.7	69.1	79.7
	+ve bias	35.0	30.3	70.5	81.1
	−ve bias	−13.3	−7.23	−13.3	−12.8

Note. In all cases, the associated parameter variances are σ_SW = 0.02 m and σ_α = 5 × 10⁻⁴, with a DA cycle length of 28 days. The smallest absolute forecast error measure in each row is in bold type face. Numbers are presented to three significant figures.

The degradation of the climate forecasts when adopted either an individually or jointly estimated map of α, could in part be due to the difficulty in estimating α within the sea‐ice zones. In these regions the local effective albedo adopted in the model is dependent upon the sea‐ice concentration. If the sea‐ice concentration for a given ice class is 100% within a given grid box, then the local albedo is determined by some combination of the albedo of ice and snow, depending on the snow cover. If the sea‐ice concentration is 0% within a given grid box, then the local albedo is determined by the albedo of the ocean. So in the sea‐ice zone, not only is the evolution of the state dependent upon the specified parameters, but the effective local parameter value is also dependent upon the state. For example, the baseline run of the GCM is demonstrated as having persistently too little sea‐ice during the summer months. This means that during the summer the DA system persistently adds sea‐ice to the initial conditions at each DA cycle. For a grid box completely covered in sea‐ice, the albedo is set by the reference ice and snow cover albedos, as opposed to the ocean albedo. Only once the sea‐ice has sufficiently melted during the DA window, does the ocean albedo take effect. This switching behavior between the reference albedos weakens the correlations between the parameters and observed states, and hence reduces the increments into the parameter fields. Shortwave penetration depth is arguably less prone to the above issues since it operates both in the presence and absence of sea‐ice. Even when sea‐ice is present, a proportion of the shortwave radiation is transmitted through the sea‐ice. The remaining flux of shortwave radiation then penetrates into the water column according to the shortwave penetration depth parameter. This means that correlations between the ensemble of shortwave penetration depth parameter fields and observed model states is given the opportunity to persist.

Sensitivity to the Assimilation Window Length

We now determine the influence of DA cycle length on the estimation of the parameter maps, and the bias of multiyear climate forecasts adopting these maps. It was found in the previous section that the case in which only the L_SW parameter was estimated produced the overall most robust multiyear forecasts. We, therefore, compare DA experiments estimating the state and only the L_SW parameter using 28, 14, 7, and 3‐day assimilation windows. As one would expect the DA error metrics are strongly dependent upon the DA cycle length, with shorter DA windows having more frequent corrections of the state and hence lower forecast errors. Even more so than in the previous section, a comparison of the DA error metrics does not represent the influence the parameters alone have on the model. The most appropriate metric by which to compare these cases is on the basis of the multiyear climate forecasts, which along with the parameter maps themselves will be the focus of this section.

The globally averaged values of L_SW from the beginning of each DA experiment throughout the following two years, are illustrated in comparison to that of the baseline case (black line) in Figure 5a. Note, the 28‐day case (red line) is the same as the L_SW only case in the previous section. In all cases, no temporally periodic behavior is observed, with the converged values relatively constant throughout the second year of all experiments. The global average of the final converged map is plotted in Figure 5b with respect to the associated DA cycle length. For reference the spatial global average of the time averaged climatological map of the baseline case is included as the dashed line. Across each of the cases, the final values of the globally averaged L_SW first rapidly decreases from 3 to 7‐day cycles, then converges to 23.7 m for the 28‐day case. The final converged parameter maps of L_SW for the 3, 7, 14, and 28‐day cases, are respectively illustrated in Figures 5c–5f. The color bars are again set such that the central contour represents no change from the original ensemble averaged parameter, with dark blue indicating a decrease, and yellow an increase. For the 3‐day cycle, almost the entire southern Ocean has the maximum allowable value of 60 m. As one moves in order of increasing DA cycle lengths from 3 to 28‐days, L_SW in the southern Ocean decreases and become less zonal. In addition to the globally averaged L_SW decreasing with increasing DA cycle, the final parameter maps indicate that L_SW is decreasing in all regions, except in the Arctic where this parameter is increasing.

View Image - 5 Figure. Ensemble averaged estimated LSW in meters for DA experiments of different DA cycle lengths. Globally averaged LSW as a function of: (a) DA analysis time; and (b) DA cycle length for, the final converged map for the estimated cases, and the time averaged climatological maps of the baseline case. Final ensemble averaged parameter maps estimated using DA cycle lengths of: (c) 3‐day; (d) 7‐day; (d) 14‐day; and (e) 28‐day (identical to Figure 1e).

5 Figure. Ensemble averaged estimated LSW in meters for DA experiments of different DA cycle lengths. Globally averaged LSW as a function of: (a) DA analysis time; and (b) DA cycle length for, the final converged map for the estimated cases, and the time averaged climatological maps of the baseline case. Final ensemble averaged parameter maps estimated using DA cycle lengths of: (c) 3‐day; (d) 7‐day; (d) 14‐day; and (e) 28‐day (identical to Figure 1e).

The influence of each of these parameter maps on the model bias is now assessed with application to multiyear climate forecasts. The globally averaged SST MAD is illustrated in Figure 6a as a function of time. The correspondence between the experiments and line colors indicated by the legend is applicable to all plots. As demonstrated in Section 3.3, there is a persistent reduction in global MAD at all times with respect to the baseline case. The reduction is similar for all DA cycle lengths, other than the 3‐day cycle which has greater error the other estimated cases. The meridional distribution of the bias maps is illustrated in Figure 6b by the zonally averaged time average positive (solid) and negative (dashed) bias over the same period adopted in Section 3.3, from January 2016 to December 2019. In terms of the positive bias, the greatest reduction is within the tropics, where the bias appears to be decreasing with increasing DA cycle length. The negative bias in the tropics behaves similarly, with bias in general decreasing with increasing DA cycle length. However, in the southern extra‐tropical region near the sea‐ice edge, the forecasts adopting maps using the shortest DA cycle of 3‐day is the least biased.

View Image - 6 Figure. Sensitivity of forecast error measures to DA cycle length for cases only estimating LSW. SST: (a) globally averaged MAD as a function of time, with the legend applicable to all figures; and (b) zonally and time (January 2016 to December 2019) averaged positive (solid) and negative (dashed) bias as a function of latitude. SIC: (c) globally averaged MAD as a function of time; and (d) meridionally and time averaged (2016–2019) Northern (solid) and Southern (dashed) Hemisphere bias as a function of longitude. H500: (e) globally averaged MAD as a function of time; and (f) zonally averaged and time averaged (2016–2019) positive (solid) and negative (dashed) bias as a function of latitude.

6 Figure. Sensitivity of forecast error measures to DA cycle length for cases only estimating LSW. SST: (a) globally averaged MAD as a function of time, with the legend applicable to all figures; and (b) zonally and time (January 2016 to December 2019) averaged positive (solid) and negative (dashed) bias as a function of latitude. SIC: (c) globally averaged MAD as a function of time; and (d) meridionally and time averaged (2016–2019) Northern (solid) and Southern (dashed) Hemisphere bias as a function of longitude. H500: (e) globally averaged MAD as a function of time; and (f) zonally averaged and time averaged (2016–2019) positive (solid) and negative (dashed) bias as a function of latitude.

Consistent with the latter observation, the 3‐day DA cycle length parameter map produces the lowest globally averaged SIC MAD, as illustrated in Figure 6c. Figure 6d illustrates the longitudinal distribution of sea‐ice bias for the Northern (solid) and Southern (dashed) Hemispheres, averaged over the period from January 2016 to December 2019. In the Southern Hemisphere, and in particular for the region between 50°W and 50°E, the 3‐day parameter map produces the least biased SIC forecasts. While overall there are minimal changes in the Northern Hemisphere with respect to the baseline case, there are some improvements for the sectors 0°E‐50°E, 180°E‐150°E, and 150°W‐180°W, with bias reducing with increasing DA cycle length.

Globally averaged MAD of H500 in the atmosphere, are illustrated for each of the DA cycle length cases as a function of time in Figure 6e. The H500 positive (solid) and negative (dashed) biases are also time averaged over the above mentioned period, zonally averaged, and plotted as a function of latitude in Figure 6f. For practically all periods in time and all points in space (aside from south of 60°S) the estimated cases have reduced bias with respect to the baseline case, and this bias reduces as the DA cycle length increases.

The above discussed multiyear forecasts are summarized in Table 4, containing the global MAD and biases spatially and temporally averaged over the period from January 2016 to December 2019. The biases listed in the table are essentially the spatial averages of the associated plots in the right column of Figure 6, and the listed values of global MAD are the temporal averages of the plots in the left column. In each row the smallest absolute error measure is indicated in bold type face. The 28‐day DA cycle case produces the least biased forecasts in terms of SST global MAD, H500 global MAD, and biases of, positive SST, Northern Hemisphere SIC, and positive H500. The 3‐day DA cycle map has the lowest global SIC MAD, and positive Southern Hemisphere SIC bias (model with too little sea‐ice), with the 7‐day map having the smallest negative H500 bias and negative SST (model too warm). These biases are intimately linked since the regions of negative SST and H500 bias are predominantly around the Southern Hemisphere sea‐ice edge—see Figures 2c, 3c, and 4c. Arguably, these shorter DA cycle maps out to perform the other longer DA cycles in this region, because the dynamics and pertinent biases are of shorter timescale. Elsewhere, particularly in the tropics and mid‐troposphere, the dominant dynamics are longer in timescale, and the longest DA cycle length of 28‐days produces the least biased forecast. The selection of cycle length in the DA experiment sets the timescale over which the ETKF estimates the state and parameters in order to minimize the distance between the model forecasts and observations. In the context of using these parameter maps in out‐of‐sample multiyear climate forecasts, the DA cycle length sets the timescale over which bias is attempted to be reduced. Since the climate system is multiscale, and the models and parameterizations are imperfect, the timescale of the dominant model bias is not the same at all points in space. This is what we have observed with shorter cycle maps producing a less biased model around the Southern Hemisphere sea‐ice edge, and the 28‐day map producing a less biased model elsewhere for the quantities assessed within. As an aside, for the situation in which one has a perfect model of a multiscale system with uncertain parameters, it is possible to determine one set of parameters that minimizes bias across all timescales (Berry & Harlim, 2014), but that is not the scenario we have here.

4 TableSensitivity of Multiyear Out‐of‐Sample Forecast Error Measures (Averaged From January 2016 to December 2019) to DA Cycle Length, With Reference to the Baseline Forecast

Variable	Bias	Baseline	3‐day	7‐day	14‐day	28‐day
SST	Global MAD	1.16	1.17	1.03	0.999	0.985
	+ve bias	1.19	1.23	1.06	1.02	0.996
	−ve bias	−1.07	−0.938	−0.929	−0.932	−0.954
SIC	Global MAD	0.499	0.436	0.440	0.441	0.454
	NH bias	−0.482	−0.481	−0.466	−0.460	−0.453
	SH bias	0.438	0.249	0.296	0.307	0.351
H500	Global MAD	34.0	38.8	32.4	29.8	29.7
	+ve bias	35.0	39.6	33.0	30.4	30.3
	−ve bias	−13.3	−8.90	−7.22	−8.36	−7.23

Note. In all cases, the estimated parameter is L_SW with a parameter variance of σ_SW = 0.02 m. The smallest absolute forecast error measure in each row is in bold type face. Numbers are presented to three significant figures.

Modified Representation of the Physical Processes

In the previous section, the L_SW 28‐day case was assessed to have the lowest error measures with respect to the observations in the majority of regions, and on this basis arguably best represents the Earth system. Using this configuration, we compare the representation of the ocean mixed layer and surface heat fluxes to those of the original baseline case.

The ocean mixed layer is a region within which density (as a function of temperature and salinity) is nearly uniform. Turbulence acts to homogenize this region, via physical processes including wind and buoyancy induced mixing, respectively akin to Kelvin‐Helmholtz and Rayleigh‐Taylor type instabilities. We calculate the mixed layer depth (MLD) as the vertical position at which the buoyancy difference with respect to the surface is equal to the critical buoyancy value of 3 × 10⁻⁴ ms⁻². The climatologically and ensemble averaged MLD is illustrated in Figure 7. The left column represents the MLD of the forecasts using the estimated map of L_SW. Figure 7a contains the average MLD over all months within the period from January 2016 to December 2019, and also ensemble averaged across all 96 members, for a total of 4,608 samples (4 years × 12 months × 96 members). Figures 7c, 7e, 7g, and 7i, respectively contain the average MLD for: December, January, and February (DJF); March, April, and May (MAM); June, July, and August (JJA); and September, October, and November (SON). Each climatological map is, therefore, calculated using a total of 1,152 monthly averaged samples (4 years × 3 months × 96 members). These figures demonstrate the seasonality of MLD associated with the deepening of the mode waters in the Northern and Southern Hemispheres during their respective winter and spring periods.

View Image - 7 Figure. Mixed layer depth (in meters) of the LSW 28‐day case averaged from 2016 to 2019 for: (a) all months; (c) December, January, and February; (e) March, April, and May; (g) June, July, and August; and (i) September, October, and November. Averaged mixed layer depth of the LSW 28‐day case minus that of the baseline case for: (b) all months; (d) December, January, and February; (f) March, April, and May; (h) June, July, and August; and (j) September, October, and November. In the right column the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

7 Figure. Mixed layer depth (in meters) of the LSW 28‐day case averaged from 2016 to 2019 for: (a) all months; (c) December, January, and February; (e) March, April, and May; (g) June, July, and August; and (i) September, October, and November. Averaged mixed layer depth of the LSW 28‐day case minus that of the baseline case for: (b) all months; (d) December, January, and February; (f) March, April, and May; (h) June, July, and August; and (j) September, October, and November. In the right column the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

The modification to the climate model as a result of the estimated parameters is highlighted by taking differences with respect to the baseline multiyear climate forecast. Figure 7b illustrates the difference between the MLD in Figure 7a and the MLD of the ensemble average baseline case calculated over the same period. As in Section 3.3, the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level. The change in MLD bears a striking resemblance to the associated estimated map of L_SW in Figure 5f, where in general larger (smaller) L_SW distributes heat to greater (lesser) depths in the ocean, which results in a deeper (shallower) ocean mixed layer. The difference in MLD is decomposed into the respective seasons in Figures 7d, 7f, 7h, and 7j. These figures illustrate that, while the parameter map L_SW is time and state independent, its influence upon the model is not. The change in MLD is clearly a function of the ocean state, with larger changes coinciding with deeper mixed layers at the higher latitudes of the Northern (Southern) Hemisphere during the winter and spring months of December to May (June to November).

The components of the air‐sea heat fluxes averaged over the period from January 2016 to December 2019 and all months are illustrated in Figure 8. The left column represents the heat fluxes of the forecasts using the estimated map of L_SW. In these plots positive (negative) fluxes indicate heat entering (leaving) the ocean. Figure 8a illustrates shortwave heat flux, which is incoming and largest in the tropics. This is because this term is dominated by the incoming solar radiation from the Sun, which is propagated through the atmospheric energy balance model before arriving at the sea‐surface. The longwave radiation is illustrated in Figure 8c, which is outgoing and proportional to $ϵ σ T_{0}^{4}$ , where σ is the Stefan‐Boltmann coefficient, T₀ the SST, and ϵ is the surface emissivity (assumed to be unity in this model). The sensible and latent heat fluxes are illustrated in Figures 8e and 8g, respectively, both of which are parameterized using the boundary layer similarity theory of Monin and Obukov (1954). The sensible heat flux is proportional to the negative of the local vertical temperature gradient, and is predominately outgoing other than a region in the extra‐tropical southern Ocean between 30°E and 60°E. An incoming sensible heat flux occurs where the local air temperatures are warmer than SST. The latent heat flux is outgoing and of largest magnitude in the tropics, which is a region of significant large scale evaporation. The net heat flux is the sum of the aforementioned terms and illustrated in Figure 8i.

View Image - 8 Figure. Surface heat fluxes (in W/m2) of the LSW 28‐day case averaged from 2016 to 2019 for: (a) shortwave radiation; (c) longwave radiation; (e) sensible heat flux; (g) latent heat flux; and (i) net heat flux. Averaged surface heat fluxes of the LSW 28‐day case minus that of the baseline case for: (b) shortwave radiation; (d) longwave radiation; (f) sensible heat flux; (h) latent head flux; and (j) net heat flux. In the right column the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

8 Figure. Surface heat fluxes (in W/m2) of the LSW 28‐day case averaged from 2016 to 2019 for: (a) shortwave radiation; (c) longwave radiation; (e) sensible heat flux; (g) latent heat flux; and (i) net heat flux. Averaged surface heat fluxes of the LSW 28‐day case minus that of the baseline case for: (b) shortwave radiation; (d) longwave radiation; (f) sensible heat flux; (h) latent head flux; and (j) net heat flux. In the right column the nonhatched regions indicate zones in which the difference between the baseline and estimated case is statistically different from zero to within a 99.9% confidence level.

The change in the associated fluxes with respect to the baseline case are illustrated in the right column of Figure 8. Again the nonhatched zones in the right‐hand‐side plots, indicate regions in which the difference between the two forecasts is greater than zero to within a 99.9% confidence level. In these plots positive (negative) fluxes indicate that the heat flux in the estimated case is larger (smaller) than the baseline case, with all color bars having the same limits to facilitate a direct comparison. It is clear that the changes in the shortwave radiation in Figure 8b and latent heat transfer in Figure 8h contribute the most to the net heat transfer changes in Figure 8j. Most of the statistically significant changes to the net heat fluxes are in the most variable regions of the resolved ocean, including the ocean storm tracks, gulf stream, Kuroshio, outcropping regions and tropical instability waves. By virtue of this high resolved variability, ensemble spread is larger, which the Kalman filter interprets as larger model uncertainty, and hence applies larger associated increments to the parameter field.

Concluding Remarks

Using an ETKF, spatiotemporally varying parameter maps of ocean albedo and the incoming shortwave radiation penetration depth are estimated both individually and simultaneously on the basis of short term (3–28 days) forecasts of a coupled climate atmosphere, ocean and sea‐ice general circulation model. The parameter estimation is shown to improve the fit of the DA system to a network of real world observations of the Earth system. However, only the case in which shortwave e‐folding length scale was estimated alone, produced out‐of‐sample multiyear climate forecasts with systematically reduced bias. Multiyear climate forecasts using the estimated maps were also shown to make statistically significant changes to the ocean mixed layer depth and air‐sea heat fluxes as compared to the baseline case. To our knowledge, this is the first attempt at undertaking parameter estimation on GCMs of this size and complexity.

The selection of cycle length in the DA experiment was also found to set the timescale over which the model bias was minimized. For example, the 3‐day map produced the least biased model around the Southern Hemisphere sea‐ice edge governed by fast surface dynamical processes. The 28‐day map produced the least biased model in regions where the dominant dynamics are of longer time scales, as in the tropics and mid‐tropospheric atmosphere. This indicates that for an incomplete and imperfect model with lead time dependent biases, one cannot minimize bias across all timescales at once, and a decision must be made on what is the pertinent timescale of interest for the application at hand. The experiments also highlighted that the parameter values required to minimize the biases over the selected timescale do not necessarily resemble the parameter values observed in nature.

While having non‐physical parameter values is common practice in the field of turbulence modeling, we acknowledge that the ocean albedo and shortwave length scale are presumably not the optimal selection of parameters. An accurate representation of these parameters can improve the model performance and general representation, but it cannot resolve all model discrepancies. As outlined in Equation 4, the ETKF modifies both the state and parameters on the basis of how far the model is away from the observations. The ETKF cannot distinguish between the various model deficiencies, be them how the shortwave distribution is represented in the ocean, the influence of numerical diffusion, or the parameterization of ocean mixing. So while we have estimated the shortwave length scale, L_SW, we may well have been using it to address other model deficiencies, as opposed to the representation of shortwave per se. The ocean albedo controls how much radiation enters into the ocean at the interface, and hence cannot directly address deficiencies in the representation of the ocean interior. The fact that out‐of‐sample multiyear forecast bias increased when adopting the estimated albedo maps, and bias decreased when adopting only the estimated L_SW maps, suggests that the distribution of heat within the ocean is a larger source of bias than the ocean heat uptake.

There are clearly many possible avenues to pursue in order to achieve further improvements. As opposed to adopting a globally constant value for the shortwave proportion of total incoming radiation, one could undertake a joint estimation of this parameter along with L_SW, which may well result in different and perhaps more physically consistent parameter maps. One could also adopt, and potentially estimate parameters within, more sophisticated optical schemes which calculate the ocean optical properties from the light absorption characteristics of the various marine bio‐geo‐chemical constituents. Such models have been shown to indirectly influence sea‐ice concentration via changes to SST (Lengaigne et al., 2009; Patara et al., 2012), and also feed back onto the ocean stratification and vertical mixing (Oschlies, 2004). Alternatively, one could address the vertical mixing directly, and hence better represent the nonsolar turbulent heat fluxes within the ocean interior. The estimated large values of shortwave penetration length in the Antarctic Circumpolar Current could also be indicative of insufficient ocean mixing. As mentioned in Section 2.3, one of the terms in the KPP vertical mixing scheme is proportional to properties of the oceanic boundary layer. This terms acts to nonlocally redistribute the surface heat flux throughout the boundary layer (Van Roekel et al., 2018). Enhanced nonlocal mixing would, therefore, have a similar influence on the ocean temperature distribution as increasing the shortwave penetration length. Note, the KPP scheme also acts directly on salinity which the solar heat fluxes do not. Similar or potentially greater improvement could be achieved by adopting the baseline ocean shortwave length scale climatological maps, and instead estimating ocean mixing parameters. All of the above points are currently speculation, elements of which will be the focus of future research.

Finally, the DA and parameter estimation framework developed within has demonstrated a means of improving the out‐of‐sample skill of a given coupled GCM comprising of its own specific model biases. The further one can reduce such biases, through either improved numerical methods and/or better functional representations of unresolved processes, the more representative the estimated parameters would be of actual physical properties as opposed to compensating for model deficiencies. As such, we believe our results justify the continued research into the physical understanding of the Earth system and associated model development.

Acknowledgments

The authors were supported by the Australian Commonwealth Scientific and Industrial Research Organisation Decadal Forecasting Project (research.csiro.au/dfp), and acknowledge the Australian National Computational Infrastructure and Pawsey supercomputing centre for providing the computational resources. The authors would also like to acknowledge Pavel Sakov, for his advice and many discussions during the development of this manuscript. We also appreciate the efforts of the editor and reviewers, which we believe contributed to a superior and more impactful manuscript.

The coupled GCM is as described in Delworth et al. (2006), with modifications, model structure and configuration presented in Sections 2.3 and 2.4. The EnKF‐C DA code (version 1.107.7) is documented in Sakov (2014), with the algorithms and settings also described in Sections 2.1, 2.2, and 2.6.

Data Availability Statement

All of the observational data ingested into the DA system are appropriately cited in Section 2.5. The postprocessing and visualization of the results are all fully described preceding the associated figures and tables.

Word count: 12646

Show less

© 2021. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Coupled general circulation models (GCM), and their atmospheric, oceanic, land, and sea‐ice components have many parameters. Some parameters determine the numerics of the dynamical core, while others are based on our current understanding of the physical processes being simulated. Many of these parameters are poorly known, often globally defined, and are subject to pragmatic choices arising from a complex interplay between grid resolution and inherent model biases. To address this problem, we use an ensemble transform Kalman filter, to estimate spatiotemporally varying maps of ocean albedo and shortwave radiation e‐folding length scale in a coupled climate GCM. These parameters are designed to minimize the error between short term (3–28 days) forecasts of the climate model and a network of real world atmospheric, oceanic, and sea‐ice observations. The data assimilation system has an improved fit to observations when estimating ocean albedo and shortwave e‐folding length scale either individually or simultaneously. However, only individually estimated maps of shortwave e‐folding length scale are also shown to systematically reduce bias in longer multiyear climate forecasts during an out‐of‐sample period. The bias of the multiyear forecasts is reduced for parameter maps determined from longer DA cycle lengths.

Details

Title

Ensemble Kalman Filter Parameter Estimation of Ocean Optical Properties for Reduced Biases in a Coupled General Circulation Model

Author

Kitsios, V¹

; Sandery, P²; T. J. O’Kane²; Fiedler, R²

¹ CSIRO Oceans and Atmosphere, Aspendale, VIC, Australia; Department of Mechanical and Aerospace Engineering, Laboratory for Turbulence Research in Aerospace and Combustion, Monash University, Clayton, VIC, Australia
² CSIRO Oceans and Atmosphere, Battery Point, TAS, Australia

Section

Research Article

Publication year

2021

Publication date

Feb 2021

Publisher

John Wiley & Sons, Inc.

e-ISSN

19422466

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.1029/2020MS002252

ProQuest document ID

2492673773

Ensemble Kalman Filter Parameter Estimation of Ocean Optical Properties for Reduced Biases in a Coupled General Circulation Model

Jump to:

Full Text

Abstract

Details

Suggested sources