1 Introduction

Autonomous observing systems, such as profiling floats, sample the global ocean at much higher spatial and temporal resolution than traditional ship-based hydrography . The more than 2 million profiles collected by autonomous Argo floats as part of the “Core Argo” array over the past 2.5 decades have revolutionized our understanding of the spatiotemporal variability in physical ocean properties in the top 2000 m of the water column (e.g., ). Complementing the successful implementation of a global physical float array, a Global Ocean Biogeochemistry (GO-BGC) Array will be implemented over the upcoming years and decades as part of the “BGC Argo” array . In recent years, the deployment of $>$ 250 floats carrying biogeochemical and biological sensors within the Southern Ocean Carbon and Climate Observations and Modeling project has already permitted the first quantification of spatiotemporal variability in carbon, nutrients, and oxygen in the Southern Ocean , promising similar scientific advances on the global scale. Similarly, the successful deployment of regional pilot “Deep Argo” arrays, with floats sampling down to 6000 m, demonstrates the possibility of expanding this technology to the sampling of the whole water column . All three arrays are part of the international “One Argo” program . While the advent of Argo floats represents a major step forward in the global ocean observing system with many potential uses, important uncertainties remain due to, e.g., the specific spatiotemporal coverage of the resulting dataset and uncertainties in determining float positions.

Most Argo floats profile the upper ocean approximately every 10 d, drifting with ocean circulation in between profiles at a parking depth of $\sim$ 1000 dbar and being localized via the Global Positioning System (GPS) when transmitting data upon surfacing (Fig. and ). As a float's position is not known while drifting at the parking depth, its exact trajectory in between profiles cannot be resolved, which directly affects float-based velocity estimates . Similarly, since floats do not surface in the presence of sea ice to avoid sensor damage and total float loss , they currently cannot be routinely localized while under sea ice, which limits the utility of Argo floats in high-latitude polar regions .

Figure 1

Sketch of a synthetic vs. a real-world biogeochemical float. (a) A synthetic biogeochemical float in E3SMv2 with a 1 d sampling cycle, i.e., drifting at a parking depth of 1000 m for a day and then instantaneously sampling any model tracer or diagnostic throughout the water column in open waters and under sea-ice cover. (b) A biogeochemical Argo float with a 10 d sampling cycle, i.e., drifting at a parking depth of 1000 m for about 9 d, then descending to 2000 m and sampling temperature, salinity, oxygen, nitrate, pH, chlorophyll, particulate backscatter, and irradiance in the upper 2000 m while ascending. This float relies on localization via GPS, such that its exact position is unknown while under ice. Panel (b) is adapted from .

[Figure omitted. See PDF]

Data from the Core Argo and BGC Argo floats may still not fully characterize the spatiotemporal variability in upper ocean biogeochemical properties with high certainty despite the unprecedented spatiotemporal coverage of the global ocean. As an example, the projected float density in BGC Argo (1000 floats; ) corresponds to approximately one float per 6° $\times$ 6° grid cell. This target spacing is much larger than the typical spatial scales of variability in, e.g., phytoplankton chlorophyll ( $<$ 100 km; see ) and $p$ CO $_{2}$ ( $<$ 1 km; see ), complicating the extrapolation from individual float-based observations to larger spatial scales. The 10 d sampling interval is also longer than the turnover time of phytoplankton biomass of 2–6 d . Similarly, biogeochemical properties such as nutrients and carbonate chemistry are known to exhibit variability on sub-kilometer spatial scales and on timescales shorter than 10 d , e.g., due to the diurnal cycle , tides , or ocean weather . Altogether, this complicates the float-based detection of any trends in phytoplankton dynamics and carbon cycling.
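The mismatch between float spacing and the scales of variability can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a global ocean surface area of about 3.61 $\times$ 10$^{8}$ km$^{2}$ and perfectly even float coverage; both are illustrative assumptions that are not taken from this paper, and the function name is hypothetical.

```python
import math

OCEAN_AREA_KM2 = 3.61e8   # approximate global ocean surface area (assumed value)
KM_PER_DEGREE = 111.2     # approximate length of one degree of latitude
N_FLOATS = 1000           # BGC Argo target quoted in the text


def nominal_spacing_km(n_floats, area_km2=OCEAN_AREA_KM2):
    """Side length of the square each float would cover if floats were evenly spread."""
    return math.sqrt(area_km2 / n_floats)


spacing_km = nominal_spacing_km(N_FLOATS)
spacing_deg = spacing_km / KM_PER_DEGREE
print(f"nominal spacing: {spacing_km:.0f} km (~{spacing_deg:.1f} degrees of latitude)")
```

Under these assumptions, the nominal spacing is on the order of 600 km, i.e., several times the $<$ 100 km scale quoted above for chlorophyll variability.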

Until the Deep Argo float program is fully operational, the majority of the ocean volume will remain undersampled . As a result, the observation-based detection of changes in deep-ocean heat and oxygen content and in deep-ocean ventilation is still mostly based on scarce hydrographic observations , which introduces large uncertainties when extrapolating to the global scale. All Deep Argo floats will be equipped with sensors to measure temperature and salinity, but the fraction of floats onto which an oxygen sensor will be mounted is uncertain . While available funding will ultimately limit both the number of floats to be deployed and the number of floats including an oxygen sensor, dedicated studies assessing the impact of float density on our ability to track large-scale changes in both deep-ocean temperature and oxygen are lacking.

Overall, in the absence of knowledge on the true distribution of physical, biogeochemical, and biological properties, uncertainties stemming from sampling frequency and density as well as imprecise localization are difficult to address with float-based observations alone, as it remains unknown how representative a given float profile is for a wider area or a longer timescale . Model simulations are one approach to address these uncertainties. In particular, numerical models with synthetic observing systems can provide a known truth for the global distribution of any physical, biogeochemical or biological tracer, so that such models can be used as an ideal test bed to address uncertainties in sampling network design. In the past, this approach has been used to assess variability in oceanic heat content , salinity distributions , the global oceanic carbon sink , and chlorophyll concentrations , demonstrating the wide range of possible scientific applications. In general, synthetic observations can be extracted either offline from time-averaged model output or online during model run time. Most published studies extracted the synthetic observations offline (e.g., ). This approach is storage-intensive, as model fields need to be stored at high temporal frequency (often at least daily) because real-world observations always represent snapshots of ocean properties rather than time averages, leading to higher uncertainties if lower-frequency (e.g., monthly) model fields are used to extract the synthetic observations. While this offline extraction of synthetic observations offers the advantage of being easily applicable to any model with high-enough frequency output available, extracting synthetic observations online during the model run time eliminates the uncertainty associated with assessing time-averaged model output, as such synthetic observations provide the same snapshot view of the modeled ocean as real-world observing systems do of the real ocean. Yet, since this approach requires substantial modifications of the model code, few models have such capabilities to date .

Here, we present the new synthetic biogeochemical float capabilities (LIGHT-bgcArgo-1.0) of the Energy Exascale Earth System Model version 2 (E3SMv2). These capabilities build on the Lagrangian in Situ Global High-Performance Particle Tracking (LIGHT) module . To more closely resemble real-world Argo floats, the synthetic floats sample the model fields online during model run time, which facilitates a more realistic assessment of what floats truly see when they sample the ocean. The number and distribution of the synthetic floats, the sampling frequency, and the sampled variables are defined by the end user before the start of the model experiment. After describing the implementation of synthetic floats into E3SMv2 in more detail in the following section, we will present its utility for physical, biogeochemical, and biological research questions with several case studies. These case studies address critical uncertainties related to float sampling networks, i.e., quantifying the impact of (i) sampling density on the float-derived detection of deep-ocean change in temperature or oxygen and on float-derived estimates of phytoplankton phenology, (ii) sampling frequency and sea-ice cover on float trajectory lengths and hence float-derived estimates of current velocities, and (iii) short-term variability in ecosystem stressors on estimates of seasonal variability.

2 Methods

2.1 Model description: E3SMv2 with synthetic biogeochemical floats (E3SMv2-LIGHT-bgcArgo-1.0)

We implement the synthetic biogeochemical float capabilities, LIGHT-bgcArgo-1.0, into the ocean component of the Energy Exascale Earth System Model version 2 (E3SMv2; ). The physical component of E3SMv2 consists of the E3SM Atmosphere Model version 2 (EAMv2), the E3SM Land Model version 2 (ELMv2), the Model for Prediction Across Scales – Ocean (MPAS-O), and the Model for Prediction Across Scales – Sea Ice (MPAS-Seaice; ). Both MPAS-O and MPAS-Seaice are run on unstructured multiresolution model grids, allowing for enhanced model resolution in selected regions . While we assess an ocean sea-ice-only simulation in this study, i.e., a simulation without coupled atmosphere and land model components (see Sect. ), we note that the new technical development described below can equally be used in the fully coupled mode. Ocean biogeochemistry is described by the Marine Biogeochemistry Library (MARBL; ), which is based on the Biogeochemical Elemental Cycling module (BEC; ). MARBL describes the biogeochemical cycling of carbon, nitrogen, silicon, phosphorus, iron, and oxygen and allows flexible lower-trophic-level ecosystem configuration . Lastly, sea-ice biogeochemistry is represented by four dissolved inorganic nutrients (silicate, nitrate, ammonium, and iron), dissolved organic nitrogen, and two phytoplankton groups, i.e., diatoms and small flagellates (MPAS-Seaice zbgc; ).

The implementation of synthetic biogeochemical floats in E3SMv2 builds on the online Lagrangian, in Situ, Global, High-Performance Particle Tracking (LIGHT) module developed for MPAS-O . For our simulations, LIGHT particles are seeded at a depth of 1000 m and advected laterally with ocean circulation using a second-order Runge–Kutta scheme; unlike in previous studies, particles are not permitted to move vertically. During the simulation, the virtual floats instantaneously sample the whole water column at their current location and with a prescribed frequency, e.g., daily or every 10 d. The synthetic floats are thus not subject to lateral current displacement during ascent or descent (cf. synthetic and Argo floats in Fig. ). Further, in contrast to Argo floats, whose position can only be registered upon surfacing in ice-free waters, the position of synthetic floats in E3SMv2 is known at all times (Fig. ). In general, any prognostic or diagnostic physical or biogeochemical model variable can be recorded by the synthetic floats, and the sampled variables are bilinearly interpolated to the float's current location. Inclusion of profiling floats increased the computational cost of the simulation by about 50 % and scaled approximately linearly with the numbers of processors, floats, and variables. However, we note that for the proof-of-concept simulation assessed in this study (see Sect. ), no attempt was made to optimize the new code's performance. In particular, interpolation weights from biogeochemical tracer locations to particle locations were unnecessarily recalculated for every tracer, which certainly caused significant slowdown. The distribution of virtual particles to be seeded (or floats to be deployed), the sampling frequency, and the variables to be recorded are defined by the user prior to the model simulation to best align with a given application or research question.
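The two numerical ingredients named above, second-order Runge–Kutta advection and bilinear interpolation, can be illustrated with a minimal sketch. It is not the LIGHT implementation: it assumes a planar domain, a regular grid with unit spacing (MPAS-O uses an unstructured mesh), and the midpoint variant of the second-order Runge–Kutta family, which the text does not specify. All function names are hypothetical.

```python
import numpy as np


def rk2_step(pos, velocity, dt):
    """Advance a 2-D position by one midpoint (second-order Runge-Kutta) step."""
    pos = np.asarray(pos, dtype=float)
    k1 = velocity(pos)                 # velocity at the start of the step
    midpoint = pos + 0.5 * dt * k1     # provisional half step
    k2 = velocity(midpoint)            # velocity at the midpoint
    return pos + dt * k2


def bilinear(field, x, y):
    """Bilinearly interpolate field[j, i] (unit grid spacing) to the point (x, y)."""
    i0 = int(np.clip(np.floor(x), 0, field.shape[1] - 2))
    j0 = int(np.clip(np.floor(y), 0, field.shape[0] - 2))
    fx, fy = x - i0, y - j0
    return ((1 - fx) * (1 - fy) * field[j0, i0]
            + fx * (1 - fy) * field[j0, i0 + 1]
            + (1 - fx) * fy * field[j0 + 1, i0]
            + fx * fy * field[j0 + 1, i0 + 1])


def solid_body(pos, omega=1.0):
    """Analytic test flow: solid-body rotation about the origin."""
    x, y = pos
    return np.array([-omega * y, omega * x])


# Advect one particle for roughly one revolution; it should stay on the unit circle.
pos = np.array([1.0, 0.0])
for _ in range(628):
    pos = rk2_step(pos, solid_body, dt=0.01)
radius = float(np.hypot(pos[0], pos[1]))

# Sample a linear tracer field at an off-grid location.
tracer = np.arange(12, dtype=float).reshape(3, 4)   # tracer[j, i] = 4 j + i
sample = bilinear(tracer, 1.5, 0.5)
```

Because the interpolation weights depend only on the particle position, they could be computed once per float and reused for every tracer, which is the optimization the text identifies as missing in the proof-of-concept code.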

2.2 Model setup and simulation

For this study, we use the ocean–ice version of E3SM, i.e., MPAS-O and MPAS-Seaice, each with their corresponding biogeochemical module (MARBL and MPAS-Seaice zbgc; see Sect. ). In this version, the ecosystem within MARBL consists of three phytoplankton functional types (diatoms, diazotrophs, and a mixed small phytoplankton group) and a single zooplankton functional type. To demonstrate the functionality of the new synthetic float tool, we conduct a 6-year model simulation from 2012 to 2017, i.e., overlapping with the SOCCOM period starting in 2014 . In our simulation, the model grid has a horizontal resolution that ranges from $\sim$ 30 km in the tropics and the high latitudes to $\sim$ 60 km in the subtropics (Fig. ), and includes 60 $z$ -star levels in the vertical, i.e., the vertical coordinate system varies with changes in the local water-column thickness in response to sea-surface height variability (EC30to60E2r2 mesh; ). The simulation is forced with 3 h atmospheric data from the Japanese atmospheric reanalysis version 1.4 (JRA; ). All model fields are initialized from an existing unpublished simulation using the same model grid and atmospheric forcing data as the 6-year simulation analyzed in this paper . Model tracers in this existing simulation were initialized in the same manner as in . The simulation was spun up from 1750 to 1957 using repeat-year atmospheric and river runoff forcing derived from the period July 1984 to June 1985, with atmospheric CO $_{2}$ concentrations held constant at 284 ppm between 1750 and 1850 and increasing according to historical records thereafter . Starting in 1958, interannually varying JRA forcing is enabled and run through 2012, after which point the atmospheric CO $_{2}$ is held at a constant value of 405 ppm through 2017. Nutrient inputs with river discharge are invariant in time and taken from the GNEWS model. The model uses climatological fields for atmospheric deposition of dust, iron, and nitrogen . Eulerian model output is stored at monthly frequency for all variables; daily output is stored for phytoplankton and zooplankton biomass.

Figure 2

(a) Distribution of deployed synthetic floats in E3SMv2 and Core and biogeochemical Argo floats between 2012 and 2017. Blue dots on the map indicate the initial positions of synthetic floats seeded in the deep ocean ( $>$ 2000 m). On the sides of the map, the number of deployed synthetic floats per degree longitude (top) and latitude (right) is shown in blue, with the corresponding numbers for Core Argo and biogeochemical Argo floats denoted as the gray and red lines, respectively. Note the different axis scale for the biogeochemical Argo floats (red scale; synthetic and Core Argo floats are shown on the black scale). (b) Zonal average model resolution in kilometers in the E3SMv2 test simulation with synthetic floats analyzed in this paper. The shaded area denotes the latitudinal range for which the model resolution is eddy-permitting, i.e., where it is higher than the Rossby radius of deformation .

[Figure omitted. See PDF]

Synthetic biogeochemical floats (a total of 10 560) are deployed on 1 January 2012. Initially, floats are located at every grid cell vertex at a depth of 1000 m, then are successively culled so that a specified number of cells separates each float. Due to the multiresolution model grid of MPAS-Ocean, the resulting density of synthetic floats varies in space. Only floats seeded in the open ocean away from continental shelves and slopes, i.e., at a water depth $>$ 2000 m, are retained for the analysis in this study (8739 floats; see Fig. a). The synthetic float density is up to 4 times higher than the density of Core Argo floats deployed for the same period in the (sub)tropics (especially in the Pacific and Indian sectors) and is comparable to it elsewhere (Fig. a). For all latitudes and longitudes, the synthetic float density exceeds that of Deep Argo (not shown) and BGC Argo (red lines; note the different scale; Fig. a). The synthetic floats instantaneously sample the whole water column every day at midnight Greenwich Mean Time. In addition to every float's position at sampling, the following model variables are recorded: temperature, salinity, dissolved inorganic carbon (DIC), alkalinity, nitrate, silicic acid, phosphate, oxygen, total phytoplankton carbon biomass, total phytoplankton chlorophyll, zooplankton carbon biomass, O $_{2}$ consumption, O $_{2}$ production, the zonal- and meridional-resolved velocity components, sea-ice fraction, mixed-layer depth (as determined with the 0.03 kg m $^{- 3}$ density criterion), sea-level pressure, surface partial pressure of CO $_{2}$ ( $p$ CO $_{2}$ ) in seawater, difference in $p$ CO $_{2}$ between the ocean and the atmosphere, atmospheric CO $_{2}$ concentration, air–sea CO $_{2}$ flux, and air–sea O $_{2}$ flux. We calculate pH on each float profile offline using the Python routines to model the ocean carbonate system (mocsy v2.0; code was obtained from https://github.com/jamesorr/mocsy on 19 July 2023; ). We use DIC, alkalinity, potential temperature, salinity, silicic acid, phosphate, and sea-level pressure from each float profile as inputs for the mocsy functions.
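As an illustration of the density criterion mentioned above for the mixed-layer depth, the following sketch finds the depth at which density first exceeds its near-surface value by 0.03 kg m $^{- 3}$ . The 10 m reference depth and the linear interpolation between levels are assumptions made here for illustration; the text only specifies the threshold, and the model's own diagnostic may differ. The function name is hypothetical.

```python
import numpy as np


def mixed_layer_depth(depth, sigma, threshold=0.03, ref_depth=10.0):
    """Depth (m) at which density first exceeds its reference-depth value by `threshold`.

    depth: increasing depths in meters; sigma: density (kg m-3) at those depths.
    """
    depth = np.asarray(depth, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    sigma_ref = np.interp(ref_depth, depth, sigma)   # density at the reference depth
    excess = sigma - (sigma_ref + threshold)         # >= 0 once the criterion is met
    below = np.where((depth > ref_depth) & (excess >= 0))[0]
    if below.size == 0:
        return float(depth[-1])                      # criterion never met in this profile
    k = below[0]
    d0, d1 = depth[k - 1], depth[k]
    e0, e1 = excess[k - 1], excess[k]
    if e0 >= 0:
        return float(d1)                             # no sign change to interpolate across
    # Linear interpolation to the depth where the excess crosses zero.
    return float(d0 - e0 * (d1 - d0) / (e1 - e0))


depth = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
sigma = [25.00, 25.00, 25.01, 25.02, 25.06, 25.30]
mld = mixed_layer_depth(depth, sigma)   # crossing lies between 30 and 40 m
```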

3 Results and discussion

3.1 Evaluation of synthetic float velocity, temperature, salinity, and nitrate in E3SMv2

We evaluate the synthetic float capabilities in E3SMv2 in two ways: (1) by comparing the synthetic float data to the full Eulerian model output, we ensure the sampling by synthetic floats technically functions as intended and is sufficient in terms of spatiotemporal coverage, and (2) by comparing the synthetic float data to Core Argo data , we evaluate the extent to which the new synthetic observing network can be used for real-world applications. Specifically, we evaluate whether ocean currents at 1000 m, i.e., at the float parking depth, are adequately represented and whether environmental variables, such as temperature, salinity, and nitrate, are adequately simulated in E3SMv2 for realistic float-based sampling.

The simulated pattern of current velocities in E3SM agrees with an observation-based estimate, but current speeds are overall biased low in the model. In E3SM, current velocities at 1000 m are highest in the Antarctic Circumpolar Current and in the subpolar North Atlantic off the southeast coast of Greenland (locally $>$ 8 cm s $^{- 1}$ ), with velocities of less than 3 cm s $^{- 1}$ elsewhere (Fig. a). Figure b shows a Lagrangian-based velocity estimate derived from all 10 d synthetic float positions that were averaged within 3° $\times$ 3° boxes. The spatial patterns and magnitudes in current velocity produced by the Eulerian model output are largely captured by the synthetic floats (cf. Fig. a and b). The equatorial regions are the only exception, for which the Lagrangian E3SMv2 estimate suggests higher velocities (up to 4 cm s $^{- 1}$ ) than the Eulerian estimate (up to 2 cm s $^{- 1}$ ). This implies substantial variability in current speeds at 1000 m in E3SMv2 at sub-monthly timescales that is not captured by the Eulerian time-averaged output. Quantitatively, using bilinear interpolation to align the average Eulerian velocity field to the same 3° $\times$ 3° grid of the Lagrangian velocity estimate, the Pearson correlation coefficient between the two fields amounts to 0.40, and the area-weighted mean bias is 0.21 cm s $^{- 1}$ . In comparison to Argo-derived current speeds , velocities in E3SMv2 at 1000 m are a factor of 2–3 too low, and the area-weighted mean bias amounts to 4.2 cm s $^{- 1}$ (compare panels (a)–(c) in Fig. ). This bias in ocean current speeds is a common feature in non-eddying ocean circulation models and is possibly related to how high-frequency dynamical processes are parameterized (e.g., internal mixing or tides; ), in addition to limitations related to grid resolution. In spite of this bias, most high-velocity features present in the Argo-derived dataset are reproduced in E3SMv2 (Fig. b and c), and the Pearson correlation coefficient between the two fields amounts to 0.66. The only exception to the fairly good spatial agreement is the Gulf Stream, which is too shallow in E3SMv2 (not shown), resulting in much lower current speeds at 1000 m in E3SMv2 than in the Argo-based estimate in the northwest Atlantic ( $<$ 2 cm s $^{- 1}$ compared to $\sim$ 12 cm s $^{- 1}$ ).
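The comparison described above combines three steps: converting pairs of float positions into speeds, averaging those speeds in regular boxes, and comparing two gridded fields. A minimal sketch of these steps is given below. It assumes great-circle (haversine) distances on a spherical Earth, assigns each speed to a single position supplied by the caller, and uses cos(latitude) as the area weight; none of these choices is stated in the text, and all function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6   # mean Earth radius (assumed spherical Earth)


def great_circle_m(lat1, lon1, lat2, lon2):
    """Haversine distance in meters between points given in degrees."""
    p1, p2 = np.radians(lat1), np.radians(lat2)
    dlam = np.radians(lon2 - lon1)
    a = np.sin((p2 - p1) / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))


def speed_cm_s(lat1, lon1, lat2, lon2, days=10.0):
    """Mean speed (cm/s) implied by two positions separated by `days`."""
    return 100.0 * great_circle_m(lat1, lon1, lat2, lon2) / (days * 86400.0)


def box_mean(lat, lon, values, box_deg=3.0):
    """Average values in regular lat-lon boxes; boxes without data are NaN."""
    nlat, nlon = int(180 / box_deg), int(360 / box_deg)
    j = np.clip(((np.asarray(lat) + 90.0) / box_deg).astype(int), 0, nlat - 1)
    i = ((np.asarray(lon) % 360.0) / box_deg).astype(int) % nlon
    total = np.zeros((nlat, nlon))
    count = np.zeros((nlat, nlon))
    np.add.at(total, (j, i), values)
    np.add.at(count, (j, i), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count


def compare_fields(a, b, box_lat):
    """Pearson correlation and cos(lat)-weighted mean difference (a minus b)."""
    weights = np.cos(np.radians(box_lat))[:, None] * np.ones_like(a)
    ok = np.isfinite(a) & np.isfinite(b)
    r = np.corrcoef(a[ok], b[ok])[0, 1]
    bias = np.sum(weights[ok] * (a[ok] - b[ok])) / np.sum(weights[ok])
    return r, bias


# A float displaced by 1 degree of latitude in 10 d moves at roughly 13 cm/s.
example_speed = speed_cm_s(0.0, 0.0, 1.0, 0.0)
box_centers_lat = -88.5 + 3.0 * np.arange(60)
```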

Figure 3

Horizontal current velocity at 1000 m in centimeters per second (a) in the full Eulerian E3SMv2 output from 2012 to 2017, (b) derived from 10 d position data of synthetic floats in E3SMv2 and averaged for regular 3° $\times$ 3° boxes, and (c) derived from 10 d position data of Argo floats and averaged for regular 3° $\times$ 3° boxes . Note the different scales between panels (a)–(b) and (c).

[Figure omitted. See PDF]

Synthetic floats in our configuration of E3SMv2 are capable of sampling the wide range of global physical and biogeochemical properties that appear in the Eulerian-mean model output, and they sample across a wider range of global temperature and salinity values than Core Argo floats. By comparing the model datasets in temperature–salinity space, we evaluate the ability of the synthetic floats in E3SMv2 to correctly sample their model environment (Fig. ). For the three latitudinal bands, 30–90° N, 30° N–30° S, and 30–90° S, the sampled temperature–salinity–nitrate space is very similar for the 10 d whole-water-column synthetic float output and the monthly mean Eulerian output (compare the first two columns in Fig. ; note that only data for the year 2016 are shown here). We attribute any differences to not having a synthetic float sample in every single grid cell and to the differing temporal resolution of the data. In comparison to all Core Argo data from 2012–2017 (third column in Fig. ), the model output samples a larger temperature–salinity space (see, e.g., cold, fresh waters for 30–90° N in panels a–c). While model biases likely contribute to some extent, we mostly attribute this difference to differences in the float distribution (e.g., in contrast to E3SMv2, Argo has very few floats in the Arctic; see Fig. ) and to differences in the sampled water depth (see, e.g., fewer data points in Core Argo than in E3SMv2 for the latitudinal band 30–90° S at temperatures between 0 and 5 °C and salinities between 34 and 35; these data points lie in the deep ocean $>$ 2000 m as indicated by nitrate concentrations exceeding 35 mmol m $^{- 3}$ ). In summary, the synthetic floats in E3SMv2 reproduce key large-scale patterns of variability of both the Eulerian model output and the Core Argo floats, making these floats a valuable tool for the assessment of spatiotemporal variability in physical and biogeochemical properties from a Lagrangian perspective and for sampling network design.

Figure 4

(a–c) Temperature–salinity diagrams north of 30° N of (a) 10 d data from synthetic floats in E3SM, (b) monthly mean Eulerian E3SMv2 output, and (c) Core Argo float data . Data points in panels (a)–(b) are for the whole water column in 2016, and data points in panel (c) are a subset of all available points for the top 2000 m in 2012–2017. We note that subsurface interannual variability in temperature and salinity is negligible on the large spatial scale assessed here and that $>$ 80 % of all E3SMv2 data are in the top 2000 m of the water column, implying comparability of the first two columns with the third. Data points in panels (a)–(b) are colored as a function of nitrate concentrations (in mmol m $^{- 3}$ ) and shown as averages within 50 equally sized bins in temperature and salinity space. The small insets in panels (a)–(c) show the distribution of data. Panels (d–i): same as (a)–(c) but for (d)–(f) 30° N–30° S and (g)–(i) south of 30° S.

[Figure omitted. See PDF]

3.2 The impact of sampling frequency on float-derived velocities

Because the position of typical Argo floats is only known upon surfacing every $\sim$ 10 d, Argo-float-derived velocity estimates are subject to uncertainty stemming from the assumption of a linear trajectory between any two positions (see example from a synthetic float in Fig. a). With velocity calculated as the distance traveled per 10 d, a shorter trajectory length for 10 d sampling (blue line in Fig. a) than for daily sampling (black line) implies that the velocity derived from 10 d positions is underestimated relative to that derived from daily positions. We use the synthetic E3SMv2 floats to compare the difference in the 10 d trajectory length (equivalent to the 10 d averaged velocity) when (a) knowing the respective float's position once per day and (b) only knowing the respective float's position on day 1 and day 11 (as for Core Argo floats) of each 10 d period (Fig. ). Analysis of all our modeled synthetic floats reveals that the true distance traveled by the floats over a 10 d period is longer than indicated by their 10 d position differences. Our analysis shows that the mismatch in the trajectory length between daily and 10 d float profiling frequencies can be substantial (Fig. b), with 10 % of all 10 d trajectories in the Southern Ocean south of 30° S being at least 50 % longer when the float position is known every day as opposed to only at the start and end of the trajectory (orange box in Fig. b). In other words, for 10 % of all Southern Ocean synthetic float trajectories, velocities derived from the float positions are more than 50 % too low if the float positions are only known at the start and end of any 10 d period. In general, the smaller the error in trajectory length/velocity ( $x$ axis in Fig. b), the more trajectories are affected. In addition, trajectories in more northerly latitudinal bands are more severely affected by this error than those in the Southern Ocean. For example, while 80 % of all trajectories north of 60° N ( $<$ 70 % in the Southern Ocean south of 60° S) display a difference of at least 10 % in 10 d trajectory length/velocity for daily vs. 10 d float profiling frequencies, the number of affected trajectories amounts to nearly 40 % (10 % in the Southern Ocean) for a 50 % mismatch. These regional differences reflect the general gradient from high velocities at 1000 m in the Southern Ocean (Antarctic Circumpolar Current) to lower velocities in the Northern Hemisphere (see Figs. c and a). Since the 10 d trajectory length forms the basis for float-derived velocity estimates (Fig. ; ), our analysis illustrates the bias introduced by the absence of more frequent knowledge of every float's position. Acknowledging that it remains unclear to what extent the absence of eddy-permitting or eddy-resolving grid resolution at extratropical latitudes affects these results (Fig. ), our analysis demonstrates that this bias can be quite substantial in certain instances.
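The trajectory-length comparison can be sketched in a few lines. The example below assumes positions already expressed in local Cartesian kilometers (a simplification; real positions would be in latitude and longitude) and expresses the mismatch relative to the end-to-end distance, following the description of the percentage by which the daily-sampled length exceeds the 10 d length. The function names and the meandering test trajectory are hypothetical.

```python
import numpy as np


def path_length_km(x_km, y_km):
    """Sum of the straight segments between consecutive positions."""
    return float(np.sum(np.hypot(np.diff(x_km), np.diff(y_km))))


def percent_longer(x_km, y_km):
    """Excess (%) of the daily-resolved path over the start-to-end distance."""
    daily = path_length_km(x_km, y_km)
    endpoints = float(np.hypot(x_km[-1] - x_km[0], y_km[-1] - y_km[0]))
    return 100.0 * (daily - endpoints) / endpoints


def fraction_exceeding(percentages, threshold):
    """Fraction of trajectories whose mismatch is at least `threshold` percent."""
    return float(np.mean(np.asarray(percentages) >= threshold))


# Eleven daily positions (day 1 to day 11) of a float that meanders while drifting east.
days = np.arange(11)
x_meander = 5.0 * days
y_meander = 10.0 * np.sin(2.0 * np.pi * days / 10.0)
x_straight = 5.0 * days
y_straight = np.zeros(11)

mismatch = [percent_longer(x_meander, y_meander), percent_longer(x_straight, y_straight)]
share_at_least_10 = fraction_exceeding(mismatch, 10.0)
```

A straight trajectory yields a mismatch of zero, whereas any meandering between the two end positions makes the daily-resolved path longer, so the velocity inferred from the end positions alone is a lower bound.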

Figure 5

(a) Example synthetic 10 d trajectory from the Southern Ocean to compare daily and 10 d sampling. (b) Comparison of the 10 d trajectory length of all synthetic floats for daily and 10 d sampling. Plotted is the fraction of all 10 d trajectories ( $y$ axis) displaying a trajectory length at daily sampling that exceeds the length at 10 d sampling by a certain percentage ( $x$ axis). Data are grouped into 30° latitudinal bands (see colors). Percent longer trajectory as shown on the $x$ axis is equivalent to percent lower velocity. The orange box refers to the example given in the text. (c) Difference in 10 d trajectory length of the synthetic floats for daily and 10 d sampling plotted as a function of the average 10 d velocity at 1000 m in centimeters per second as derived from the float positions. Shown are the 50th (dark blue) and 90th percentiles (light blue).

[Figure omitted. See PDF]

3.3 Case studies

Here, we present four example mini-studies using output from the synthetic floats in E3SMv2 to illustrate some of the capabilities of this modeling tool. In particular, we will use the synthetic floats to quantify variability in ecosystem stressors as derived from synthetic float snapshots at different sampling frequencies (Sect. ), the impact of float sampling density on the float-based detection of changes in deep-ocean water-mass properties (Sect. ), the impact of sea-ice cover on estimates of trajectory length (Sect. ), and the impact of float sampling density on float-derived phytoplankton bloom phenology (Sect. ). The analysis in each of the following subsections is not meant to be comprehensive, and many more applications are imaginable, some of which will be outlined further below. Each of the following subsections is structured like a mini-paper, with a motivation followed by methods specific to the respective case study, before the results are presented and discussed.

3.3.1 Case study I: float-based quantification of seasonal variability in marine ecosystem stressors

Our first case study quantifies the synthetic float-derived amplitude of seasonal variations in physical and biogeochemical marine ecosystem stressors, i.e., temperature, nutrient availability, oxygen levels, and carbonate chemistry, as recorded at different sampling frequencies. This case study is motivated by the need to understand the present-day exposure of marine organisms to certain environmental conditions, as this can inform their potential for acclimation and adaptation to future environmental change . Observations have revealed ongoing change in the seasonal amplitude of upper-ocean carbonate chemistry, e.g., south of Australia (mooring-based; ) and in the global ocean (ship-based; ), making an earlier exceedance of thresholds critical to ecosystems likely and highlighting the need to better understand seasonal and sub-seasonal variability in all marine ecosystem stressors. Based on regional hydrographic, glider, and mooring observations, we know that substantial short-term variability in ecosystem stressors is caused by, e.g., the diurnal cycle , tides , or ocean weather . High temporal resolution data can be provided by satellites for the global surface ocean (although temporal composites are often necessary due to data gaps at a sub-monthly scale; ), by gliders for the upper ocean in small regions , and by profiling floats, which offer advances over these aforementioned technologies in terms of their spatiotemporal coverage of the global ocean, especially at the subsurface. Yet, given the floats' 10 d sampling cycle, it remains unclear to what extent these data capture extreme conditions that are not representative of the seasonal cycle. Further, the contribution of daily variability to float-derived estimates of seasonal variability remains unquantified.

In this case study, we use the ability of the synthetic floats in E3SMv2 to sample the water column each day to assess how sampling frequency affects estimates of the seasonal amplitude of marine ecosystem stressors. In particular, we assess the difference in daily and 10 d sampling for estimates of the seasonal amplitude in temperature, nitrate, oxygen, and pH (Fig. ). Defining the seasonal amplitude as the difference between the maximum and minimum of a given property over any given calendar year during 2012 to 2017 (see Fig. a for an example), we quantify the seasonal amplitude along all 1-year trajectories of synthetic floats in E3SMv2. We particularly focus our analysis on the top 300 m of the water column where seasonal variability is largest (Fig. c) and on the tropical region between 21° S and 21° N, where our model simulation is run at eddy-permitting resolution (see Fig. ). Retaining only float trajectories that stay within the tropical latitudinal bounds for any full calendar year results in 34 098 estimates of seasonal amplitude for this region, and we report the mean $\pm$ 1 standard deviation in Fig. c.
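The amplitude definition above translates directly into code. The sketch below computes the maximum-minus-minimum amplitude of a 1-year daily series, once from all samples and once from every 10th sample, and expresses the difference relative to the 10 d estimate (an assumed normalization, chosen to match the phrasing that the daily amplitude "exceeds" the 10 d amplitude by a given percentage). The synthetic pH series is purely illustrative and is not model output; the function names are hypothetical.

```python
import numpy as np


def seasonal_amplitude(series, step=1):
    """Maximum minus minimum of a 1-year daily series, keeping every `step`-th sample."""
    sampled = np.asarray(series, dtype=float)[::step]
    return float(sampled.max() - sampled.min())


def amplitude_difference_percent(series, step=10):
    """Excess (%) of the daily-sampled amplitude over the subsampled amplitude."""
    daily = seasonal_amplitude(series, 1)
    coarse = seasonal_amplitude(series, step)
    return 100.0 * (daily - coarse) / coarse


# Illustrative surface pH: a seasonal cycle plus a weaker 7 d oscillation.
day = np.arange(365)
ph = 8.05 + 0.03 * np.sin(2 * np.pi * day / 365) + 0.01 * np.sin(2 * np.pi * day / 7)

amp_daily = seasonal_amplitude(ph, step=1)
amp_10d = seasonal_amplitude(ph, step=10)
diff_percent = amplitude_difference_percent(ph, step=10)
```

Since the 10 d samples are a subset of the daily samples, the 10 d amplitude can never exceed the daily amplitude; the size of the gap depends on how much short-term variability is superimposed on the seasonal cycle.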

Figure 6

Case study “Quantifying physical-biogeochemical variability”: (a) evolution of surface pH along an example 1-year float trajectory for daily (blue) and 10 d (black) sampling. The seasonal amplitude, defined as the maximum minus the minimum pH over the whole year, is given for both sampling frequencies on the right side. (b) Map showing the difference in seasonal pH amplitude in percent between daily and 10 d sampling for each float in the year 2012. (c) Synthetic float-derived vertical profiles for the top 300 m of (left) the seasonal amplitude with 10 d sampling and (right) the difference in the seasonal amplitude between daily and 10 d sampling in percent for temperature (black), nitrate (blue), oxygen (orange), and pH (red). All floats that stayed in the tropical region between 21° S and 21° N for any full calendar year during 2012 to 2017 have been included in the analysis, resulting in 34 098 estimates of seasonal amplitude. The solid lines correspond to the average over all estimates, and the shading denotes 1 standard deviation.

[Figure omitted. See PDF]

The sampling frequency of the synthetic floats substantially affects estimates of seasonal amplitude, with the effect varying both horizontally (Fig. b) and vertically (Fig. c). Acknowledging that differences in seasonal surface pH amplitude of more than 15 % are simulated for all ocean regions, the average difference for surface pH is largest in the tropics (Fig. b), where the eddy-permitting model grid resolution facilitates a stronger spatiotemporal variability in the simulated tracer fields. In the tropics, the average seasonal amplitude between 50 and 300 m derived from daily sampling exceeds the amplitude derived from 10 d sampling by 12.4 % (temperature), 11 % (nitrate), 10 % (oxygen), and 9.6 % (pH; Fig. c). Above 50 m, the difference is similar to the one for below 50 m for nitrate and pH, while being smaller and larger for temperature (6.5 %) and oxygen (14.7 %), respectively. The larger discrepancy for oxygen is likely associated with the timescale for air–sea exchange of oxygen (a few weeks; ). Our findings imply that atmosphere–ocean oxygen disequilibrium manifests more strongly in seasonality estimates derived from daily float snapshots. For all marine ecosystem stressors, the variability (expressed as 1 standard deviation around the mean) in the impact of sampling frequency is substantial, with differences in estimates of seasonal amplitude often exceeding 20 % (all variables), 25 % (subsurface temperature), or even 30 % (near-surface oxygen).

While it is unsurprising that higher-frequency sampling captures more temporal variability, the 10.6 $\pm$ 10.9 % larger seasonal amplitude in daily float sampling in the tropics in our model experiment (mean $\pm$ 1 standard deviation averaged over all ecosystem stressors in the top 300 m of the water column) highlights the uncertainty associated with the float-based detection of changes in prevalent environmental conditions in a given region based on 10 d sampling. Differences in seasonality of physical and biogeochemical properties across large-scale ocean regions are well established, but the simulated spatial variability within any large-scale region in E3SMv2 (Fig. b) underscores the importance of both sampling distribution and frequency when aiming to adequately capture large-scale dynamics with observations. This is especially apparent for the tropics at eddy-permitting grid resolution (Fig. ). Yet, we note that even at non-eddy-permitting resolution, the synthetic floats suggest differences in seasonal amplitude for the two sampling frequencies of comparable magnitude to those in the tropics for some float trajectories in, e.g., the Southern Ocean (Fig. b). Altogether, this suggests that while 10 d sampling with floats provides unprecedented global observational coverage to quantify spatiotemporal variability in marine ecosystem stressors, targeted regional assessments of sub-10 d variability, e.g., with gliders or moorings, are necessary to adequately quantify exposure of marine organisms to varying environmental conditions, which is a key factor in determining an organism's resilience to environmental change.

3.3.2 Case study II: float-based detection of changes in deep-ocean water-mass properties

To obtain basin-scale estimates of variability and trends in the properties of climatically important water masses such as Antarctic Bottom Water or North Atlantic Deep Water, a denser observing network in both space and time is required than hydrographic observations can provide . To that end, a key goal of Deep Argo is the detection of changes in deep-ocean heat content, which will facilitate the tracking of the deep ocean's contribution to steric sea-level rise . Further, the addition of oxygen sensors on Deep Argo will enable the detection of changes in deep-ocean oxygenation, particularly in regions downstream of deep- and bottom-water formation in the North Atlantic and Southern Ocean . While any difference in the spatiotemporal variability in deep-ocean temperature and deep-ocean oxygen will directly impact the number of floats required to capture large-scale changes in each variable over time, this difference remains unquantified to date.

In this case study, we use different float densities to quantify the error associated with capturing larger-scale temporal variability in deep-ocean temperature, salinity, and oxygen in the North Atlantic (between 30–60° N and 10–60° W) and Southern Ocean (south of 60° S). To facilitate the comparison of errors across variables, we calculate the normalized root-mean-square error (NRMSE; normalized by 1 standard deviation of all monthly Eulerian values averaged over the respective subarea; see Fig. for spatial distribution of variables) between Eulerian and synthetic float model output. By normalizing by 1 standard deviation, the underlying assumption is that this metric captures sufficient variability in the true tracer distribution to facilitate drawing conclusions on the required float density to reproduce the temporal evolution of different variables. In particular, we calculate the NRMSE between the 6-year-long monthly time series of the Eulerian model output and the float-derived monthly time series constructed from 10 d sampling for a given float density. For each float density, we randomly subsample all available floats in each subregion 10 000 times to obtain NRMSE percentiles of the time series mismatch (see violin plots in Fig. ).
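The subsampling procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the normalization factor is passed in as a plain number, the float-based monthly series is taken to be a simple mean over the selected floats, and the test data are synthetic.

```python
import math
import random
import statistics


def nrmse(reference, estimate, norm):
    """Root-mean-square error between two time series, divided by `norm`."""
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared)) / norm


def float_mean_series(float_series, chosen):
    """Monthly mean over the chosen floats (each float is a list of monthly values)."""
    n_months = len(float_series[0])
    return [
        sum(float_series[i][m] for i in chosen) / len(chosen)
        for m in range(n_months)
    ]


def nrmse_percentiles(reference, float_series, n_floats, norm, n_draws=1000, seed=0):
    """10th, 50th, and 90th percentiles of the NRMSE over random float subsets."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_draws):
        chosen = rng.sample(range(len(float_series)), n_floats)
        errors.append(nrmse(reference, float_mean_series(float_series, chosen), norm))
    deciles = statistics.quantiles(errors, n=10)  # nine cut points: 10 %, ..., 90 %
    return deciles[0], deciles[4], deciles[8]


# Hypothetical data: a 72-month reference series and 50 noisy float records.
rng = random.Random(1)
reference = [2.0 + 0.1 * math.sin(2.0 * math.pi * m / 12.0) for m in range(72)]
floats = [[v + rng.gauss(0.0, 0.2) for v in reference] for _ in range(50)]
norm = 0.3  # hypothetical standard deviation used for normalization

for n in (5, 20, 50):
    print(n, nrmse_percentiles(reference, floats, n, norm, n_draws=200))
```

In this toy setup the spread between the percentiles shrinks as more floats are drawn, and it collapses to a single value once every available float is used, which mirrors the role of the horizontal black line in the violin plots.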

Figure 7

Case study “Deep ocean”: (a) temperature in °C below 2000 m and averaged over 2012–2017. (b) Violin plots of the normalized root-mean-square error (NRMSE) of the monthly temperature time series for the area south of 60° S (orange box in panel (a)) between the full Eulerian output and a float-based estimate using 10 d sampling and subsets of all synthetic floats corresponding to different global target float densities ( $x$ axis). The subsampling is repeated 10 000 times. The horizontal red lines denote the 90th, 50th, and 10th percentiles (from top to bottom). The horizontal black line corresponds to the NRMSE using all synthetic floats in the area. The RMSE is normalized by the area-weighted standard deviation of monthly mean temperature values of all grid cells in the area. The unnormalized RMSE is denoted in grey. Panel (c): same as (b) but for the North Atlantic between 30–60° N and 10–60° W (green box in panel (a)). Panels (d)–(f) and (g)–(i): same as (a)–(c) but for (d)–(f) salinity and (g)–(i) oxygen (in mmol m $^{- 3}$ ), respectively.

[Figure omitted. See PDF]

Our analysis reveals that (1) the NRMSE decreases when more floats are used in the calculation of the error across all variables and in both regions, (2) the NRMSE is overall higher in the North Atlantic than in the Southern Ocean, and (3) for both regions, the NRMSE is highest for temperature and lowest for oxygen. The first point aligns with the findings of other studies. The second point can possibly be attributed to more variability in bottom topography in the North Atlantic (note the presence of the mid-Atlantic ridge in Fig. a), making a float-based estimate of any water-mass property in this region more dependent on the float distribution than in the Southern Ocean, where the average spatial variability is much smaller (Fig. a, d, g). This finding is unaffected by the normalization of the error metric; the root-mean-square error (RMSE) is approximately 2 (temperature), 4 (salinity) and 2 (oxygen) times larger in the North Atlantic than in the Southern Ocean for all float densities over the 6-year time series (see printed gray numbers in violin plots in Fig. ).

That the NRMSE is highest for temperature and lowest for oxygen in both regions has implications for the design of a deep-ocean float array. A generally lower NRMSE for biogeochemical (oxygen) than for physical (temperature and salinity) water-mass properties implies that to capture property changes at comparable accuracy (at least in terms of normalized error metrics), it would be sufficient to equip a subset of all floats with oxygen sensors, thereby reducing the overall cost of a deep-ocean float array. However, we note that based on our analysis, the RMSE for a deep-ocean array consisting of 400 floats, i.e., one-third of the global target density , amounts to 1.4 mmol m $^{- 3}$ (Southern Ocean; printed gray numbers in Fig. h) and 3.62 mmol m $^{- 3}$ (North Atlantic; Fig. i), which is approximately 6 times larger than the range of regionally and monthly averaged Eulerian model output over the 6-year time series (0.24 and 0.60 mmol m $^{- 3}$ ; not shown). Our analysis thus suggests that such a reduced deep-ocean oxygen array would complicate the float-based detection of interannual variability and possibly trends in deep-ocean ventilation on large spatial scales. Importantly, the RMSE is reduced by 33 % and 43 % in the Southern Ocean and the North Atlantic, respectively, when equipping a full global network of 1200 floats with oxygen sensors instead of only 400 floats (Fig. h–i), enhancing our ability to capture variability in oxygen concentrations over large spatial scales with floats.

3.3.3 Case study III: float trajectories under Southern Ocean sea-ice cover

Argo floats rely on localization via GPS upon surfacing in ice-free waters (Fig. b). Since the risk of damaging instrumentation is high upon contact with sea ice, conventional floats follow an ice-avoidance protocol, which makes them abort their ascent when subsurface temperatures indicate high likelihood of sea ice present at the surface . In the absence of position data, the trajectory of such a float is then linearly interpolated between the last position before and the first position after the under-ice period. Depending on how long a float cannot be localized, this procedure potentially causes large uncertainties in the estimated trajectory .

In this case study, we use our ability to localize synthetic floats at all times, including under sea-ice cover (Fig. a), to quantify the impact of sea-ice presence on float trajectories for different sectors of the Southern Ocean. Using all Southern Ocean 1-year float trajectories from our 6-year proof-of-concept simulation, we compare the 1-year trajectory length of floats (1) when knowing the position of the float every day of the year and (2) when linearly interpolating a float's position when it is under substantial sea-ice cover. For this analysis, we use the modeled sea-ice concentration as sampled by the synthetic floats to determine whether substantial sea-ice cover is present for a given location and time, and we quantify the difference in trajectory length for different sea-ice concentration thresholds defining “substantial sea-ice cover”, ranging from 5 % to 95 %. Lastly, for our analysis, we divide the Southern Ocean south of 60° S into four sectors, i.e., the Weddell Sea (between 300° E and the prime meridian), East Antarctica (0–160° E), the Ross Sea (160–210° E), and the Amundsen and Bellingshausen seas (210–300° E).
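The trajectory-length comparison can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the paper: distances are computed with the haversine formula on a spherical Earth, the interpolated path is represented by connecting the known positions directly (which has the same length as a straight-line interpolation between them), the mismatch is expressed relative to the interpolated length, and the example track and sea-ice values are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (
        math.sin((lat2 - lat1) / 2.0) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2
    )
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def path_length_km(points):
    """Sum of the distances between consecutive positions."""
    return sum(haversine_km(a, b) for a, b in zip(points, points[1:]))


def known_positions(points, ice_conc, threshold):
    """Positions at which the sea-ice concentration does not exceed the threshold."""
    return [p for p, c in zip(points, ice_conc) if c <= threshold]


def length_mismatch_percent(points, ice_conc, threshold):
    """Percent by which the true path is longer than the interpolated one."""
    true_len = path_length_km(points)
    interp_len = path_length_km(known_positions(points, ice_conc, threshold))
    if interp_len == 0.0:
        raise ValueError("need at least two distinct known positions")
    return 100.0 * (true_len - interp_len) / interp_len


# Hypothetical 20-day zig-zag track near 65° S with 10 days under heavy ice.
track = [(-65.0 + 0.2 * (day % 2), 10.0 + 0.1 * day) for day in range(20)]
ice = [0.0] * 5 + [80.0] * 10 + [0.0] * 5  # sea-ice concentration in percent
print(round(length_mismatch_percent(track, ice, threshold=50.0), 1))
```

Lowering the threshold marks more days as unknown, so more of the meandering path is replaced by straight segments and the mismatch can only grow or stay the same. The sketch does not handle trajectories that begin or end under ice, where no anchoring position exists on one side of the gap.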

Linear interpolation of float position data in E3SMv2 under-ice floats introduces biases in under-ice trajectory length and estimated under-ice statistical tracer properties (Fig. ). In agreement with observations , annual mean sea-ice concentration in E3SMv2 is highest in the southwestern Weddell Sea (Fig. a). In winter (June–August), large areas of the high-latitude Southern Ocean are fully ice covered, implying a lack of exact position data for Core Argo floats. The trajectory of an example synthetic float in E3SMv2 reveals that the mismatch in the 1-year trajectory length can be substantial (58 % in the example in Fig. b, assuming unknown float positions when sea-ice cover exceeds 50 %). As the linear interpolation of position data also re-locates the associated data of physical and biogeochemical water-mass properties (as illustrated with nitrate in Fig. b), this procedure adds uncertainty to the float-derived tracer distribution field in the sea-ice zone, and challenges our ability to accurately determine decorrelation length scales of these tracers (as done in, e.g., ).

Figure 8

Case study “Sea-ice cover”: (a) Annual mean (left) and winter mean (June, July, August; right) sea-ice concentration in percent in each grid cell of E3SMv2 averaged over 2012–2017. (b) Example illustrating the effect of unknown float positions under sea ice on the trajectory length. The daily location of the float is plotted in latitude–longitude space and colored as a function of surface nitrate concentrations (in mmol m $^{- 3}$ ). For this example, the location is assumed unknown whenever the sea-ice concentration exceeds 50 % (crosses) and known elsewhere (squares). When the float location is unknown, the trajectory is obtained by linearly interpolating between the last and first known positions (black line). (c) Difference in trajectory length in percent for all 1-year trajectories between 2012 and 2017 with known or unknown daily positions as a function of the sea-ice concentration threshold in percent. The position of a float is considered unknown whenever a given sea-ice threshold is exceeded, and data gaps are linearly interpolated. Results are shown separately for the Weddell Sea (black), Ross Sea (purple), Amundsen and Bellingshausen seas (red), and the east Antarctic (blue). See map inset for the initial float positions of all considered 1-year trajectories.

[Figure omitted. See PDF]

The difference in under-ice trajectory length depends on the threshold used to identify critical sea-ice conditions and on the location of the float (Fig. c). The difference in trajectory length is larger the more often a float encounters critical sea-ice conditions. Thus, the higher average sea-ice concentration in the Weddell Sea than in other sectors explains the largest average differences in trajectory length in this region ( $\sim$ 40 % for sea-ice concentration thresholds $\leq$ 10 %; see Fig. c). Even for a sea-ice concentration threshold of 50 %, trajectories are between 10 % (East Antarctic) and 25 % (Weddell Sea) longer when accounting for all floats' true positions than when linearly interpolating positions under sea-ice cover. Given that Core Argo floats aim to avoid any direct contact with sea ice to minimize the risk of damage, their ice-avoidance procedures are very risk-averse , which corresponds to a low sea-ice concentration threshold for surfacing shown in Fig. . Acknowledging that the horizontal grid resolution used here is too coarse in the Southern Ocean to adequately resolve eddies (Fig. ), our analysis of the synthetic floats in E3SMv2 showcases the possibly large uncertainty in float trajectories when float positions for such conditions are unknown. Alternative approaches to locating floats under sea ice, such as via acoustic tracking or via contours of potential vorticity, sea level, or density , offer the potential to reduce errors in the hydrographic measurements and the geopositioning of the associated tracer data.

3.3.4 Case study IV: deriving phytoplankton phenology from biogeochemical floats

Biogeochemical floats can provide new information about surface and subsurface phytoplankton abundance, helping to elucidate their role in global carbon and nutrient cycling. In the Southern Ocean, biogeochemical floats have already provided a more detailed description of phytoplankton phenology (including under sea-ice cover; see, e.g., ), phytoplankton biomass loss , and net community production (e.g., ). Similar advances are expected in other regions as more biogeochemical floats are deployed globally (see https://www.go-bgc.org/array-status and https://maps.biogeochemical-argo.com/bgcargo/, last access: 28 June 2024). As a result, we urgently need to assess our ability to capture large-scale characteristics of phytoplankton dynamics with float networks differing in density.

In this case study, we assess the ability of float networks differing in float density to capture subsurface phytoplankton bloom characteristics, i.e., the timing (day of the year) and magnitude of phytoplankton biomass peaks, in the subtropical Pacific (between 15–30° N and 120–170° W), where subsurface maxima in phytoplankton biomass are commonly observed . For each calendar year during 2012 to 2017, we compare the timing and magnitude of maximum total phytoplankton carbon biomass at 30 and 60 m depth between 10 d Eulerian output and 10 d data from synthetic floats. Synthetic float data at 10 d intervals were obtained from the daily data by randomly assigning each float a sampling start day between 1 and 10 so that floats sample the modeled ocean on different days. To obtain a statistically robust estimate of the mismatch, we subsample the 197 available synthetic floats in the subtropical Pacific 5000 times to float densities ranging from 2 to 28 in this subregion (corresponding to between 100 and 1200 floats globally). This results in 30 000 estimates of bloom characteristics over the 6 simulation years for each float density, and we report the average mismatch $\pm$ 1 standard deviation in Fig. .
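A compact Python sketch of this subsampling experiment is given below. It is meant only to make the procedure concrete and is not the code used for the paper: the function names are hypothetical, the Eulerian reference is approximated by 10 d means of a daily area average, each float contributes one snapshot per 10 d window at its randomly assigned start day (0 to 9 here), and the bloom data are synthetic.

```python
import math
import random
import statistics


def peak(series):
    """Return (index, value) of the first maximum of a series."""
    idx = max(range(len(series)), key=series.__getitem__)
    return idx, series[idx]


def window_means(daily, step=10):
    """Average a daily series over consecutive windows of `step` days."""
    n_windows = len(daily) // step
    return [sum(daily[w * step:(w + 1) * step]) / step for w in range(n_windows)]


def float_network_series(daily_by_float, chosen, start_days, step=10):
    """Per window, average the single snapshot that each chosen float takes."""
    n_windows = len(daily_by_float[0]) // step
    return [
        sum(daily_by_float[i][w * step + start_days[i]] for i in chosen) / len(chosen)
        for w in range(n_windows)
    ]


def peak_mismatch(eulerian_daily, daily_by_float, n_floats, n_draws=500, seed=0, step=10):
    """Peak-magnitude ratios and timing lags (days) over random float subsets."""
    rng = random.Random(seed)
    # Each float gets one fixed sampling start day within the 10 d cycle.
    start_days = [rng.randrange(step) for _ in daily_by_float]
    ref_idx, ref_val = peak(window_means(eulerian_daily, step))
    ratios, lags = [], []
    for _ in range(n_draws):
        chosen = rng.sample(range(len(daily_by_float)), n_floats)
        idx, val = peak(float_network_series(daily_by_float, chosen, start_days, step))
        ratios.append(val / ref_val)
        lags.append(step * (idx - ref_idx))
    return ratios, lags


# Hypothetical data: 40 floats, each seeing a bloom with slightly shifted timing.
rng = random.Random(2)
floats = []
for _ in range(40):
    center = rng.gauss(80.0, 15.0)
    floats.append([
        1.0 + 0.5 * math.exp(-((d - center) / 20.0) ** 2) + rng.gauss(0.0, 0.05)
        for d in range(360)
    ])
eulerian = [sum(f[d] for f in floats) / len(floats) for d in range(360)]

for n in (2, 10, 28):
    ratios, lags = peak_mismatch(eulerian, floats, n, n_draws=200)
    print(n, round(statistics.mean(ratios), 3), round(statistics.stdev(lags), 1))
```

In this toy example the lags are multiples of 10 d because both series are resolved in 10 d windows; the sketch covers a single year, whereas the case study repeats the comparison for each of the 6 simulation years.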

Figure 9

Case study “Phytoplankton phenology”: (a) time series from 2012 to 2017 of average 10 d total phytoplankton carbon biomass (in mmol m $^{- 3}$ ) in the subtropical northeast Pacific between 15–30° N and 120–170° W at 30 m in the full Eulerian output (black) and based on all available floats (light grey). The corresponding time series at 60 m are displayed in dark blue and light blue, respectively. (b–e) Example fields of daily averaged total phytoplankton carbon biomass for (b–c) 5 March 2014 and (d–e) 15 March 2014 in the subtropical northeast Pacific at (b, d) 30 m and (c, e) 60 m. (f) Ratio of the annual peak phytoplankton carbon biomass (mean $\pm$ 1 standard deviation) between the 10 d float output subsampled to different global float densities ( $x$ axis) and the full Eulerian 10 d output at 30 m (black) and at 60 m (blue) in the subtropical northeast Pacific. The subsampling is repeated 5000 times, and the ratio is thus quantified 30 000 times (6 years of data) for each global float density. Horizontal lines denote the mismatch between the full float output and the Eulerian output (197 floats). Panel (g): same as (f) but for the mismatch in days in the timing of peak phytoplankton carbon biomass.

[Figure omitted. See PDF]

In the subtropical Pacific, the peak bloom magnitude and timing in E3SMv2 differ between the surface and subsurface ocean (Fig. ). Peak biomass levels display more year-to-year variability and are up to 2 times higher at 30 m (black and grey lines in Fig. a) than at 60 m (dark- and light-blue lines). Further, based on 10 d averages, maximum biomass is simulated to occur earlier in the year at 30 m than at 60 m, possibly illustrating differences in environmental factors such as light and nutrient availability in driving phytoplankton dynamics throughout the year . Overall, the spatiotemporal variability is high at both depth levels (Fig. a–e), suggesting that a certain number of floats is needed to adequately capture large-scale bloom dynamics in this region.

The estimated peak bloom magnitude and timing in the subtropical northeast Pacific depend on the total number of floats sampling the region (Fig. f and g). We conduct random subsampling of the full set of 197 available 6-year synthetic float time series in the area 5000 times for different float densities. For the vast majority of repetitions in the subsampling exercise, the magnitude of the bloom peak derived from the synthetic floats is larger than the true bloom peak derived from the Eulerian output (ratio $>$ 1 in Fig. f). We acknowledge that the sampling time of all floats (midnight Greenwich Mean Time) likely causes a slight systematic discrepancy between the full-day average of the Eulerian model output and the synthetic float-based estimates (see Fig. a). Increasing the float density improves the ability of the float network to capture the true magnitude of the bloom peak at both depths (Fig. f). While the mismatch at float densities of $\leq$ 200 global floats is larger for the bloom at 30 m, the opposite holds for float densities larger than this. At 60 m, even the full synthetic float network in this region, whose density corresponds to 8400 floats globally, overestimates the magnitude of the bloom peak by $\sim$ 5 % (horizontal blue line), while it is captured well at 30 m (horizontal black line). On average, the timing of the bloom is reasonably well captured for all float densities ( $\pm$ 10 d at most; Fig. g). However, the variability around the mean is large for both depths and all float densities (1 standard deviation ranges from 21 to 75 d at 30 m and from 40 to 61 d at 60 m across float densities; see whiskers in Fig. g). Given that the bloom duration in the ocean typically amounts to a few weeks (see Fig. and, e.g., ), a misrepresentation of the bloom timing of $>$ 3 weeks combined with an error in the magnitude of $>$ 10 % complicates the detection of any long-term trend.
In this context, we further note that the analysis of phytoplankton bloom phenology based on float-derived 10 d snapshots of phytoplankton biomass, as done here to mimic the typical real-world sampling frequency, possibly masks substantial biomass variability at sub-10 d timescales (compare panels (b) and (d) as well as panels (c) and (e) in Fig. ). While we focus our analysis on the uncertainty in deriving phytoplankton bloom characteristics stemming from the float density in a given focus area, the synthetic float observations should also be used in future work to assess the uncertainty related to the sampling frequency. Our subtropical Pacific-focused case study highlights the spatiotemporal variability in phytoplankton dynamics in this region and underlines the difficulty of adequately capturing bloom dynamics on large spatial scales from sparse measurements. Future studies should investigate the magnitude of these uncertainties in other ocean regions.

4 Limitations and future work

While biogeochemical float capabilities in E3SMv2 are a remarkable new tool for the ocean modeling and observational communities, the synthetic floats sample the model fields somewhat differently than Argo floats sample the real ocean. Argo floats can be laterally displaced by ocean circulation while profiling the upper ocean and while transmitting data at the ocean surface. This study does not account for the possibility of lateral displacement in the synthetic floats during this profiling, adding uncertainty to the comparison of velocity estimates derived from synthetic and Argo floats (Fig. ). Previous work has quantified the effect of velocity shear to amount to up to 1.2 cm s $^{- 1}$ in the tropics, leading to an uncertainty of $\pm$ 8° in the current direction . Acknowledging that the absence of this effect in E3SMv2 likely has to be considered for regional applications of the synthetic floats, we assume this shortcoming to be of lesser importance for the basin-scale applications presented here. In contrast to Argo floats, all synthetic floats in E3SMv2 sample the water column at the same time each day, i.e., midnight Greenwich Mean Time. This means that the sampling of all variables undergoing strong diurnal fluctuations, e.g., light-sensitive processes such as biological productivity and hence biomass, is skewed towards a particular phase of the respective diurnal cycle, increasing the discrepancy between float-derived estimates and daily averaged model output. As the seasonal cycle directly impacts the variability in the diurnal cycle over the course of the year, this effect is expected to be less pronounced in tropical regions (see Sect. ) than in polar regions.

For this study, synthetic floats in E3SMv2 are seeded uniformly as a function of grid resolution (Fig. ). While uniform ocean coverage by Argo floats is the ultimate goal , achieving this is complicated by the dependence on ships for float deployment. In addition, in contrast to the unlimited lifetime of synthetic floats, the typical lifetime of today's Argo floats is $\sim$ 5–7 years (typically lower for BGC Argo than Core Argo; ), causing spatial gaps in our observing system if a timely re-deployment of a float is not possible for a given region. While we did not account for random failures of some sensors or entire floats in the 6-year proof-of-concept simulation analyzed for this paper, imperfect synthetic float datasets could easily be constructed offline to assess the impact of a temporary or permanent absence of observations in a specific region or for a specific time period. Future work should assess the impact of data scarcity and of using different mapping methods on reconstructed fields of biogeochemical tracers and, e.g., air–sea CO $_{2}$ fluxes . Lastly, we note that on smaller spatial scales, model biases and structural limitations of the E3SMv2 configuration used here due to, e.g., model resolution and parametrizations of internal wave dynamics, could reduce the utility of the synthetic biogeochemical float capabilities as an ideal test bed, as the simulated variability on small spatial scales might differ from the variability experienced by real-world Argo floats.

5 Conclusions

We implement synthetic biogeochemical float capabilities into the Energy Exascale Earth System Model version 2 (E3SMv2-LIGHT-bgcArgo-1.0). The synthetic floats are advected with ocean circulation at 1000 m online during the model integration, sampling all desired prognostic and diagnostic model variables throughout the whole water column at the frequency prescribed by the end user. Using E3SMv2 with synthetic floats as a perfect test bed, in which the true distribution of modeled physical, biogeochemical, and biological variables is known, we demonstrated the utility of this new tool in different use cases. In particular, acknowledging the remaining uncertainty stemming from the used model configuration not explicitly resolving eddy dynamics, we demonstrated that by sampling every 10 d, float-derived velocities are biased low ( $>$ 50 % for some 10 d trajectories). Similarly, 1-year Southern Ocean float trajectories derived from linearly interpolated float positions under sea-ice cover differ substantially from the true simulated float trajectories, especially in regions of high sea-ice cover such as the Weddell Sea ( $> 40$ % mismatch for conservative sea-ice thresholds). We further showed that, on average, synthetic daily float-based snapshots of marine ecosystem stressors in the tropics result in 10.6 $\pm$ 10.9 % larger estimates of seasonal amplitude than 10 d snapshots (mean $\pm$ 1 standard deviation for all ecosystem stressors between the surface and 300 m). Lastly, our results highlight the importance of the float network size for adequately capturing spatiotemporal biogeochemical dynamics, e.g., for detecting trends in deep-ocean heat content, deep-ocean ventilation, or upper-ocean phytoplankton bloom dynamics.

Even though differences exist in how synthetic and Argo floats sample the simulated and real ocean, respectively, the synthetic floats in E3SMv2 can be used in the future to improve our understanding of how Argo floats see the ocean, thereby contributing to the interpretation of existing observational records. For example, the synthetic float capabilities could be used to (i) assess uncertainties in deriving biogeochemical fluxes such as air–sea CO $_{2}$ exchange or net community production from float-based observations; (ii) assess uncertainties in mapping float-based observations or derived quantities to global, gridded datasets arising from, e.g., float distributions, sampling frequency, or sensor inaccuracies including drift; or (iii) inform future float deployment strategies within the One Argo program. Computing synthetic floats online eliminates both the need to store high-frequency model output and the uncertainty associated with time-averaging model output to extract synthetic observations offline. Given that Argo floats do not sample time-averaged water-mass properties but provide a snapshot view of the ocean, the online computation produces a more realistic dataset of synthetic observations. Expanding the online synthetic observing capabilities in E3SMv2 or other models by further sampling methodologies, e.g., ship-based hydrography, deep-sea moorings, gliders, or surface drifters, should be a key focus of future work, with the aim to improve our global ocean observing system.

Code and data availability

The model source code of E3SMv2 including synthetic biogeochemical float capabilities is available on Zenodo: 10.5281/zenodo.10094349. The synthetic biogeochemical float dataset and the corresponding Eulerian model fields from E3SMv2 are deposited in the PetaLibrary of the University of Colorado, Boulder, and can be accessed via Globus (https://www.globus.org/). To find the data, enter “E3SM-BGCArgo” as the name of the collection in the file manager.

Author contributions

CN, NSL, MM, and ARG conceived the study. CN performed the analysis and wrote the paper. NSL and ARG acquired the funding. MM implemented the synthetic floats into E3SMv2 and performed the test simulation analyzed here. YT performed the Eulerian model simulation serving as the spinup for the test simulation with synthetic floats. All authors gave input on the case studies and commented on the paper.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

This research used resources of the National Energy Research Scientific Computing Center (NERSC), a US Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under contract no. DE-AC02-05CH11231 using NERSC award BER-ERCAPm4003, and used a high-performance computing cluster provided by the BER Earth System Modeling program and operated by the Laboratory Computing Resource Center at Argonne National Laboratory. Data storage is supported by the University of Colorado, Boulder, PetaLibrary. Argo data were collected and made freely available by the international Argo project and the national programs that contribute to it. SOCCOM data were collected and made freely available by the Southern Ocean Carbon and Climate Observations and Modeling (SOCCOM) project funded by the National Science Foundation, Division of Polar Programs (NSF PLR-1425989, with extension NSF OPP-1936222); by the Global Ocean Biogeochemistry Array (GO-BGC) project funded by the National Science Foundation, Division of Ocean Sciences (NSF OCE-1946578), supplemented by NASA; and by the International Argo Program and the NOAA programs that contribute to it. The Argo Program is part of the Global Ocean Observing System (https://www.ocean-ops.org/board?t=argoTS5, last access: 26 August 2024).

Financial support

This research has been supported by the US Department of Energy (grant no. DE-SC0022243).

Review statement

This paper was edited by Andrew Yool and reviewed by Paul Chamberlain and two anonymous referees.

© 2024. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Since their advent over 2 decades ago, autonomous Argo floats have revolutionized the field of oceanography, and, more recently, the addition of biogeochemical and biological sensors to these floats has greatly improved our understanding of carbon, nutrient, and oxygen cycling in the ocean. While Argo floats offer unprecedented horizontal, vertical, and temporal coverage of the global ocean, uncertainties remain about whether Argo sampling frequency and density capture the true spatiotemporal variability in physical, biogeochemical, and biological properties. As the true distributions of, e.g., temperature or oxygen are unknown, these uncertainties remain difficult to address with Argo floats alone. Numerical models with synthetic observing systems offer one potential avenue to address these uncertainties. Here, we implement synthetic biogeochemical Argo floats into the Energy Exascale Earth System Model version 2 (E3SMv2), which build on the Lagrangian In Situ Global High-Performance Particle Tracking (LIGHT) module in E3SMv2 (E3SMv2-LIGHT-bgcArgo-1.0). Since the synthetic floats sample the model fields at model run time, the end user defines the sampling protocol ahead of any model simulation, including the number and distribution of synthetic floats to be deployed, their sampling frequency, and the prognostic or diagnostic model fields to be sampled. Using a 6-year proof-of-concept simulation, we illustrate the utility of the synthetic floats in different case studies. In particular, we quantify the impact of (i) sampling density on the float-derived detection of deep-ocean change in temperature or oxygen and on float-derived estimates of phytoplankton phenology, (ii) sampling frequency and sea-ice cover on float trajectory lengths and hence float-derived estimates of current velocities, and (iii) short-term variability in ecosystem stressors on estimates of their seasonal variability.

Details

Title

LIGHT-bgcArgo-1.0: using synthetic float capabilities in E3SMv2 to assess spatiotemporal variability in ocean physics and biogeochemistry

Author

Nissen, Cara¹; Lovenduski, Nicole S¹; Maltrud, Mathew²; Gray, Alison R³; Takano, Yohei⁴; Falcinelli, Kristen³; Sauvé, Jade³; Smith, Katherine²

¹ Department of Atmospheric and Oceanic Sciences and Institute of Arctic and Alpine Research, University of Colorado, Boulder, Boulder, CO, USA
² Fluid Dynamics and Solid Mechanics (T-3), Los Alamos National Laboratory, Los Alamos, NM, USA
³ School of Oceanography, University of Washington, Seattle, WA, USA
⁴ Fluid Dynamics and Solid Mechanics (T-3), Los Alamos National Laboratory, Los Alamos, NM, USA; Polar Oceans Team, British Antarctic Survey, Cambridge, United Kingdom

Pages

6415-6435

Publication year

2024

Publication date

2024

Publisher

Copernicus GmbH

ISSN

1991962X

e-ISSN

19919603

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5194/gmd-17-6415-2024

ProQuest document ID

3098549532
