1 Introduction
Oxygenic photosynthesis by marine phytoplankton is a critical planetary-scale process that supplies solar energy to the biosphere by fixing inorganic carbon; it is responsible for roughly half of global annual net primary productivity. Ocean ecosystems play a key role in Earth's carbon cycle and climate by affecting atmospheric CO2 via the biological carbon pump, which sequesters some of the fixed carbon in the deeper ocean for longer timescales. The biological pump is influenced by the structure and function of oceanic ecosystems; therefore, a mechanistic, predictive understanding of ocean ecosystems is a high priority for Earth system and climate research. Satellite remote sensing of ocean color is a key tool for the global characterization of ocean ecology. This has led to large efforts to elucidate biological pump mechanisms using multiple platforms, including satellites, e.g., the EXPORTS program.
Phytoplankton cell size (diameters varying from 0.5 to (e.g., ) is a key trait that affects multiple phytoplankton characteristics (), as well as sinking rates (e.g., ; ; ; ). Phytoplankton size classes (PSCs) thus tend to closely correspond to phytoplankton functional types (PFTs; e.g., ). Importantly, phytoplankton cells also affect the inherent optical properties (IOPs) (e.g., absorption and backscattering coefficients) of the water column in a size-dependent manner (e.g., ; ; ; ). This is because particle size (relative to the incident light wavelength) is one of the governing variables affecting the magnitude and spectral shape of light scattering and absorption caused by a particle (e.g., ). Therefore the particle size distribution (PSD) of phytoplankton (and other suspended particles in seawater) is a key property affecting both optical properties and cellular physiological and biogeochemical properties; i.e., it is a fundamental property linking ocean color remote sensing and ecosystem/biogeochemical characteristics. The size distribution of particles suspended in near-surface ocean waters is often described as a power law, given in differential form as follows (e.g., ; ; ; ; ; ; ):
$$N(D) = N_0 \left(\frac{D}{D_0}\right)^{-\xi} \qquad (1)$$

where $D$ is particle diameter; $N(D)$ is the differential number concentration of particles per unit volume of seawater and per bin width of particle diameter; $N_0 = N(D_0)$ is the particle number concentration at a reference diameter $D_0$; and $\xi$ is the power-law slope of the PSD. Equation (1) has to be integrated over a given diameter range to obtain the total particle number concentration in that range.
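The integration mentioned above has a closed form for a power law. The following is a minimal sketch, assuming the PSD has the form N(D) = N0 (D/D0)^(-xi); the function name, the argument names, and the default reference diameter of 2 µm are illustrative assumptions, not values taken from the text.

```python
import math


def total_number_concentration(n0, xi, d_min, d_max, d0=2e-6):
    """Integrate N(D) = n0 * (D / d0)**(-xi) from d_min to d_max.

    n0    : differential number concentration at d0 (m^-4)
    xi    : power-law PSD slope (dimensionless)
    d_min : lower diameter limit (m)
    d_max : upper diameter limit (m)
    d0    : reference diameter (m); the 2 um default is an assumption
    Returns the total number concentration in the range (m^-3).
    """
    if d_min <= 0 or d_max <= d_min:
        raise ValueError("require 0 < d_min < d_max")
    if math.isclose(xi, 1.0):
        # Special case: the antiderivative of D**-1 is ln(D).
        return n0 * d0 * math.log(d_max / d_min)
    k = 1.0 - xi
    # Work with D / d0 to keep intermediate values well scaled.
    return n0 * d0 * ((d_max / d0) ** k - (d_min / d0) ** k) / k
```

For slopes above 1, the result is dominated by the lower integration limit, which is why the choice of minimum diameter matters for any number-based quantity derived from the PSD.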
Ocean color is quantified by the spectral shape and magnitude of the remote-sensing reflectance, (; also denoted simply for brevity below), where is the wavelength of light in vacuo. The Kostadinov–Siegel–Maritorena 2009 (KSM09, ) algorithm retrieves the parameters of an assumed power-law PSD ( and in Eq. ) from ocean color remote-sensing observations using the spectral shape and magnitude of the particulate backscattering coefficient, (). can be retrieved using existing inherent optical property (IOP) inversion algorithms; KSM09 uses the IOP inversion. Subsequently, the retrieved PSD parameters allow the quantification of absolute and fractional PSCs: picophytoplankton, nanophytoplankton, and microphytoplankton based on bio-volume or phytoplankton carbon (henceforth TK16) via allometric relationships . Phytoplankton carbon (phyto C) is the key variable of interest for carbon cycle and climate studies and modeling, and TK16 (data set available: ) represents a relatively unique carbon-based approach among PSC/PFT algorithms as it is based on knowledge of the PSD and allometric relationships to get at size-partitioned phyto C. and retrieve phytoplankton-specific PSD and size-partitioned phyto C based on the phytoplankton absorption coefficient.
The KSM09 PSD algorithm (and the TK16 phyto C/PSC products derived from it) is built on the assumption of a single population of particles (approximated by homogeneous spheres), representing backscattering due to the entire oceanic particle assemblage: phytoplankton cells and non-algal particles (NAPs). However, a particle's internal composition and shape influence its optical properties. Recent results suggest that the structural complexity of oceanic particles enhances backscattering significantly and can explain the so-called “missing backscattering” in the ocean, i.e., the lack of optical closure between theoretically modeled and measured backscattering. Coated spheres (i.e., spheres consisting of concentric layers/shells of different material properties) can be used to better represent phytoplankton cells and their internal heterogeneity and composition, and they have significantly enhanced backscattering compared to their homogeneous equivalents.
Here, we introduce a major improvement of the PSD algorithm formulation. Two separate particle populations are modeled: living phytoplankton cells and NAPs. Phytoplankton cells are modeled as coated spheres, following the Equivalent Algal Populations (EAP) framework . EAP explicitly models intracellular chlorophyll concentration, Chl, as governing the imaginary index of refraction and thus allows for bulk chlorophyll concentration (Chl) to be computed from a specific PSD. The coated-sphere EAP-based model is useful to better represent phytoplankton cells specifically; however, not all backscattering particles are phytoplankton , and in fact, sub-micron NAPs even smaller than the smallest autotroph ( in diameter) are critical for determining the spectral shape of , which is key for PSD retrieval with KSM09 and the algorithm presented here. Particles other than and smaller than phytoplankton are likely to significantly contribute to backscattering in spite of evidence that phytoplankton/larger particles contribute more than Mie theory predicts, based on homogeneous spheres (e.g., ). Thus, a two-component particle model is used here, separately modeling NAPs as homogeneous spheres of wider size range than phytoplankton, so that bulk of oceanic waters can be modeled (e.g., ; ; ). NAPs are modeled as having generally organic detrital composition, but with some allowance for higher indices of refraction to account for minerogenic particle contributions. The PSD forward model can thus also produce a first-order estimate of particulate organic carbon (POC) and the percent contribution of phytoplankton and NAPs to .
Subsequent sections present details of the two-component, EAP-based forward IOP model, the inversion methodology developed for operational application of the PSD algorithm, and the use of the satellite-derived PSD to retrieve derived products (following the methods of TK16 with some modifications), namely absolute and fractional size-partitioned phytoplankton carbon (henceforth phyto C) (i.e., carbon-based PSCs), as well as Chl and POC estimates. The novel algorithm is applied operationally to monthly data from the multi-sensor merged Ocean Colour Climate Change Initiative (OC-CCI) v5.0 data set ; examples are shown in the paper, and the entire data set is publicly available and linked below (see “Data availability”). We then present and discuss an initial effort of validation of the new PSD algorithm and derived products using global compilations of PSD, picophytoplankton carbon, and POC in situ data. A comparison with other existing methods to retrieve phyto C is presented. We also discuss algorithm uncertainties, assumptions, and limitations as well as future work directions.
2 Data and methods
2.1 Particle optical model input specification for phytoplankton and NAPs
The contributions of two separate particle populations to bulk backscattering are modeled using Mie theory for homogeneous spherical particles and the Aden–Kerker method for coated spheres. Living phytoplankton cells are represented by the first particle population, and all other suspended particles of any origin (i.e., NAPs) are represented by the second population. Living phytoplankton cells are modeled as coated spheres using the Equivalent Algal Populations (EAP) framework for determining optical model inputs, in particular the complex indices of refraction of the particle core and coat. NAPs are modeled as homogeneous spheres meant to represent organic detritus, but also allowing for their real index of refraction to vary over a wider range to take into account the contribution of mineral particles.
A characteristic of the PSD algorithm presented here is that it is mechanistic to the extent feasible, i.e., based on first principles and causality, even at the expense of increasing complexity. For example, as in EAP, the imaginary refractive index (RI) of the cell is a function of intracellular chlorophyll concentration, Chl. We vary some optical model inputs in a Monte Carlo simulation in order to assess uncertainty and base the PSD inversion on an ensemble of forward runs rather than a single set of inputs. Details of uncertainty estimation and propagation are given in Supplement Sect. S1. Details of how each input parameter for phytoplankton cells and for NAPs is specified, as well as the statistical distributions from which the Monte Carlo simulation instances were picked, are specified in Tables and .
As in the EAP model, the chloroplast is represented by the particle coat. Its relative volume, , is picked from a distribution as shown in Table . The chloroplast's imaginary refractive index (RI) (relative to seawater) at 675 , , is then computed as follows :
2 where Chl is the theoretical maximum specific absorption coefficient of chlorophyll at 675 when dissolved in water , Chl is the intracellular chlorophyll concentration in kilograms of chlorophyll per cubic meter of cellular material, and is seawater's absolute real RI at 675 . A hyperspectral basis vector from the EAP model (based on measurements; for details see ; ) is then scaled using the value at 675 from Eq. (), obtaining a hyperspectral relative imaginary RI for the coat as chloroplast. In Eq. (), Chl applies to the whole cell and is therefore scaled using to obtain for the coat alone. The nominal chloroplast's relative real RI is then picked from a distribution as shown in Table and modified as a function of its imaginary RI according to the Kramers–Kronig relations (implemented as a Hilbert transform) .
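The link between intracellular chlorophyll and the imaginary RI described above can be sketched from the general relation between the absorption coefficient of a bulk material and the imaginary part of its refractive index. This is a worked relation under stated assumptions, not a reproduction of the authors' Eq. (2): the symbols $a_{cm}$ (absorption coefficient of cellular material), $a^*_{sol}$ (specific absorption of chlorophyll in solution), $c_i$ (intracellular chlorophyll concentration), $n_{sw}$ (seawater real RI), and $V_s$ (coat relative volume) are notation chosen here, and all quantities are assumed to be in mutually consistent units.

$$a_{cm}(\lambda) = \frac{4\pi\, n'_{abs}(\lambda)}{\lambda}
\quad\Rightarrow\quad
n'(\lambda) = \frac{n'_{abs}(\lambda)}{n_{sw}(\lambda)} = \frac{\lambda\, a_{cm}(\lambda)}{4\pi\, n_{sw}(\lambda)},
\qquad a_{cm}(675) \approx a^*_{sol}(675)\, c_i$$

Here $\lambda$ is the wavelength in vacuo and $n'$ is the imaginary RI relative to seawater. If all of the pigment is assumed to reside in a coat occupying a fraction $V_s$ of the cell volume, the concentration in the coat is $c_i / V_s$, so the coat's imaginary RI is larger than the whole-cell value by the factor $1/V_s$.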
The cell cytoplasm is represented by the particle core. Its relative real RI is picked from a distribution given in Table , and it is modified by the Kramers–Kronig relations using a constant hyperspectral detritus-like imaginary RI, i.e., having a colored dissolved organic matter (CDOM)-like exponential spectral shape, resulting in a spectrally varying hyperspectral relative real RI. The phytoplankton particle population relative RIs and their Monte Carlo variability are summarized in Supplement Fig. S1.
Table 1. Inputs for the coated-sphere Aden–Kerker optical scattering computations for the phytoplankton particle population. Modeling inputs common to both phytoplankton and NAPs (see Table 2) are given in the first three table rows. $N(\mu, \sigma)$ stands for a normal distribution with mean $\mu$ and standard deviation $\sigma$.
Input parameter | Symbol, units, and notes |
---|---|
Pure seawater absolute real RI | computed using temperature 15 °C and salinity 33 |
PSD slope | [2.5, 6] in steps of 0.05 (the same value applies to both phytoplankton and NAPs) |
Wavelengths in vacuo | [400, 700] nm, hyperspectral in steps of 1 nm; band-averaging used for the nominal wavelengths of satellite sensors |
Phytoplankton population inputs | |
Intracellular chlorophyll concentration | [0.5, 10] kg Chl per m³ of cellular material, picked from $N(2.5, 2.5)$ |
Coat (chloroplast) relative volume | [5, 35] %, picked from $N(20, 5)$; this determines the coat thickness as a fraction of cell radius |
Coat (chloroplast) relative real RI | [1.06, 1.22], picked from $N(1.14, 0.08)$; wavelength-dependent via Kramers–Kronig relations |
Core (cytoplasm) relative real RI | [1.01, 1.03], picked from $N(1.02, 0.01)$; wavelength-dependent via Kramers–Kronig relations |
Coat (chloroplast) relative imaginary RI | computed from a hyperspectral basis vector that is scaled to the value at 675 nm using Eq. (2) |
Core (cytoplasm) relative imaginary RI | prescribed constant magnitude and detritus-like (exponential) spectral shape |
Minimum outer particle diameter | diameter of the smallest autotroph |
Maximum outer particle diameter | [20, 200] µm, picked from $N(50, 50)$ |
Differential number concentration at the reference diameter | default value used in the forward modeling |
Table 2. Inputs for the homogeneous-sphere Mie scattering code for the NAP population. Modeling inputs common to both phytoplankton and NAPs are given in the first three table rows of Table 1. $N(\mu, \sigma)$ stands for a normal distribution with mean $\mu$ and standard deviation $\sigma$.
NAP population inputs | |
---|---|
Relative real RI | [1.01, 1.2], picked from $N(1.02, 0.06)$; wavelength-dependent via Kramers–Kronig relations |
Relative imaginary RI | prescribed constant magnitude and detritus-like (exponential) spectral shape |
Minimum particle diameter | see Supplement Fig. S4 |
Maximum particle diameter | [200, 500] µm, picked from $N(400, 10)$ |
Differential number concentration at the reference diameter | twice the phytoplankton value, so that the value for the total particle population is three times the phytoplankton value; used in the forward modeling (this ratio constitutes an important model assumption, and it is discussed in the text) |
The NAP population is represented by a homogeneous sphere, the relative RIs of which are picked so that its absorption spectrum is detritus-like (same as the core of phytoplankton), and its real RI is allowed to vary over a wider range of values, meant to represent mostly organic detritus, but with some minerogenic contributions, resulting in a mean nominal relative real RI of 1.06. The input RIs and other input parameters for NAPs are summarized in Table .
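The Monte Carlo input selection described in Tables 1 and 2 can be sketched as draws from normal distributions restricted to the tabulated ranges. The text does not state how the ranges are enforced, so the rejection step below is an assumption, and the function and dictionary key names are hypothetical.

```python
import numpy as np


def sample_truncated_normal(rng, mean, sd, lo, hi, size):
    """Draw `size` values from N(mean, sd), rejecting any outside [lo, hi].

    Rejection sampling is an assumption; the source only gives the
    distribution and the permitted range for each input.
    """
    out = np.empty(size)
    filled = 0
    while filled < size:
        draw = rng.normal(mean, sd, size=size)
        keep = draw[(draw >= lo) & (draw <= hi)]
        n = min(keep.size, size - filled)
        out[filled:filled + n] = keep[:n]
        filled += n
    return out


def draw_inputs(n_runs=3000, seed=0):
    """Build one set of inputs per Monte Carlo run (values from the tables)."""
    rng = np.random.default_rng(seed)
    spec = {
        # name: (mean, sd, lower bound, upper bound)
        "chl_intracellular_kg_m3": (2.5, 2.5, 0.5, 10.0),
        "coat_relative_volume_pct": (20.0, 5.0, 5.0, 35.0),
        "coat_real_ri": (1.14, 0.08, 1.06, 1.22),
        "core_real_ri": (1.02, 0.01, 1.01, 1.03),
        "phyto_max_diameter_um": (50.0, 50.0, 20.0, 200.0),
        "nap_real_ri": (1.02, 0.06, 1.01, 1.2),
        "nap_max_diameter_um": (400.0, 10.0, 200.0, 500.0),
    }
    return {name: sample_truncated_normal(rng, *args, size=n_runs)
            for name, args in spec.items()}
```

Under this reading, truncating $N(1.02, 0.06)$ to [1.01, 1.2] shifts the sample mean of the NAP real RI to about 1.06, which is consistent with the mean nominal value quoted in the text.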
Specification of the input PSD parameters and the relationship of NAPs to phytoplankton PSDs is key to the construction of the forward and inverse models. Necessarily, some key simplifying assumptions are made here in order to construct an algorithm with operational application to modern multispectral ocean color sensors. The two key assumptions are that (1) phytoplankton and NAPs have a power-law PSD (Eq. ) with the same slope , and (2) the scaling parameter for NAPs is twice that of for phytoplankton (the forward model uses default values as in Tables and ). The latter assumption is chosen so that it results in a phyto C : POC ratio of (see ; ; and Sect. here) (as long as they are both estimated using the same size ranges). Together, these assumptions allow for the retrieval of one common PSD parameter set pertaining to the total particle population PSD (one value and one total equal to the linear sum of the NAPs and phytoplankton values).
2.2 Backscattering calculations
The backscattering efficiencies, $Q_{bb}$, for a single phytoplankton cell and a single non-algal particle were computed using the inputs described above in Sect. 2.1 and Tables 1 and 2. A coated-sphere code was used for both coated and homogeneous spheres. This code is included with the algorithm development scientific code of the PSD algorithm (see “Data availability”). Calculations were run for 3000 instances of Monte Carlo simulations, each with a unique randomly picked combination of inputs for phytoplankton and NAPs. This resulted in 3000 sets of hyperspectral $Q_{bb}$ values. High sampling resolution in diameter space was picked for the coated spheres (10 000 samples between minimum and maximum diameter) in order to minimize the influence of resonance spikes in $Q_{bb}$. For NAPs, 1000 diameter samples were used.
Indices of refraction for both phytoplankton and NAPs are specified hyperspectrally (Supplement Fig. S1), and the computations are performed from 400 to 700 wavelength in vacuo with a step of 1 , allowing the resulting hyperspectral values to be adapted for use with any combination of visible optical wavebands pertaining to recent and currently operating ocean color multispectral sensors or for planned (e.g., PACE ) or existing hyperspectral sensors.
Before calculation, hyperspectral backscattering efficiencies, , for each Monte Carlo run were first pre-processed by applying quality control and band-averaging using a moving-average 11 wide top-hat filter (using as central wavelengths the nominal bands of the following ocean color sensors: Sea-viewing Wide Field-of-view Sensor, SeaWiFS; Moderate Resolution Imaging Spectroradiometer, MODIS, Aqua; Medium Resolution Imaging Spectrometer, MERIS, and Ocean and Land Colour Instrument, OLCI; and the Visible and Infrared Imager/Radiometer Suite, VIIRS, on the Suomi National Polar-orbiting Partnership, S-NPP, plus 440 and 550 ), resulting in 19 unique bands for band-averaged backscattering efficiencies, denoted here as . The band-averaged spectral particulate backscattering coefficient, , was then calculated from the values and the input PSD as follows (e.g., ; ):
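The band-averaging step can be illustrated with a short sketch. It assumes a 1 nm wavelength grid and an 11 nm wide top-hat window centered on each nominal band (i.e., the center wavelength plus or minus 5 nm); the function and argument names are hypothetical.

```python
import numpy as np


def band_average(wavelengths, spectrum, centers, width=11):
    """Average a hyperspectral spectrum over top-hat windows.

    wavelengths : (n_wl,) wavelength grid in nm (1 nm steps assumed)
    spectrum    : (..., n_wl) values on that grid, e.g. Q_bb
    centers     : nominal band centers in nm
    width       : window width in nm (11 nm spans center +/- 5 nm)
    Returns an array of shape (..., n_bands).
    """
    wavelengths = np.asarray(wavelengths, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    half = (width - 1) / 2.0
    out = []
    for c in centers:
        mask = np.abs(wavelengths - c) <= half
        if not mask.any():
            raise ValueError(f"no samples within the window of band {c} nm")
        out.append(spectrum[..., mask].mean(axis=-1))
    return np.stack(out, axis=-1)
```

Bands closer than 5 nm to either end of the grid are averaged over fewer samples in this sketch, since the window is simply clipped to the available wavelengths.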
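$$b_{bp}(\lambda) = \int_{D_{\min}}^{D_{\max}} \frac{\pi}{4} D^2\, \bar{Q}_{bb}(D, \lambda, m)\, N_0 \left(\frac{D}{D_0}\right)^{-\xi} \mathrm{d}D \qquad (3)$$

where $\bar{Q}_{bb}$ is the band-averaged backscattering efficiency; $D_{\min}$ and $D_{\max}$ are the integration limits in diameter; $N_0$, $D_0$, and $\xi$ are the PSD scaling parameter, reference diameter, and slope, respectively; and $m$ is the complex index of refraction (specified separately for coat and core in the case of phytoplankton). Equation (3) is applied separately to the modeled phytoplankton and NAP $\bar{Q}_{bb}$ values and for each of the 3000 Monte Carlo runs. Band-averaged total $b_{bp}$ spectra are then calculated as the linear sum of phytoplankton and NAP backscattering.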
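A numerical version of this integral can be sketched as follows, assuming the standard form in which the backscattering coefficient is the integral of the geometric cross section, the backscattering efficiency, and a power-law PSD over diameter. The efficiencies would come from the scattering code; here they are simply an input array. Function and argument names, and the 2 µm default reference diameter, are assumptions for illustration.

```python
import numpy as np


def bbp_from_qbb(diam, qbb, n0, xi, d0=2e-6):
    """Integrate backscattering efficiencies over a power-law PSD.

    diam : (n_d,) ascending particle diameters in m
    qbb  : (n_d, n_bands) backscattering efficiencies per diameter and band
    n0   : differential number concentration at d0 (m^-4)
    xi   : PSD slope
    d0   : reference diameter in m (2 um default is an assumption)
    Returns b_bp per band in m^-1.
    """
    diam = np.asarray(diam, dtype=float)
    qbb = np.asarray(qbb, dtype=float)
    psd = n0 * (diam / d0) ** (-xi)                  # N(D), m^-4
    cross_section = (np.pi / 4.0) * diam ** 2        # geometric area, m^2
    integrand = (cross_section * psd)[:, None] * qbb
    # Trapezoid rule written out explicitly over the diameter axis.
    steps = np.diff(diam)[:, None]
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * steps, axis=0)
```

In a two-component setting, the function would be called once with the phytoplankton efficiencies, limits, and scaling parameter and once with the NAP ones, and the two results added to give the total spectrum.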
2.3 PSD retrieval via spectral angle mapping
2.3.1 End-member construction
Band-averaged total spectra were used to construct the backscattering end-members, , corresponding to specific input values of the PSD slope . First, individual total spectra from each Monte Carlo run () were normalized by the value at 555 . The median of all normalized spectra at each waveband was used as the end-member for each PSD slope, from to in steps of 0.05 (see Table ). This approach allows the isolation of the spectral shape (dependent on ) and spectral magnitude (dependent on ) (Eq. ). Using the hyperspectral underlying values, end-members can be constructed for any desired set of wavelengths.
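The end-member construction described above (normalize each Monte Carlo spectrum at 555 nm, then take the per-band median for each slope) can be sketched as below. The array layout and names are assumptions made for the example.

```python
import numpy as np


def build_end_members(bbp, bands, ref_band=555):
    """Build one normalized end-member spectrum per PSD slope.

    bbp      : (n_runs, n_slopes, n_bands) modeled total backscattering
    bands    : sequence of band centers in nm, length n_bands
    ref_band : band used for normalization (must be present in `bands`)
    Returns an (n_slopes, n_bands) array of median normalized spectra.
    """
    bbp = np.asarray(bbp, dtype=float)
    ref = list(bands).index(ref_band)
    # Divide every spectrum by its own value at the reference band.
    normalized = bbp / bbp[:, :, ref:ref + 1]
    # Median across Monte Carlo runs, band by band.
    return np.median(normalized, axis=0)
```

Normalizing before taking the median removes the magnitude of each run, so the end-members carry only spectral shape; every end-member equals 1 at the reference band by construction.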
2.3.2 PSD parameter retrieval and operational application to OC-CCI ocean color data
The PSD parameters and are retrieved using the backscattering end-members, , via the spectral angle mapping (SAM) technique (e.g., ). Briefly, the end-members and satellite-observed spectra are treated as -dimensional vectors, where is the number of bands. The spectral angle between a given end-member and the observed spectrum is then calculated using the vector dot product as
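$$\Theta = \cos^{-1}\left(\frac{\mathbf{b}^{EM} \cdot \mathbf{b}^{obs}}{\lVert \mathbf{b}^{EM} \rVert\, \lVert \mathbf{b}^{obs} \rVert}\right) \qquad (4)$$

where $\Theta$ is the spectral angle, $\mathbf{b}^{EM}$ is the end-member backscattering vector, and $\mathbf{b}^{obs}$ is the observed backscattering vector.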
Thus, spectral angle is an index of spectral shape similarity between two spectra, with more similar spectral shapes resulting in lower spectral angles. Equation () was used to calculate the spectral angle between each of the 71 end-members, , and the input observed spectrum. The value of corresponding to the smallest spectral angle is then assigned as the retrieved PSD slope. Three wavebands were used, namely 490, 510, and 550 . For operational application to OC-CCI v5.0 remote-sensing reflectance () data (which do not have the 550 band), band-shifting was applied to the input to estimate the corresponding , which is used in the IOP inversion. The band-shifting was constructed using the band ratios between the respective original and target bands from a hyperspectral run of the (MM01) model. No other bands were shifted.
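The spectral angle mapping step can be sketched in a few lines: compute the angle between the observed spectrum and every end-member, then pick the slope of the closest one. Names are hypothetical, and the end-members are assumed to be supplied as an array with one row per slope.

```python
import numpy as np


def spectral_angle(a, b):
    """Angle in radians between spectra treated as vectors (last axis = bands)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos_t = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    )
    # Clip guards against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(cos_t, -1.0, 1.0))


def retrieve_slope(observed, end_members, slopes):
    """Return the slope whose end-member has the smallest spectral angle.

    observed    : (n_bands,) observed backscattering spectrum
    end_members : (n_slopes, n_bands) end-member spectra
    slopes      : (n_slopes,) slope value for each end-member
    """
    angles = spectral_angle(end_members, observed)
    best = int(np.argmin(angles))
    return float(slopes[best]), float(angles[best])
```

Because the angle is unchanged when either vector is multiplied by a positive constant, the match depends on spectral shape only; magnitude is handled separately through the scaling parameter.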
The parameter is subsequently retrieved as the ratio of (1) the satellite-observed value of and (2) the median value of the quantity corresponding to the end-member class of the retrieved and all statistically similar classes (see Supplement Sect. S1) across all Monte Carlo simulations.
2.4 Derived products: size-partitioned phytoplankton carbon, PSCs, POC, and chlorophyll
Once the PSD parameters are known, they can be used to compute derived products. Phytoplankton carbon in any size class, spanning a given range of cell diameters, can be estimated as
5 where , and () for the total PSD is the satellite-retrieved parameter from total particulate backscattering; the other PSD parameters are as in Eq. (). Equation () was used to compute size-partitioned phyto C in three size classes – picophytoplankton (0.2 to 2 in diameter), nanophytoplankton (2 to 20 in diameter), and microphytoplankton (20 to 50 in diameter) – as well as total phyto C as the sum of the three classes. Carbon-based PSCs are defined as the fractional contribution of each of the three size classes to total phyto C . Given the first-order correspondence between PSCs and PFTs (e.g., ), these PSCs can also be interpreted as PFTs. The allometric coefficients of are used here, namely and ; when cell volume is expressed in cubic micrometers, cellular carbon is computed in picograms of carbon per cell using these coefficients (Eq. ; see also ). Phyto C in Eq. is given in milligrams per cubic meter; the conversion factors in Eq. () are used to convert from cubic meters to cubic micrometers and from picograms to milligrams of carbon . The factor of is an assumption of the model (Tables and ). Thus, an estimate of POC (computed using the same size limits as total phyto C) was calculated as 3 phyto C.
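The fractional, carbon-based size classes follow from the PSD slope and the allometric exponent alone, which the sketch below illustrates. It assumes cellular carbon scales as a V^b with V proportional to D^3, and a power-law PSD with slope xi, so that carbon per unit diameter scales as D^(3b - xi); the scaling parameter, the coefficient a, and the unit conversions cancel in the fractions. Function names are hypothetical, and any value of b passed in is the caller's choice rather than a value taken from the text.

```python
import math

# Size class limits in micrometers, as stated in the text.
SIZE_CLASSES_UM = {
    "pico": (0.2, 2.0),
    "nano": (2.0, 20.0),
    "micro": (20.0, 50.0),
}


def _power_integral(p, lo, hi):
    """Integral of D**p over [lo, hi]."""
    if math.isclose(p, -1.0):
        return math.log(hi / lo)
    return (hi ** (p + 1.0) - lo ** (p + 1.0)) / (p + 1.0)


def carbon_size_fractions(xi, b):
    """Fraction of total phyto C in each size class.

    xi : power-law PSD slope
    b  : allometric exponent in carbon = a * volume**b
    """
    p = 3.0 * b - xi
    parts = {name: _power_integral(p, lo, hi)
             for name, (lo, hi) in SIZE_CLASSES_UM.items()}
    total = sum(parts.values())
    return {name: value / total for name, value in parts.items()}
```

A steeper slope shifts carbon toward the pico class, which matches the qualitative expectation that oligotrophic waters with high slopes are dominated by picophytoplankton.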
Chlorophyll concentration was estimated from the PSD retrievals and the input intracellular chlorophyll concentration, Chl (Table ; ), as follows: 6
In Eq. (6), Chl, , , and all have to be expressed in consistent units so that Chl is obtained in milligrams per cubic meter. Here we use the median Chl across all Monte Carlo simulations to produce a single Chl estimate.
2.5 Validation and comparison
A data set of near-surface in situ PSD measurements was compiled for validation of the PSD parameter products, $\xi$ and $N_0$ (Eq. 1). The data set consists of Coulter counter and Laser In-Situ Scattering and Transmissometry (LISST) measurements and a small set of PSDs derived from multiple instruments and modeling. Specifically, the compilation consists of the following data sets: (1) a compilation of several data sets of Coulter counter measurements, as used in the KSM09 algorithm validation; (2) LISST-100X (Sequoia Scientific©) measurements from the Plumes and Blooms project in the Santa Barbara Channel; (3) Coulter counter measurements from the Atlantic Meridional Transect cruise no. 26 (AMT26); (4) in-line LISST-100X measurements from the cruises North Atlantic Aerosols and Marine Ecosystems Study 3 and 4 (NAAMES;
The compiled PSD data set was used to fit for the PSD parameters of Eq. () using the 2 to 20 diameter range. One data point was removed from the 2018 EXPORTS PSD data due to a poor fit to a power-law PSD. These in situ estimates were matched to satellite OC-CCI v5.0 using the same matching methods described below for POC and picophytoplankton carbon data. Matched reflectances were used as input to the novel PSD algorithm presented here. The in situ and satellite PSD parameters were then compared using a type II linear regression and several additional algorithm performance metrics (e.g., ), details of which are given in the caption of Fig. .
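Fitting a power law to a measured PSD over the 2 to 20 µm range can be sketched with an ordinary least-squares fit in log-log space. The text does not specify the fitting procedure, so this choice is an assumption, as are the function name and the 2 µm reference diameter.

```python
import numpy as np


def fit_power_law_psd(diam_um, n_d, d0_um=2.0, fit_range_um=(2.0, 20.0)):
    """Fit N(D) = N0 * (D / D0)**(-xi) to binned PSD data.

    diam_um      : bin-center diameters in micrometers
    n_d          : differential number concentrations at those diameters
    d0_um        : reference diameter in micrometers (assumed)
    fit_range_um : diameter range used for the fit
    Returns (xi, n0), with n0 in the same units as n_d.
    """
    diam_um = np.asarray(diam_um, dtype=float)
    n_d = np.asarray(n_d, dtype=float)
    lo, hi = fit_range_um
    mask = (diam_um >= lo) & (diam_um <= hi) & (n_d > 0)
    if mask.sum() < 2:
        raise ValueError("need at least two valid bins in the fit range")
    x = np.log10(diam_um[mask] / d0_um)
    y = np.log10(n_d[mask])
    # Straight line in log-log space: y = log10(n0) - xi * x.
    slope, intercept = np.polyfit(x, y, 1)
    return -slope, 10.0 ** intercept
```

A poor fit of the kind that led to the removal of one data point could be flagged by inspecting the residuals of `y` about the fitted line; the acceptance criterion would need to come from the original analysis.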
A large compilation of in situ POC data was collected from various public databases and private contributors and was used here to perform match-ups with satellite OC-CCI v5.0 data. In addition to the POC data (1997–2012) used in for algorithm validation (), this study also incorporated recent in situ POC data (2013–2020) from the SeaWiFS Bio-optical Archive and Storage System archive (
The in situ picophytoplankton carbon data set compiled and used for algorithm inter-comparison as part of the ESA POCO project was used here to generate match-ups with satellite OC-CCI v5.0 data for further validation. Match-ups were generated in the same way as described above for POC.
All in situ data described above were excluded from the validation if any of the following conditions were met: (1) average bathymetric depth from a 9 buffer around the in situ sample location was less than or equal to 200 , or any grid cell elevations in that buffer were 0 or higher, using a downsampled, 4 version of the NOAA ETOPO1 data set (
All duplicate in situ match-ups (in the sense of multiple in situ data points that are close in space and time and receiving the same satellite match-up) were combined into a single match-up point as follows: for the PSD, the medians of the PSD measurements from the NAAMES and EXPORTS cruises in each LISST size bin for such duplicates were used for the calculation of in situ PSD parameters (a large number of duplicates since the data are in-line); for the rest of the PSD data, the fit PSD parameters themselves were averaged (a small number of duplicates). For the POC (large number of duplicates) and picophytoplankton carbon data (small number of duplicates), the averages of the duplicate in situ data were used.
In addition to validation against in situ measurements of the PSD, POC, and picophytoplankton carbon, satellite chlorophyll (Chl) retrievals (using the standard algorithm of OC-CCI v5.0 at the match-up points) were compared with Chl estimated using the PSD retrieval (Eq. ). Finally, global algorithm retrievals of total phyto C for May 2015 (using OC-CCI v5.0 data as input) were also compared with two alternative methods of retrieving phyto C: (1) the algorithm and (2) the algorithm, as implemented by NASA's Ocean Biology Processing Group (OBPG). Modeling and processing of results presented here are done using the sinusoidal projection images; maps presented here are given in equidistant cylindrical projection (i.e., unprojected latitude/longitude).
3 Results and discussion
3.1 Forward and inverse modeling
The first step in the algorithm development is the generation of 3000 Monte Carlo realizations of backscattering efficiencies as a function of particle diameter and wavelength. The important differences between backscattering efficiencies of homogeneous and coated particles are discussed in Supplement Sect. S2. Here, we continue the discussion with the resulting integrated backscattering spectra (Eq. 3). Hyperspectral $b_{bp}$ spectra modeled using a single forward optical model run are shown in Fig. 1. The computations use the median values of inputs varied in the Monte Carlo simulations (Tables 1 and 2). These normalized spectra illustrate the strong dependence of spectral shape on the PSD slope $\xi$. Phytoplankton spectral shapes are complex, with various peaks and troughs near the absorption peaks of chlorophyll, but are more linear in the 490 to 550 nm range, which is the one used for the multispectral operational PSD algorithm. Regardless, the SAM retrieval methodology used here allows for any spectral shape and does not impose a power-law fit to the shape of $b_{bp}$, as is done in KSM09. NAP backscattering exhibits smooth shapes due to the smooth shape of their absorption (Fig. 1b). Fundamentally, it is evident from Fig. 1a and b that the higher the PSD slope $\xi$, the steeper the spectral shape becomes, with higher values in the blue, since smaller particles dominate the signal. This dependence is at the root of the principle of operation of the PSD algorithm. For completeness, corresponding absorption spectra are illustrated in Supplement Fig. S3.
Figure 1. Modeled hyperspectral backscattering coefficient of (a) phytoplankton, using EAP-based coated-sphere scattering computations, and (b) NAPs, modeled as homogeneous spheres, as a function of the input power-law PSD slope (color-coded solid lines, as in legend). All spectra are shown normalized to the respective values at 555 nm. See Sect. for more details.
The 71 end-members (EMs) created for operational application to existing major satellite ocean color missions and corresponding to PSD slope values between 2.5 and 6.0 with a step of 0.05 are displayed in Fig. a. They represent the modeled spectra against which satellite-measured spectra are compared using the SAM method (Eq. ). The spectral shape dependence on demonstrates the theoretical ability to retrieve this parameter from space.
Figure 2. (a) Normalized spectral shapes of the end-members (EMs) developed for spectral angle mapping (SAM), shown at the 19 unique wavelengths used for band-averaging (see Sect. ), and for PSD slopes as in the color legend. (b) The fraction of $b_{bp}$ due to phytoplankton as a function of the PSD slope. The wavelengths shown are indicated in the legend (nm). The means across all 3000 Monte Carlo simulations are shown. (c) Uncertainties in the PSD slope retrieval using the SAM method, for each EM. Shown are the minimum and maximum value of the PSD slope for all end-member classes that are statistically similar to the given EM, according to the Kruskal–Wallis ANOVA (Supplement Sect. S1) (left axis), and the resulting range of PSD slopes (right axis) falling within these asymmetric uncertainty bounds. (d) Statistics of the ratio used to retrieve the $N_0$ parameter for each EM, calculated for all 3000 Monte Carlo simulations and across all neighboring EM classes determined to be statistically similar to the given EM. $\mu$ in the legend stands for the mean, and $\sigma$ is the standard deviation. The standard deviation of this ratio is used to estimate uncertainties in the $N_0$ retrieval.
An important question in bio-optical oceanography is determining the sources of backscattering in the ocean and their relative contributions. This is still an unresolved issue , though progress has been made (e.g., ; ; ). This issue is of central importance to the PSD model, as it assigns varying fractions of the signal to phytoplankton vs. NAPs under certain assumptions (Tables and ). Since in the two-component PSD model presented here phytoplankton and NAP are modeled separately, the fraction of due to phytoplankton vs. NAPs can be calculated. For a given PSD slope and wavelength, the assumptions of the model dictate fixed fractional contributions by NAPs and phytoplankton to total , which are given in Fig. b. There is variability by wavelength, but the first-order variability is driven by the PSD slope, namely at low values (), phytoplankton contribution to is on the order of 30 % to 50 %, and it drops to near 0 % for higher slopes as approaches 6.0. The curves are not monotonic, and peak phytoplankton contribution to occurs at .
The fractional contributions of Fig. b are derived from the forward theoretical modeling, and they are influenced by all model assumptions and are not validated independently. In particular, the decisions on integration diameter limits for NAPs and phytoplankton, as well as on the distributions of the indices of refraction for phytoplankton and NAPs, will have a strong influence on these values. Since NAPs are permitted here to have higher RIs than RIs typical of organic detritus only, if NAPs were strongly dominated by or composed only of organic particles, then NAP contribution to would be overestimated here. Of course these RIs are likely to be spatially and temporally variable, and the algorithm can be further improved by investigating and implementing such variability. estimated global absolute due to NAPs and its fractional contribution to total using analysis of correlations with Chl. Qualitatively and to first order, their global pattern of percent due to NAPs agrees with the model results reported here, i.e., low relative NAP contributions at high latitudes and in eutrophic areas and higher relative contributions in more oligotrophic areas such as the fringes of the subtropical gyres (they exclude the gyres from their analysis) (cf. their Fig. 2c and b here). Note that the estimate pertains only to NAPs non-covarying with Chl, making comparison harder. Further investigation is warranted to more rigorously compare their product to the values implicit in the PSD algorithm described here. Apart from analyzing the relative contribution of phytoplankton vs. NAPs to total , it is of interest to investigate the relative contributions of various size ranges to the modeled backscattering coefficient. This is illustrated in Supplement Fig. S4 and further discussed in Supplement Sect. S3.
The uncertainty in PSD slope retrieval is illustrated in Fig. c. These estimates are not symmetric about the value and are derived via Kruskal–Wallis analysis of variance to determine class similarity (Supplement Sect. S1). As in KSM09, the general tendency is for the range of uncertainty in to increase for lower PSD slopes, but it is always less than 0.5. The uncertainty in the ratio used to retrieve the parameter is shown in Fig. d in log10 space. Mean and median values are similar, and the uncertainty about them does not vary much with PSD slope, also similarly to KSM09. The uncertainty in Fig. d at each value includes all statistically similar classes of EMs.
3.2 Operational application of the PSD–phyto C algorithm to OC-CCI v5.0 merged satellite data
3.2.1 PSD parameters
The operational PSD algorithm presented here was applied to the monthly 4 km OC-CCI v5.0 data set . Both PSD parameters ( and ; Eq. ) and derived products were generated (Sect. and ). These data and their monthly and overall climatologies (and associated uncertainties) are made publicly available (see “Data availability”). Here, we use May 2015 data to illustrate and discuss the new algorithm.
The PSD slope map (Fig. a) reveals a global spatial pattern consistent with expectations and with KSM09; namely the subtropical oligotrophic gyres are characterized by high PSD slopes (relatively high numerical dominance of small particles), whereas more eutrophic areas such as coastal areas, equatorial upwelling zones, and high latitudes exhibit lower slopes (increasing relative abundance of larger particles). This is consistent with oligotrophic ocean ecosystems being dominated by picophytoplankton, whereas microphytoplankton contribute significantly to the phytoplankton assemblage in eutrophic areas and during blooms (e.g., ; ; and references therein). PSD slope values retrieved by the SAM-based algorithm span the full modeled range of . This is in contrast to KSM09, where values below 3.0 were not retrieved. The PSD parameter (Eq. ) is, as expected, higher in coastal, high-latitude, and eutrophic areas (indicating higher particle loads) and lower in the oligotrophic subtropical gyres (Fig. b). varies over a few orders of magnitude. Note 's units of (Eq. ) and that care should be taken when comparing Eq. () and to other formulations of the PSD, e.g., the parameter in , as these are related but not equivalent (see also ).
Figure 3
Example operational retrievals of the PSD parameters (Eq. ) and their uncertainties, using monthly OC-CCI v5.0 data for May 2015: (a) PSD slope , (b) parameter ( in log10 space), (c) uncertainty range for , and (d) standard deviation (SD) of log10 of .
[Figure omitted. See PDF]
Algorithm uncertainties are provided on a per-pixel basis. The uncertainty range estimates for (Fig. c) (not necessarily symmetric about the value) indicate that the gyres are characterized by lower uncertainties than the more eutrophic areas, as can be expected from Fig. c. These are partial uncertainty estimates, including those quantifiable and internal to the modeling, i.e., due to Mie parameter choices. Additional uncertainties inherent in the input OC-CCI values and those due to the IOP inversion algorithm used are not included in Fig. c and d and in subsequent propagated errors. Uncertainties in the parameter are more uniform spatially but higher in the gyres (Fig. d). Note that those are given in log10 space as a standard deviation, and a relatively small absolute value of the uncertainty translates to relatively large uncertainties in absolute particle concentrations.
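Because the uncertainty in the scaling parameter is reported as a standard deviation in log10 space, it acts as a multiplicative factor in linear units. The following is a minimal Python sketch of that conversion; the function names and the example numbers are hypothetical and are not taken from the data set.

```python
import math
from typing import Tuple


def log10_sd_to_factor(sd_log10: float) -> float:
    """Multiplicative factor corresponding to one SD in log10 space."""
    return 10.0 ** sd_log10


def one_sigma_interval(value: float, sd_log10: float) -> Tuple[float, float]:
    """Lower and upper 1-SD bounds in linear units for a quantity whose
    uncertainty is given as the standard deviation of log10(value)."""
    factor = log10_sd_to_factor(sd_log10)
    return value / factor, value * factor


# Hypothetical illustration: a retrieved value of 1000 (arbitrary units)
# with an assumed SD of 0.3 in log10 space.
low, high = one_sigma_interval(1000.0, 0.3)
print(f"factor = {log10_sd_to_factor(0.3):.2f}, interval = [{low:.0f}, {high:.0f}]")
```

Under these assumed numbers, a seemingly small SD of 0.3 in log10 space already corresponds to roughly a factor of 2 in either direction in linear space.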
3.2.2 Phytoplankton carbon and carbon-based PSCs, POC, and chlorophyll from the PSD
Global patterns of total phytoplankton carbon (phyto C) retrieved via the PSD and allometric relationships (Fig. a) exhibit the expected lower values in the oligotrophic gyres and higher values elsewhere. Similarly to the results of , values range over approximately 3 orders of magnitude, which is a higher range than retrievals based on other methods, namely direct empirical algorithm POC retrieval or the method of scaling backscattering, and it is also higher than the range in CMIP5 model ensembles (cf. Fig. 1 in ). This putative underestimation in the gyres and overestimation in eutrophic areas suggests the need for algorithm tuning, which is discussed in Sect. along with implications of validation results. Global validation of phyto C retrievals with analytical phyto C measurements is planned but is currently challenging as phytoplankton-specific carbon data are relatively novel and still scarce. Here, an initial validation effort is undertaken using several other variables; see Sect. .
Figure 4
Example operational PSD-based retrievals of total and size-partitioned phytoplankton carbon, using monthly OC-CCI v5.0 data for May 2015: total phytoplankton carbon (a), picophytoplankton carbon (b), nanophytoplankton carbon (c), and microphytoplankton carbon (d). Units are milligrams per cubic meter, mapped in log10 space. The diameter limits for the three size classes are picophytoplankton (0.2 to 2 µm), nanophytoplankton (2 to 20 µm), and microphytoplankton (20 to 50 µm).
[Figure omitted. See PDF]
A key feature of the PSD-based algorithm is that phyto C can be partitioned into any number of size classes by choosing appropriate integration limits of Eq. (). Absolute concentrations of pico-, nano-, and microphytoplankton are illustrated for May 2015 in Fig. b, c, and d, respectively. Picophytoplankton C is mapped on the same color scale as total phyto C (Fig. a), but nano- and microphytoplankton C maps have differing scales, illustrating that while picophytoplankton C varies over 3 orders of magnitude spatially and globally, nanophytoplankton C varies over 4–5 orders of magnitude, and microphytoplankton varies over 7 orders of magnitude (see also ; ). Note that empirical tuning will affect these ranges of variability; see Sect. . Fractional contributions of each of the three PSCs used here to total phyto C are illustrated in Fig. . Picophytoplankton dominate much of the open-ocean, lower-latitude, oligotrophic areas, contributing nearly 100 % of the carbon biomass there (Fig. a); nanophytoplankton (Fig. 5b) contribute up to 50 % of the biomass in the higher-latitude and more eutrophic areas; and microphytoplankton (Fig. 5c) contribute significantly only in the most eutrophic areas, e.g., during the North Atlantic bloom at 45–50° N (May 2015 is shown). As previously noted , this general pattern is consistent with current understanding of ocean ecosystems.
Figure 5
Example operational retrieval of the percent contribution of each phytoplankton size class (PSC) to total phytoplankton carbon, determined via the PSD (Sect. ). Retrievals use monthly OC-CCI v5.0 data for May 2015. The PSCs are picophytoplankton (a), nanophytoplankton (b), and microphytoplankton (c).
[Figure omitted. See PDF]
The fractional carbon-based PSCs (Fig. ) are ratios of two integrals of Eq. (); thus they are analytical functions of the PSD slope and the allometric coefficient (as well as the limits of integration used for each class and total phyto C). These functions are plotted in Fig. , together with the satellite-observed histogram for May 2015, illustrating the most common values for the PSCs found in the ocean. Area-wise, the ocean is dominated by oligotrophic regions with high picophytoplankton contributions to C biomass.
Figure 6
Percent contribution of each PSC to total phytoplankton carbon (blue curves as in legend, left axis) as a function of the PSD slope . A histogram of the PSD slope from the (sinusoidal projection) OC-CCI v5.0-based image for May 2015 is shown in the background as brown bars (right axis). The three PSC curves are analytically derived from the model, and no satellite data are used in producing them.
[Figure omitted. See PDF]
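The statement that the fractional PSCs are ratios of two integrals can be illustrated with a short Python sketch. It assumes a pure power-law PSD, spherical cells, and an allometric relation of the form carbon per cell = a V^b, so that the carbon integrand is proportional to D^(3b - slope) and both the PSD scaling parameter and a cancel in the ratio. The size limits follow the Fig. 4 caption; the exponent value b = 0.86 and all function names are hypothetical placeholders, not the coefficients used in the paper.

```python
import math
from typing import Dict, Tuple

# Diameter limits (micrometres) for each size class, as in the Fig. 4 caption.
SIZE_CLASSES: Dict[str, Tuple[float, float]] = {
    "pico": (0.2, 2.0),
    "nano": (2.0, 20.0),
    "micro": (20.0, 50.0),
}


def power_integral(p: float, d_min: float, d_max: float) -> float:
    """Integral of D**p dD from d_min to d_max (handles the p = -1 case)."""
    if abs(p + 1.0) < 1e-12:
        return math.log(d_max / d_min)
    return (d_max ** (p + 1.0) - d_min ** (p + 1.0)) / (p + 1.0)


def psc_fractions(psd_slope: float, b: float) -> Dict[str, float]:
    """Fraction of total phytoplankton carbon in each size class.

    Assumes N(D) ~ D**(-psd_slope) and carbon per cell ~ V**b with
    V ~ D**3, so the carbon integrand is D**(3*b - psd_slope).
    """
    p = 3.0 * b - psd_slope
    parts = {name: power_integral(p, lo, hi) for name, (lo, hi) in SIZE_CLASSES.items()}
    total = sum(parts.values())
    return {name: value / total for name, value in parts.items()}


# Hypothetical allometric exponent, for illustration only.
B_EXPONENT = 0.86
for slope in (3.0, 4.0, 5.5):
    fractions = psc_fractions(slope, B_EXPONENT)
    print(slope, {k: round(100.0 * v, 1) for k, v in fractions.items()})
```

With these assumptions, picophytoplankton dominate the carbon at high PSD slopes and larger classes gain importance at low slopes, which is the qualitative behavior described for Fig. 6.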
As an illustration of uncertainty propagation to derived products, the propagated uncertainty in total phyto C (Fig. a) and fractional picophytoplankton C biomass (Fig. b) is shown. Comparison of Fig. a with Fig. a indicates that absolute total phyto C uncertainties are of the same order of magnitude as the values themselves. This is a partial uncertainty estimation due to the assumed distributions of the Mie inputs (Tables and ) and due to the allometric coefficients. The Mie inputs are varied over wide ranges to accommodate various environments in the global ocean, with the goal of having a single first-principles-based operational algorithm applicable to first order globally. This increases the uncertainty estimates. The uncertainty for the fractional PSC products depends only on the uncertainties in and ; thus they exhibit much lower internal algorithm uncertainty compared with absolute values. For picophytoplankton, they are % for the oligotrophic gyres and do not exceed % globally. This suggests that the fractional PSCs are more reliable products than the absolute values, and they can also be used with other products to partition them, e.g., total phytoplankton carbon estimated using the alternative methods shown in Fig. (namely , and ) or the or methods; POC products (e.g., ) can be partitioned this way as well. Per-pixel uncertainties are estimated for all products and composite imagery (climatologies) as well and are provided with the OC-CCI-based data set associated with this paper (“Data availability”).
Figure 7
Propagated uncertainty in (a) total phytoplankton carbon given as 1 standard deviation (in milligrams per cubic meter), mapped in log10 space, and (b) fractional contribution of picophytoplankton to total phytoplankton carbon, given as 1 standard deviation in percent. (c) POC derived using the PSD retrievals (in milligrams per cubic meter), mapped in log10 space, and (d) chlorophyll concentration (Chl) derived using the PSD retrievals (in milligrams per cubic meter), mapped in log10 space. Monthly OC-CCI v5.0-based data for May 2015 are shown in all panels.
[Figure omitted. See PDF]
The formulation of the PSD algorithm allows for both POC and Chl (Eq. ) to be estimated from the retrieved PSD. Due to the assumptions used, POC is phyto C multiplied by 3 (Fig. c). This is strictly true only if the POC estimate uses the same limits of integration as phyto C, which is an approximation of the usual POC operational definition (see, e.g., discussion of POC–PSD closure analysis in ). POC is thus estimated to first order, treating the retrieved NAPs as being composed of POC only and applying the same allometric relationship to NAPs as to phyto C, in spite of the fact that the assumed RI distribution of the NAPs is broader (Table ). These are simplifying assumptions of the two-component model; a more accurate POC representation can be achieved if organic and inorganic NAPs are modeled as separate particle populations (e.g., ). This is a planned development of the model in the future; the goal here is to build an operational PSD–phyto C algorithm (based on first principles, as mechanistic as feasible) for use with multispectral satellite data of limited degrees of freedom. Hyperspectral sensors such as PACE should allow for some more degrees of freedom and thus for more independent particle components and their PSDs to be modeled separately. However, note that even hyperspectral data have limits on their degrees of freedom, which are expected to be much fewer than the number of sensor bands . An important benefit of POC is that it is a widely observed variable, available for global validation efforts (Sect. ), as is Chl. Similarly to POC, there are benefits of the PSD-derived estimate of Chl (Fig. d); it can be used as additional verification/validation of model retrievals, and/or PSD-retrieved Chl can be used as a parameter to optimize (in algorithm tuning), as discussed shortly (Sect. ). Next, we discuss validation/verification and tuning efforts in which both PSD-derived POC and PSD-derived Chl are used.
Figure 8
(a) Comparison of PSD slope derived from in situ measurements with the matched-up satellite retrieval (Sect. ). Points are color-coded according to the corresponding matched-up satellite OC-CCI v5.0 Chl (colormap in milligrams per cubic meter in log10 space). In this figure (as well as in Figs. 10 and 11 and Supplement Figs. S6, S9, and S10), type II (reduced major axis, RMA) regression is used, and regression and validation statistics are given in the figure panels; “y-int” stands for the intercept, RMS is root mean square (square root of the mean of squared differences between the in situ and satellite values), bias is the mean of the satellite minus in situ values, and MAE is the mean absolute error (the mean of the absolute values of the differences between the in situ and satellite values) (e.g., ). The values in parentheses are the standard deviations of the slope and intercept, respectively. (b) Same as in panel (a) for the parameter (Eq. ) (axes in log10 space).
[Figure omitted. See PDF]
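The validation statistics defined in the Fig. 8 caption can be written out as a minimal Python sketch. It assumes in situ values are treated as x and satellite values as y, and uses the standard reduced major axis estimator (slope equal to the ratio of standard deviations, signed by the covariance); the function names and sample numbers are hypothetical, and the standard deviations of the slope and intercept are not computed here.

```python
import math
from typing import Dict, Sequence


def rma_regression(x: Sequence[float], y: Sequence[float]) -> Dict[str, float]:
    """Type II (reduced major axis) regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sign = 1.0 if sxy >= 0.0 else -1.0
    slope = sign * math.sqrt(syy / sxx)      # ratio of SDs, signed by covariance
    intercept = mean_y - slope * mean_x      # line passes through the centroid
    r_squared = (sxy * sxy) / (sxx * syy)
    return {"slope": slope, "y_int": intercept, "r2": r_squared}


def match_up_stats(in_situ: Sequence[float], satellite: Sequence[float]) -> Dict[str, float]:
    """Bias, RMS, and MAE as defined in the caption (satellite minus in situ)."""
    diffs = [s - i for i, s in zip(in_situ, satellite)]
    n = len(diffs)
    return {
        "bias": sum(diffs) / n,
        "rms": math.sqrt(sum(d * d for d in diffs) / n),
        "mae": sum(abs(d) for d in diffs) / n,
    }


# Hypothetical match-up pairs (e.g., PSD slopes), for illustration only.
in_situ_vals = [3.2, 3.6, 4.0, 4.1, 3.9]
satellite_vals = [3.0, 3.9, 4.6, 5.2, 4.4]
print(rma_regression(in_situ_vals, satellite_vals))
print(match_up_stats(in_situ_vals, satellite_vals))
```

Unlike ordinary least squares, the RMA slope does not assume that the x variable is error-free, which is why it is a common choice when both in situ and satellite values carry uncertainty.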
3.3 Algorithm validation, comparisons, and empirical tuning for the PSD parameter and absolute concentrations
In an initial validation effort, the novel PSD–phyto C algorithm is validated/verified using several variables. It is challenging to globally and thoroughly directly validate the major products of the algorithm – the PSD and size-partitioned phyto C – due to a paucity of globally spanning in situ observations, which are further reduced when performing satellite match-ups. Here, we validate or verify algorithm performance against compilations of the following variables: (1) in situ PSD observations (Sect. ), (2) in situ POC observations, (3) in situ picophytoplankton C observations, and (4) concurrent satellite observations of Chl. Maps of the locations of in situ observations are shown in Supplement Fig. S5. In addition, we compare phyto C retrievals to several existing methods using the example May 2015 OC-CCI v5.0 image. Further, based on these results, we suggest an empirical tuning of the algorithm.
Validation results for the PSD slope (Fig. a) indicate a statistically significant but noisy relationship between retrieved and observed slopes, with a positive bias for satellite retrievals and a regression slope substantially greater than unity. Most validation points are scattered in a cloud of data between 3.0 and 4.5 and do not exhibit much correlation, and there is a somewhat separate cluster of data centered about a slope of 5.25 in the satellite retrieval that have smaller corresponding in situ values of about 4.0. There is generally a clear tendency for points from more oligotrophic areas (as indicated by Chl color coding) to exhibit higher satellite values and more eutrophic areas to exhibit much lower satellite values. This tendency is weaker for the in situ observations, which tend to have a narrower range, mostly between 3.0 and 4.5. To first order, the satellite data are in the same range as in situ data, and the retrievals capture the in situ data trend; however, there is a pattern of having a bigger range of PSD slopes in satellite data than in the in situ match-ups, with the algorithm underestimating low values and overestimating high values.
Validation for the parameter (Fig. b) is statistically significant (somewhat higher than the regression) but also quite noisy. Strong clustering of the in situ observations around – is observed, and the majority of these observations are somewhat overestimated in the satellite retrievals, which cluster around . Notably, a separate cluster associated with low Chl (and low values of both satellite and in situ ) shows that the satellite retrievals substantially underestimate these field data. Since is the PSD scaling parameter, which generally controls absolute number, volume, and carbon concentration variability to first order, this has implications for the global pattern of phytoplankton carbon retrievals (Fig. ); namely it is consistent with underestimation in the oligotrophic gyres and overestimation in the eutrophic areas. Overall, both satellite and in situ data exhibit increasing values of with increasing Chl concentrations, as expected; i.e., more oligotrophic waters are associated with smaller overall particle number concentrations. Further discussion of the PSD validation by location of in situ data (Fig. S5a and b) is provided in Supplement Sect. S4 and illustrated in Supplement Fig. S6.
This pattern of under- and overestimation in the validation drives the slope of the validation regression to be much greater than unity and suggests an empirical tuning to absolute phytoplankton carbon estimates via a linear (in log10 space) tuning of , as done in TK16 , who based the tuning on the validation regression. A similar approach is proposed here, but it is derived differently. Details of the tuning derivation procedure are given in Supplement Sect. S5. The following global tuning equation was obtained: 7 where is the original (untuned) PSD parameter. This tuning changes retrievals in a similar fashion to the TK16 tuning and is consistent with a tuning suggested by the in situ validation presented here (Fig. b); namely, low satellite values are increased, and high values are decreased, narrowing the overall range of variability of retrieved and thus the range of the retrieved derived variables. This addresses the low bias in oligotrophic gyres and the high bias in eutrophic areas. The goal of the tuning is to get more realistic absolute retrievals of POC and Chl (hypothesizing that this should also lead to more realistic phyto C retrievals; however, see discussion about picophytoplankton validation in Sect. ).
The tuned parameter for May 2015 is mapped in Supplement Fig. S7a. The overall spatial pattern of higher values in more eutrophic areas is preserved, but the global range of values is reduced compared to the original formulation, increasing in the gyres and decreasing it in more productive areas. The resulting multiplicative factor to be applied in linear space to phyto C, POC, and Chl values is mapped in log10 space in Supplement Fig. S7b. Values in oligotrophic areas are generally greater than unity in linear space (mostly between 1 and 10), indicating that the tuning increases phyto C, POC, and Chl in these areas, up to about an order of magnitude (in limited areas mostly in the South Pacific Gyre) and more moderately elsewhere in the tropical and subtropical oligotrophic oceans. The equatorial upwelling areas and other transitional zones are not changed, and high-latitude oceans exhibit correction factors mostly less than unity in linear space (mostly between 0.1 and 1), which decreases phyto C and Chl up to an order of magnitude (rare, mostly less). This tuning is not applied to figures previously discussed here.
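The effect of a tuning that is linear in log10 space can be sketched in Python as follows. The intercept and slope used here are hypothetical placeholders, not the coefficients of the paper's tuning equation, and the function names are illustrative. The sketch only shows how such a tuning becomes a multiplicative factor in linear space and why a slope below 1 narrows the range of the scaling parameter.

```python
import math


def tune_log10(log10_n0: float, intercept: float, slope: float) -> float:
    """Linear tuning applied to log10 of the PSD scaling parameter."""
    return intercept + slope * log10_n0


def tuning_factor(n0: float, intercept: float, slope: float) -> float:
    """Multiplicative factor (linear space) implied by the tuning.

    Products that scale linearly with the scaling parameter would be
    multiplied by this factor.
    """
    log10_n0 = math.log10(n0)
    return 10.0 ** (tune_log10(log10_n0, intercept, slope) - log10_n0)


def pivot_log10(intercept: float, slope: float) -> float:
    """log10 value left unchanged by the tuning (factor equal to 1)."""
    return intercept / (1.0 - slope)


# Hypothetical coefficients, for illustration only.
A, B = 2.0, 0.5
for n0 in (1e2, 1e4, 1e6):
    print(f"N0 = {n0:.0e}: factor = {tuning_factor(n0, A, B):.2f}")
print("unchanged at log10(N0) =", pivot_log10(A, B))
```

With a slope between 0 and 1, values below the pivot are increased and values above it are decreased, which matches the described behavior of factors above unity in the gyres and below unity at high latitudes.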
3.3.1 Comparison of the PSD-based phytoplankton carbon retrieval with existing satellite algorithms
In this section, we compare PSD-based phyto C retrievals presented here with two existing methods for its retrieval. The May 2015 original total phyto C retrieval is compared with the tuned total phyto C and the retrievals of the absorption- and PSD-based algorithm of and with the algorithm in Supplement Fig. S8. The histograms of these four images are compared in Fig. . The tuned retrievals are similar to those of , whereas the original retrievals are similar to those of , and the latter two have wider ranges globally compared to the former two. Of these algorithms, the simplest is that of , as it is a direct scaling of , and it is based on in situ chemical analytical measurements of phyto C . These dichotomous inter-comparison results suggest that further algorithm inter-comparison and validation with direct in situ measurements of phyto C are needed to guide future algorithm developments; however these data are relatively novel and scarce globally. Validation results using in situ POC and picophytoplankton carbon (discussed next) exhibit a similar dichotomy.
Figure 9
Histograms of the images of Supplement Fig. S8, including the original PSD-based phyto C retrieval (Fig. a). Histogram counts are given on a log10 scale on the axis, and the variable ( axis) is log-transformed as well. All four histograms are derived from the sinusoidal projection images for May 2015, using monthly OC-CCI v5.0 data.
[Figure omitted. See PDF]
3.3.2 Validation using POC and picophytoplankton carbon in situ data
PSD-based estimates of POC are validated against in situ measurements for the original algorithm (Fig. a) and the tuned algorithm (Fig. c). Both regressions have satisfactory values and also illustrate that in general higher POC values are associated with higher Chl (colormap). Notably, the original algorithm validation has a slope of 2 and exhibits substantial underestimates at low POC and overestimates at high POC. As intended, the tuning corrects this range exaggeration and significantly improves the slope, intercept, bias, RMS, and MAE. The regression with the tuning applied should not be considered a truly independent validation because the algorithm has been empirically tuned to retrieve POC well; however, the tuning was done with global POC imagery (using monthly images for 2004 and 2015) that uses the empirical POC algorithm, not with these in situ POC data directly.
In addition to the validation with in situ POC, we performed a comparison of the matched satellite Chl and the corresponding PSD-based Chl estimate (Eq. ) for the original (Fig. b) and the tuned algorithm (Fig. d). Both comparisons exhibit very high values, and similarly to POC, the original algorithm underestimated Chl at low values and overestimated Chl at high values. The tuning successfully addresses this, leading to excellent overall comparison of the tuned algorithm, with a slope near 1.0 and a low intercept. However, for the lowest Chl values (Chl ), performance deteriorates. The tuned comparison is not a fully independent validation, as the algorithm was tuned to compare well with OC-CCI v5.0 satellite retrievals (using global monthly images for 2004 and 2015). Overall, the comparison with Chl is encouraging, indicating that the model is able to reasonably reproduce (with tuning) OC-CCI v5.0 standard satellite Chl values at the match-up points.
Figure 10
Validation of PSD-based POC retrievals vs. in situ POC measurements (a, c) and comparison of satellite retrievals of Chl using the standard OC-CCI v5.0 algorithm vs. Chl estimated from the PSD (b, d). Panels (a) and (b) have no tuning applied, whereas empirical tuning is applied to the parameter (Eq. ) for panels (c) and (d). The tuning procedure applies a linear correction to in log10 space to ensure reasonable retrievals of POC and Chl, based on monthly satellite images from 2004 and 2015; for details, see Supplement Sect. S5. The data points in panels (a) and (c) are colored by their matched standard OC-CCI v5.0 satellite Chl values (colormap, in milligrams per cubic meter in log10 space).
[Figure omitted. See PDF]
Validation against in situ picophytoplankton carbon is presented in Fig. a (with no tuning applied) and in Fig. c with the tuning applied. The corresponding Chl comparisons between matched standard OC-CCI v5.0 Chl and Chl derived via the PSD model are shown in Fig. b and d. As with the POC match-ups (Fig. b and d), comparisons with Chl are better for the tuned version of the algorithm, indicating that the tuning is needed to reproduce more realistic Chl values globally. However, the tuning does not lead to any improvement in the validation results of picophytoplankton C (cf. Fig. a and c). The validation regression without tuning is statistically significant (), albeit noisy (low ); satellite retrievals and in situ data cover approximately the same ranges, and increasing Chl and in situ picophytoplankton C generally correspond to increasing satellite values as well, with some tendency for under- and overestimation as with the other variables. However, the tuned satellite retrievals have a very narrow range that does not cover the range of the in situ data, and validation statistics are generally worse than those of the original validation (the regression is not significant at the level). These validation results are generally consistent with the results of , where the tuned version of the TK16 algorithm was used.
Figure 11
Validation of picophytoplankton carbon derived from the PSD model using daily OC-CCI v5.0 satellite data vs. in situ measurements, as used in the POCO project (a, c). Comparison of PSD-derived satellite Chl ( axes) with the matched satellite retrieval of Chl using the standard OC-CCI v5.0 algorithm at the locations of the in situ picophytoplankton carbon match-up points ( axes) (b, d). Panels (a) and (b) have no tuning applied, whereas empirical tuning is applied to the parameter (Eq. ) for panels (c) and (d). The tuning procedure applies a linear correction to in log10 space to ensure reasonable retrievals of POC and Chl, based on monthly satellite images from 2004 and 2015; for details, see Supplement Sect. S5. The data points in panels (a) and (c) are colored by their matched standard OC-CCI v5.0 satellite Chl values (colormap, in milligrams per cubic meter in log10 space).
[Figure omitted. See PDF]
3.4 Further discussion, summary, and conclusions
The novel PSD–phyto C algorithm described here represents a major overhaul of the KSM09 algorithm (a comparison between KSM09 and the present algorithm is briefly discussed in Supplement Sect. S6). Unlike KSM09, two distinct particle populations are used: phytoplankton and NAPs. Phytoplankton backscattering is modeled using coated-sphere Mie calculations with inputs based on the Equivalent Algal Populations (EAP) approach . This model formulation allows assessment of the percent contribution of phytoplankton and NAPs to total as well as the estimation of Chl from the retrieved PSD. Underlying forward modeling is hyperspectral, facilitating adaptation of the algorithm to upcoming hyperspectral sensors like PACE . PSD retrieval is achieved via spectral angle mapping (SAM), and no spectral shape is imposed on ; operational end-members for current and past multispectral sensors and the OC-CCI v5.0 merged ocean color data set are created via band-averaging from the underlying hyperspectral modeled .
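The spectral angle mapping step can be illustrated with a generic Python sketch: the observed backscattering spectrum is compared with each end-member spectrum by the angle between them as vectors, and the end-member with the smallest angle is selected. The band set, the toy end-member spectra (simple power laws used only as stand-ins for the Mie-modeled end-members), and the slope labels attached to them are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Illustrative band centres (nm); not necessarily those used operationally.
BANDS_NM: List[float] = [443.0, 490.0, 510.0, 560.0]


def toy_spectrum(eta: float) -> List[float]:
    """Placeholder end-member: a power-law spectrum normalised at 443 nm."""
    return [(w / 443.0) ** (-eta) for w in BANDS_NM]


def spectral_angle(a: Sequence[float], b: Sequence[float]) -> float:
    """Angle (radians) between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # guard rounding
    return math.acos(cosine)


def retrieve_slope(observed: Sequence[float],
                   end_members: Dict[float, Sequence[float]]) -> float:
    """Return the label of the end-member with the smallest spectral angle."""
    return min(end_members, key=lambda k: spectral_angle(observed, end_members[k]))


# Hypothetical mapping from PSD slope label to toy end-member spectrum.
END_MEMBERS: Dict[float, List[float]] = {
    3.0: toy_spectrum(0.5),
    4.0: toy_spectrum(1.5),
    5.0: toy_spectrum(2.5),
}

# A spectrum with the shape of the 4.0 end-member but three times the magnitude.
observed = [3.0 * v for v in END_MEMBERS[4.0]]
print(retrieve_slope(observed, END_MEMBERS))
```

Because the angle depends only on spectral shape and not on magnitude, this kind of matching separates the slope retrieval from the retrieval of the scaling parameter.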
The algorithm has been used to create an accompanying data set based on the OC-CCI v5.0 data set (; see “Data availability”). We emphasize that the PSD parameters and derived retrievals presented here and in the accompanying data set are an experimental research satellite product with relatively large uncertainties. We do not claim that it is akin in validity and accuracy to the more established (and much more empirical!) algorithms for canonical products such as Chl and POC. As emphasized elsewhere in this text, the goal is to build an operational algorithm based on first principles as much as feasible, even at the expense of accuracy, in order to push the boundaries of what is retrievable from space and move the science of bio-optical algorithm development forward. Potential users of these PSD and derived data need to be aware of their limitations, uncertainties, and validation status before using them, for example, in building or validating/constraining biogeochemical models. The choice of IOP algorithm to retrieve is key for the PSD–phyto C algorithm, as the spectral shape of is what the PSD slope retrieval is based upon (Eq. ). The IOP algorithm is chosen here, as in KSM09, because it allows spectral retrievals that are not constrained by a specific spectral function or parameterization of as is done, for example, in the QAA (quasi-analytical algorithm; ) and the GSM (Garver–Siegel–Maritorena) algorithm . For the wavelengths used in the PSD slope retrieval, modeled and satellite-derived spectral shapes compare well when the algorithm is used, and global patterns of the retrieved PSD parameters appear reasonable. Preliminary tests with indicate that this algorithm is not as suitable for PSD retrieval in this regard. Use of , , and other IOP algorithms will be further investigated in future development of the PSD algorithm.
An important assumption of the model is that for NAPs is twice that for phytoplankton so that the phyto-C-to-POC ratio is a constant . This ratio is expected to vary in the real ocean, and the value used here is a reasonable average choice (e.g., ; ; ; and references therein). employed the cell sorting and chemical analysis methods of to measure phyto C in the equatorial Pacific and along the Atlantic Meridional Transect (AMT). Their results indicate that a phyto C : POC value of is reasonable, falling within their observed ranges; however, they do observe many higher values, particularly in the oligotrophic gyres. The ratio of phyto C to POC (applied to the May 2015 monthly image of the OC-CCI v5.0 data) indicates generally lower values of this ratio (with some high-latitude and coastal exceptions), and even lower values occur in the gyres, with values mostly below 0.1 in the low-latitude open ocean (data not shown, but consistent with work in progress by Shovonlal Roy). In light of this observation, note the difference between the and phyto C retrievals (Fig. and Supplement Fig. S8). Further direct analytical observations of phyto C and the reconciliation and better understanding of the spatiotemporal variability in the phyto-C-to-POC ratio should be a high priority in order to improve understanding of carbon pools and their relationships in the ocean and to retrieve phyto C reliably from space.
The relatively poor PSD parameter validation results should be interpreted with caution, as there are multiple reasons for discrepancies between the in situ and satellite data and for the observed poor regression statistics, and the in situ data have their own limitations. Importantly, the in situ data PSD parameters are fit over a much narrower diameter range than the size range optically contributing the bulk of (see, e.g., Supplement Fig. S4), at least according to the modeled spectra. It is recognized that in the real world the particle assemblage is very complex, and its sources of backscattering are still not fully resolved (e.g., ; ). In particular, the composition and PSD of small sub-micron particles appear to be of importance and are not well known; here we assume the same PSD and NAP composition across all size classes and globally. There is also a mismatch in temporal and spatial scales of sampling between the satellite and in situ data. For example, the matched in situ PSD data do not exhibit the same negative correlation between and that the satellite data do (Supplement Fig. S9). We note that this negative correlation in the satellite data has a theoretical underpinning because of what we know about global ocean ecosystems, namely that oligotrophic areas exhibit relative dominance of smaller phytoplankton (and smaller overall concentrations of particles/biomass), as opposed to increased importance of larger phytoplankton and increased biomass in more eutrophic areas. We thus expect backscattering in the ocean to become “bluer”, i.e., to have a steeper spectral slope, in oligotrophic areas. This is indeed observed in satellite data (e.g., ) and is the basis for our algorithm. Therefore, in the ocean, we expect to decrease with increasing globally and on average. This is not necessarily going to be captured by in situ data of limited spatiotemporal coverage and fit over a narrower size range.
We further note that the number of match-up points in the validation regression is different among PSD, POC, and picophytoplankton C, and their geographic distribution is different as well (Supplement Fig. S5). Namely, there are about an order of magnitude more POC match-ups than picophytoplankton carbon ones. Thus the different validation results presented here do not necessarily represent the same oceanographic conditions; e.g., the picophytoplankton C in situ data have less representation of eutrophic areas and span a smaller range of Chl than the POC validation, with very few points exceeding Chl 1.0 (cf. Figs. b and b).
The picophytoplankton C data in are derived from cell counts (abundance) converted to carbon using specific conversion factors for different species/groups. Namely, 60 fg C per cell was used for Prochlorococcus, 154 fg C per cell for Synechococcus, and 1319 fg C per cell for pico-eukaryotes. This differs from the PSD-based phyto C retrieval algorithm, in which the conversion is a continuous function of cell volume. For the allometric coefficients of used here, the equivalent conversion factor is 53 fg C per cell for cells of the smallest diameter within the picophytoplankton range (0.5 µm) and 1825 fg C per cell for the largest-diameter cells within that range (2.0 µm), indicating first-order consistency, but not full equivalency, with the methods of .
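The contrast between fixed per-group factors and a continuous allometric conversion can be illustrated with a short sketch. It assumes spherical cells and a power-law relationship between carbon per cell and cell volume, and it backs out the two power-law coefficients purely from the two end-point values quoted above (53 and 1825 per cell at 0.5 and 2.0 diameter, taken here to be fg C per cell and µm). The coefficients `a` and `b` are therefore implied by the text, not copied from the cited allometric source, and all names are hypothetical.

```python
import math

def sphere_volume_um3(diameter_um):
    """Volume of a sphere of the given diameter (assumes spherical cells)."""
    return math.pi / 6.0 * diameter_um ** 3

# End-point values quoted in the text (units assumed: um and fg C per cell).
d_small, c_small = 0.5, 53.0
d_large, c_large = 2.0, 1825.0

v_small = sphere_volume_um3(d_small)
v_large = sphere_volume_um3(d_large)

# Solve C = a * V**b through the two quoted points.
b = math.log(c_large / c_small) / math.log(v_large / v_small)
a = c_small / v_small ** b

def carbon_per_cell_fg(diameter_um):
    """Continuous allometric carbon per cell implied by the two end points."""
    return a * sphere_volume_um3(diameter_um) ** b

# Fixed per-group factors quoted in the text, for comparison.
fixed_factors = {
    "Prochlorococcus": 60.0,
    "Synechococcus": 154.0,
    "pico-eukaryotes": 1319.0,
}

print(f"implied exponent b = {b:.3f}, coefficient a = {a:.0f} fg C per um^3")
for d in (0.5, 1.0, 1.5, 2.0):
    print(f"D = {d:.1f} um -> {carbon_per_cell_fg(d):.0f} fg C per cell")
```

The fixed factors fall within the range spanned by the continuous conversion across the picophytoplankton diameters, which is the first-order consistency noted above.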
The global relationships of the PSD parameters, derived PSD-based phyto C, and POC versus Chl for the May 2015 monthly image are illustrated in Supplement Fig. S10. Globally, as expected, there is a strong correlation of Chl with these variables, with increasing Chl associated with decreasing PSD slope and increasing scaling parameter, phyto C, and POC. While the relationship is strong, there is significant spread in the PSD parameters and phyto C for a given Chl value, suggesting that there is added value in retrieving them separately and that they should not all be treated simply as correlates of Chl. We note that further investigation is needed to address retrieval uniqueness and degrees of freedom/independence issues, along with more comprehensive error propagation, since many ecosystem properties are indeed correlated with Chl, and all these retrievals come from the same multispectral data.
The power law (Eq. ) is a parameterization of real-world PSDs, and while there are theoretical underpinnings (e.g., ; ; ) and observations (e.g., ; ; see also references in ) supporting its applicability, particularly over large size ranges, real-world PSDs may deviate from the power law, especially in coastal zones (e.g., ; ; ; ). There is less information on living phytoplankton only and their specific PSDs because it has been historically difficult to separate living phytoplankton and measure, say, their PSD or carbon
The power law is not a converging PSD model; i.e., quantities integrated over it are sensitive to the chosen limits of integration (for a sensitivity analysis to the integration limits, see ). Gamma functions may be a better choice to represent marine PSDs . However, we choose to use the power law because of its theoretical underpinnings and because the goal is to build an operational algorithm (based on first principles as much as possible) for existing multispectral data with limited degrees of freedom. We additionally assume that the PSD slope is the same for phytoplankton and NAPs, limiting the number of parameters to be retrieved. Hyperspectral data and observations of phytoplankton- and NAP-specific PSDs and IOPs will be needed to relax these assumptions in the future. observed that the PSD slope steepened for small particles, deviating from a power law. This could partially explain the putative underestimates of the original algorithm in oligotrophic gyres. Moreover, the absolute number of particles retrieved is sensitive to uncertainties in the assumed real index of refraction. In this context, we note that the algorithm is able to pick up the concentration of particles, to first order, according to the validation (Fig. b). We find this to be impressive and consider it a success, given that the algorithm makes no a priori prescriptions about particle concentrations; they are solved for from the magnitude and shape of satellite-observed backscattering. While the goal here is to create a global algorithm which uses one set of end-members, we recognize that future implementations can be improved by assessing the impact of using regionally variable subsets of index-of-refraction distributions. The PSD parameterization and choices of Mie inputs, in particular complex indices of refraction, represent important sources of uncertainty and can also affect the need for tuning and the degree of suitability of estimating POC with our generic NAP population.
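The sensitivity of a power-law PSD to its integration limits can be sketched numerically. The example below assumes the common differential form N(D) = N0 (D/D0)^(-xi) and integrates it analytically to obtain a total number concentration; the symbols, the reference diameter of 2 µm, the slope value, and the limits are illustrative assumptions rather than the settings used in the algorithm.

```python
import math

def number_concentration(n0, xi, d_min, d_max, d0=2.0):
    """Analytic integral of n0 * (D / d0) ** (-xi) from d_min to d_max."""
    if math.isclose(xi, 1.0):
        # Special case: the antiderivative of 1/D is a logarithm.
        return n0 * d0 * math.log(d_max / d_min)
    p = 1.0 - xi
    return n0 * d0 ** xi * (d_max ** p - d_min ** p) / p

# Illustrative values only (assumed): unit scaling parameter, slope of 4.
n0, xi, d_max = 1.0, 4.0, 50.0
totals = {d_min: number_concentration(n0, xi, d_min, d_max)
          for d_min in (0.5, 0.2, 0.05)}

for d_min, total in totals.items():
    print(f"d_min = {d_min:4.2f} um -> integrated number = {total:.3g}")
```

For a slope of 4, lowering the lower limit by a factor of 10 raises the integrated number by roughly a factor of 1000, because the integral scales with the lower limit raised to the power (1 - xi). This is why integrated quantities depend strongly on where the small-particle cutoff is placed.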
Further analysis of algorithm performance, and further improvements, need to focus on the index-of-refraction choices for the particle populations. For further discussion of algorithm uncertainties, see , , and .
observe a relationship between phyto C and backscattering that is stronger than that for other proxies. This is encouraging for the use of backscattering as a proxy for phytoplankton carbon biomass. However, the link between the PSD and the backscattering spectral shape is a second-order effect that is not easily observed in in situ observations , even though theoretical modeling demonstrates a clear link (; this study). discuss some reasons why it may be difficult to observe this relationship in current in situ data, e.g., the fact that the PSD is fit over a narrow range of diameters compared to the size range theoretically affecting backscattering. Nevertheless, these considerations, together with the overall performance of the KSM09 homogeneous algorithm as compared to the algorithm presented here, lead to the conclusion that four primary directions should be priorities moving forward. First, investigate the effect of choices of index-of-refraction distributions, as discussed above. Second, rather than relying only on backscattering for PSD and phyto C retrieval, a blended approach should be developed that also uses absorption; i.e., the approach here should be combined with that of . Third, investigate the ability of hyperspectral data to provide more degrees of freedom for retrieval of more variables simultaneously, allowing relaxation of some key assumptions and perhaps a third particle population to represent POC and mineral particles separately; this is important in light of the upcoming PACE mission . Hyperspectral absorption data in particular have the potential to increase information content and allow group-specific retrievals (e.g., ; but see also ). Finally, collect more global, comprehensive in situ data sets of all relevant variables, including and especially phyto C , for further model development and validation. With regard to the latter, agencies and investigators should focus on building quality-controlled, one-stop-shop data sets.
Appendix A Details on the OC-CCI v5.0 data set
Processing and analysis were done using the sinusoidal projection of OC-CCI v5.0. For user convenience, once the final products were generated, they were re-projected to equidistant cylindrical projection (unprojected latitude/longitude) before publication in the data repository linked above (“Data availability”). The empirical tuning (Sect. ) is not applied to the variables in the published data set (“Data availability”). Instead, the spatially explicit linear-space multiplicative tuning factor (Supplement Fig. S7b) is given. The choice to provide an optional tuning to be applied at the user's discretion is dictated by the validation and comparison results discussed in the paper. Monthly and overall climatologies with propagated uncertainties are also provided, and for these climatologies, both tuned and original variables are included.
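For users who choose to apply the optional tuning, the operation is a per-pixel multiplication in linear space. The following is a minimal sketch under stated assumptions: the function names, the use of `None` for missing pixels, and the sample numbers are hypothetical, and the sketch does not reflect the actual variable names or storage conventions of the published files.

```python
import math

def apply_tuning(value, factor):
    """Apply a linear-space multiplicative tuning factor to a linear-space value.
    None marks a missing pixel and is propagated unchanged."""
    if value is None or factor is None:
        return None
    return value * factor

def apply_tuning_log10(log10_value, factor):
    """Same tuning for a value held in log10 space: multiplying in linear
    space is equivalent to adding log10(factor) in log space."""
    if log10_value is None or factor is None:
        return None
    return log10_value + math.log10(factor)

# Made-up untuned values and co-located tuning factors (illustration only).
untuned = [12.0, None, 30.0]
factors = [1.5, 0.8, 0.5]

tuned = [apply_tuning(v, f) for v, f in zip(untuned, factors)]
print(tuned)
```

If a variable is being handled in log10 space, the factor should be added as log10(factor) rather than multiplied, as in `apply_tuning_log10`.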
Code and data availability
Code and data associated with algorithm development as well as operational application to OC-CCI v5.0 data are published on the Zenodo® repository and are available at the following DOI: 10.5281/zenodo.6354654.
An OC-CCI v5.0-based satellite PSD–phyto C data set (monthly, 1997–2020, plus monthly and overall climatologies) has been published on the PANGAEA® repository and is freely available in netCDF format, with browse images, at the following DOI: 10.1594/PANGAEA.939863.
The supplement related to this article is available online at:
Author contributions
TSK designed the study, conducted the modeling and algorithm development and data analyses, and wrote the manuscript. SB and LRL provided the EAP model code and technical support for EAP modeling. SM helped with error propagation estimates and designed the band-shifting methodology. CEK, BJ, VMV, and SS extracted match-ups and/or provided match-up in situ and satellite data set compilations. XZ provided the coated-sphere code and technical support for it, as well as validation PSD data. HL and DSFJ provided technical assistance with IOP code testing. EK tested backscattering spectral shape sensitivity. SR provided algorithm output data. SB, LRL, XZ, SM, EK, SR, CEK, BJ, SS, HL, and DSFJ read the manuscript and provided comments/edits.
Competing interests
The contact author has declared that none of the authors has any competing interests.
Disclaimer
The views and opinions expressed here are those of the authors and do not necessarily express those of NASA. Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements
Funding for this project was provided by NASA grant no. 80NSSC19K0297 to Tihomir S. Kostadinov, Irina Marinov, and Stéphane Maritorena. Tihomir S. Kostadinov also acknowledges support from California State University San Marcos. This work is a contribution to the Simons Foundation project Computational Biogeochemical Modeling of Marine Ecosystems (CBIOMES, 549947, Shubha Sathyendranath). This paper is also a contribution to the ESA projects Ocean Colour Climate Change Initiative (OC-CCI) and the Biological Pump and Carbon Exchange Processes (BICEP). Additional support from the National Centre for Earth Observations (UK) is also gratefully acknowledged. Shubha Sathyendranath also acknowledges additional support from the UK's National Centre for Earth Observations. NASA grants 80NSSC17K0568 and NNX15AE67G are acknowledged for support for EXPORTS and NAAMES LISST data collection.
We acknowledge Olaf Hansen, Harish Vedantham, Marco Bellacicco, Salvatore Marullo, Irina Marinov, Ivona Cetinic, Giorgio Dall'Olmo, Emmanuel Boss, Nils Haëntjens, and David Desailly for various help/useful discussions. We acknowledge the ESA OC-CCI and BICEP project teams and contributors. We acknowledge in situ PSD validation data contributors as follows: those given in ; David Siegel and the UCSB ERI/Plumes and Blooms project team; Emmanuel Boss, Nils Haëntjens, and the NASA EXPORTS and NAAMES cruise teams; and Giorgio Dall’Olmo, Emmanuele Organelli , and the AMT26 cruise team. We acknowledge all in situ data contributors to the BICEP/POCO project compilations of POC and picophytoplankton carbon data sets. The OC-CCI reference is , and the v5.0 specific reference is . The modeling and processing are done using the sinusoidal projection (one of the projections provided by OC-CCI), whereas maps here are presented in equidistant cylindrical projection (unprojected lat/long). Erik Fields, the ESA, BEAM (Brockmann Consult GmbH), and NASA are acknowledged for the re-projection algorithm.
Coastlines in maps shown here are from v2.3.7 of the GSHHS data set; see . The NOAA ETOPO1 data set (
The cividis colormap used in most visualizations is a variant of the viridis colormap optimized for color vision deficiency perception and is from and the function to implement it in MATLAB® is due to Ed Hawkins.
We acknowledge Emmanuel Boss and one anonymous reviewer for their very useful comments, which helped improve the manuscript.
Financial support
This research has been supported by the National Aeronautics and Space Administration (grant nos. 80NSSC19K0297, 80NSSC17K0568, and NNX15AE67G) and the Simons Foundation (grant no. 549947).
Review statement
This paper was edited by Aida Alvera-Azcárate and reviewed by Emmanuel Boss and one anonymous referee.
© 2023. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The particle size distribution (PSD) of suspended particles in near-surface seawater is a key property linking biogeochemical and ecosystem characteristics with optical properties that affect ocean color remote sensing. Phytoplankton size affects their physiological characteristics and ecosystem and biogeochemical roles, e.g., in the biological carbon pump, which has an important role in the global carbon cycle and thus climate. It is thus important to develop capabilities for measurement and predictive understanding of the structure and function of oceanic ecosystems, including the PSD, phytoplankton size classes (PSCs), and phytoplankton functional types (PFTs). Here, we present an ocean color satellite algorithm for the retrieval of the parameters of an assumed power-law PSD. The forward optical model considers two distinct particle populations: phytoplankton and non-algal particles (NAPs). Phytoplankton are modeled as coated spheres following the Equivalent Algal Populations (EAP) framework, and NAPs are modeled as homogeneous spheres. The forward model uses Mie and Aden–Kerker scattering computations, for homogeneous and coated spheres, respectively, to model the total particulate spectral backscattering coefficient as the sum of phytoplankton and NAP backscattering. The PSD retrieval is achieved via spectral angle mapping (SAM), which uses backscattering end-members created by the forward model. The PSD is used to retrieve size-partitioned absolute and fractional phytoplankton carbon concentrations (i.e., carbon-based PSCs), as well as particulate organic carbon (POC), using allometric coefficients. This model formulation also allows the estimation of chlorophyll
1 Department of Liberal Studies, California State University San Marcos, 333 S. Twin Oaks Valley Rd., San Marcos, CA 92096, USA
2 Earth Observation, Smart Places, CSIR 7700, Cape Town, South Africa
3 Plymouth Marine Laboratory, Prospect Place, Plymouth, Devon, PL1 3DH, UK
4 Division of Marine Science, School of Ocean Science and Engineering, The University of Southern Mississippi, Stennis Space Center, MS 39529, USA
5 Earth Research Institute, University of California at Santa Barbara, Santa Barbara, CA 93106-3060, USA
6 SANSA, Enterprise Building, Mark Shuttleworth Street, Innovation Hub, Pretoria 0087, South Africa
7 Univ. Littoral Côte d'Opale, CNRS, Univ. Lille, IRD, UMR 8187 – LOG – Laboratoire d'Océanologie et de Géosciences, 62930 Wimereux, France
8 Department of Earth and Environmental Science, Hayden Hall, University of Pennsylvania, 240 South 33rd St., Philadelphia, PA 19104, USA
9 Department of Geography and Environmental Science, University of Reading, Reading, RG6 6DW, UK
10 National Centre for Earth Observation, Plymouth Marine Laboratory, Prospect Place, Plymouth, Devon, PL1 3DH, UK