1 Introduction
The capability to quantify atmospheric aerosols from spaceborne measurements arguably goes back to 1972, with the launch of the Multispectral Scanner System (MSS) aboard the first Landsat satellite.
At present there are several dozen sensors of various types in flight that are suitable for the quantification of aerosols, and more besides have begun and ended operations in the intervening years. In addition to the variety of instruments, a variety of algorithms have been developed to retrieve aerosol properties from these measurements.
Table 1
Satellite instruments which have been used for column AOD retrieval; arranged by sensor type.
Acronym | Instrument full name | Orbit(s) | Operation period(s) |
---|---|---|---|
Multispectral imager | |||
ABI | Advanced Baseline Imager | Geostationary | 2016 |
AHI | Advanced Himawari Imager | Geostationary | 2014 |
AVHRR | Advanced Very High Resolution Radiometer | Sun-synchronous | 1978 |
CAI | Cloud–Aerosol Imager | Sun-synchronous | 2009 |
EPIC | Earth Polychromatic Imaging Camera | Lagrange point | 2015 |
(E)TM | (Enhanced) Thematic Mapper | Sun-synchronous | 1982 |
GOES Imager | Geostationary Operational Environmental Satellite Imager | Geostationary | 1978–2018 |
GOCI | Geostationary Ocean Color Imager | Geostationary | 2010 |
GLI | GLobal Imager | Sun-synchronous | 2002–2003 |
MERIS | MEdium Resolution Imaging Spectrometer | Sun-synchronous | 2002–2012 |
MODIS | MODerate resolution Imaging Spectroradiometer | Sun-synchronous | 2000 |
MSS | Multispectral Scanner System | Sun-synchronous | 1972–2013 |
OLCI | Ocean and Land Color Instrument | Sun-synchronous | 2016 |
OLI | Operational Land Imager | Sun-synchronous | 2013 |
SeaWiFS | Sea-viewing Wide Field-of-view Sensor | Sun-synchronous | 1997–2010 |
SEVIRI | Spinning Enhanced Visible and InfraRed Imager | Geostationary | 2004 |
VIIRS | Visible Infrared Imaging Radiometer Suite | Sun-synchronous | 2012 |
VIRS | Visible and Infrared Scanner | Precessing | 1997–2015 |
Multispectral, multiangle imager or polarimeter | |||
(A)ATSR | (Advanced) Along-Track Scanning Radiometer | Sun-synchronous | 1991–2012 |
CHRIS | Compact High Resolution Imaging Spectrometer | Sun-synchronous | 2001 |
MISR | Multiangle Imaging SpectroRadiometer | Sun-synchronous | 2000 |
POLDER | POLarization and Directionality of the Earth's Reflectances | Sun-synchronous | 1996–1997; 2002; 2004–2013 |
SGLI | Second-generation GLobal Imager | Sun-synchronous | 2017 |
SLSTR | Sea and Land Surface Temperature Radiometer | Sun-synchronous | 2016 |
Nadir-looking spectrometer | |||
AIRS | Atmospheric Infra-Red Sounder | Sun-synchronous | 2002 |
GOME | Global Ozone Monitoring Experiment | Sun-synchronous | 1995–2011 |
IASI | Infrared Atmospheric Sounding Interferometer | Sun-synchronous | 2006 |
OMI | Ozone Monitoring Instrument | Sun-synchronous | 2004 |
OMPS NM | Ozone Mapping Profiler Suite Nadir Mapper | Sun-synchronous | 2012 |
SCIAMACHY | SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY | Sun-synchronous | 2002–2012 |
TOMS | Total Ozone Mapping Spectrometer | Sun-synchronous | 1978–1994; 1996–2005 |
TROPOMI | TROPOspheric Monitoring Instrument | Sun-synchronous | 2017 |
Table 2
As Table 1, except for satellite instruments which have been used for aerosol extinction profiling.
Acronym | Instrument full name | Orbit(s) | Operation period(s) |
---|---|---|---|
Lidar | |||
ALADIN | Atmospheric LAser Doppler INstrument | Sun-synchronous | 2018 |
CALIOP | Cloud–Aerosol LIdar with Orthogonal Polarization | Sun-synchronous | 2006 |
CATS | Cloud–Aerosol Transport System | Precessing | 2015–2017 |
GLAS | Geoscience Laser Altimeter System | Polar (varied) | 2003–2010 |
LITE | Lidar In-space Technology Experiment | Space shuttle | 1994 |
Limb or occultation profiler | |||
GOMOS | Global Ozone Monitoring by Occultation of Stars | Sun-synchronous | 2002–2012 |
MIPAS | Michelson Interferometer for Passive Atmospheric Sounding | Sun-synchronous | 2002–2012 |
OMPS LP | Ozone Mapping Profiler Suite Limb Profiler | Sun-synchronous | 2012 |
OSIRIS | Optical Spectrograph and InfraRed Imaging System | Sun-synchronous | 2001 |
SAGE | Stratospheric Aerosol and Gas Experiment | Precessing | 1979–1982; 1984 |
SAM | Stratospheric Aerosol Measurement | Precessing | 1975; 1979–1993 |
Retrieval algorithms are used to process the calibrated observations (referred to as level 1 or L1 data) to provide level 2 (L2) data products, consisting of geophysical quantities of interest. These L2 products are typically on the L1 satellite observation grid (or a multiple of it) and are often further aggregated to level 3 (L3) products on regular space–time grids. Further background and a discussion of satellite data processing levels are available in the literature. Table 3 provides acronyms and full names for some of the L2 processing algorithms which have been applied to L1 measurements from these instruments. Again, many of these algorithms have been applied (identically or with small modifications) to multiple sensors. This table is provided as a convenience to the reader to decode acronyms and decrease clutter in later tables and discussions; specific relevant details and references are provided later. Acronyms often summarize either the principle of the technique or the institution(s) which developed the algorithm. Some algorithms are not listed in this table as they do not have acronyms and are typically referred to by data producers or users by the sensor or mission name. Further, this is not an exhaustive list, as numerous other approaches have been proposed in the literature; the criteria for inclusion and broader discussion in this study are that data have been (1) processed and (2) made generally available for scientific use. Likewise, algorithms which provide aerosol properties as a by-product but not a focus (e.g. land–ocean surface atmospheric correction approaches) are not discussed, as often the aerosol components are less detailed and/or used as a sink for other error sources in the algorithm.
Table 3
Acronyms for some aerosol retrieval algorithms, data records, and/or institution names applied to one or more satellite instruments from Tables 1 and 2.
Acronym | Algorithm full name |
---|---|
AAC | Aerosols Above Clouds |
ADV | (A)ATSR Dual View |
AerGOM | Aerosol profile retrieval prototype for GOMOS |
ASV | (A)ATSR Single View |
BAR | Bayesian Aerosol Retrieval |
CISAR | Combined Inversion of Surface and AeRosol |
DB | Deep Blue |
DT | Dark Target |
EDR | Environmental Data Record |
ESA | European Space Agency |
GACP | Global Aerosol Climatology Project |
GRASP | Generalized Retrieval of Aerosol and Surface Properties |
IMARS | Infrared Mineral Aerosol Retrieval Scheme |
JAXA | Japan Aerospace eXploration Agency |
LDA | Land Daily Aerosol |
LMD | Laboratoire de Météorologie Dynamique |
MAIAC | Multi-Angle Implementation of Atmospheric Correction |
MAPIR | Mineral Aerosol Profiling from Infrared Radiances |
MODACA | MODIS Above-Cloud Aerosol |
NOAA | National Oceanic and Atmospheric Administration |
OMACA | OMI Above-Cloud Aerosols |
OMAERO | OMI Multi-wavelength AEROsol product |
OMAERUV | OMI AERosol UV product |
ORAC | Optimal Retrieval of Aerosols and Clouds |
PMAp | Polar Multi-sensor Aerosol product |
SOAR | Satellite Ocean Aerosol Retrieval |
SU | Swansea University |
SYNAER | SYNergetic AErosol Retrieval |
ULB | Université Libre de Bruxelles |
xBAER | eXtensible Bremen AErosol Retrieval |
L2 retrieval algorithm development is typically guided by information content studies, sensitivity analyses, and retrieval simulations to gauge which quantities a given sensor and algorithmic approach can retrieve and with what uncertainty.
Increases in the quality of instrumentation, retrieval algorithms, models, and computational power have prompted an increasing desire for the provision of pixel-level uncertainty estimates in L2 aerosol data products. This has been driven in part by data assimilation (DA) applications, which need a robust error model on data for ingestion into numerical models, often in near-real time. Diagnostic uncertainty estimates are less useful here since the true state is not known (only the retrieved state), so a prognostic (predictive) uncertainty model is needed instead. Early aerosol DA applications often treated diagnostic uncertainty estimates as prognostic ones.
Driven by these needs, many AOD data sets now provide prognostic uncertainty estimates; in some cases these additions have been developed to satisfy these user needs, while in others they have always been available as they are inherent to the retrieval technique. Unlike AOD validation, however, which has a fairly standard methodology, there is not yet a robust and well-used framework for evaluating these uncertainty estimates (sometimes called "validating the validation"). This study arose from discussions within the international AeroSat group of aerosol remote sensing researchers as a step toward remedying that gap. AeroSat is a grass-roots community which meets once a year, together with researchers involved in aerosol modelling (the AeroCom group) and measurement, to discuss and move toward solving common issues in the field of aerosol remote sensing. The purpose of this study is threefold:
- to briefly review the ways in which uncertainty information has been conveyed in satellite aerosol data products (Sect. 2);
- to provide a framework for the evaluation of pixel-level AOD uncertainty estimates in satellite remote sensing, which can be adopted as a complement to AOD validation exercises going forward, and to use this framework to assess AOD uncertainty estimates in several AOD retrieval products (Sect. 3); and
- to discuss the strengths and limitations of each of these approaches and suggest paths forward for improving the quality and use of L2 (pixel-level) uncertainty estimates in satellite aerosol remote sensing (Sects. 4 and 5).
2.1 Terminology
The International Organization for Standardization document often known as the GUM (Guide to the Expression of Uncertainty in Measurement) provides standardized terminology for discussing uncertainties. In the interests of standardization, and in line with other treatments of uncertainty and error in remote sensing, the following definitions are used here:
- A measurand is a quantity to be determined (measured), in the case of this study the AOD.
- A measurement is the application of a technique to quantify the measurand, in this case the application of L2 retrieval algorithms to L1 satellite observations.
- The measured value is the output of the measurement technique, i.e. here the result of the L2 retrieval algorithm, often referred to as the "retrieved AOD".
- The uncertainty is in the general sense an expression of the dispersion of the measurand. For most of the data sets discussed in this study it is presented as a 1 standard deviation (1σ) confidence interval around the retrieved value (which is defined as the standard uncertainty by the GUM). The true value of the measurand (AOD) is expected to lie within this confidence interval approximately 68 % of the time (corresponding to 1 standard deviation, colloquially 1σ), following Gaussian statistics.
- The error is the difference between the measured and true values of the measurand, i.e. here the difference between retrieved and true AOD. Following the GUM convention, a positive error means that the measured value minus the true value is positive (and vice versa).
The error can only be known when the true value of the measurand is also known, which is rare. This is the province of validation exercises: note that in the remote sensing community (and as adopted here), validation refers to a quality assessment of a data set, which is a different definition from that of the metrology community. While aerosols are not discussed explicitly in those treatments, the points raised are applicable to aerosol remote sensing as well.
For validation exercises AERONET AOD data are often taken as a reference truth because the uncertainty on AERONET AOD is small (of order 0.01) and well characterized compared to that of satellite retrievals.
In contrast to the error, the uncertainty can be estimated for each individual measured value (retrieval). The term "expected error" (EE) is often used in the aerosol remote sensing literature to denote a diagnostic envelope, derived from sensitivity analysis and/or validation, which is expected to contain a given fraction (often approximately 68 %, i.e. 1σ) of retrieval errors.
Table 4
AOD and extinction data sets providing prognostic uncertainty estimates, with a note on the basis of each estimate. Where applicable, algorithm names are given first with instrument names in parentheses. See Tables 1, 2, and 3 for acronyms.
Data set | Note |
---|---|
ADV/ASV (ATSR2, AATSR) | Jacobians at retrieval solution | |
AerGOM (GOMOS) | Maximum likelihood with smoothness constraints | |
BAR (MODIS) | Maximum likelihood, retrieves whole granule at once | |
CALIPSO | Propagation of contributions through lidar equation | |
CATS | Propagation of contributions through lidar equation | |
CISAR (CHRIS, SEVIRI) | Optimal estimation with smoothness constraints | |
DB AAC (MODIS, SeaWiFS, VIIRS) | Maximum likelihood | |
DB land (MODIS) | Empirical expression from AERONET validation results | |
GOCI | Empirical expression from AERONET validation results | |
GRASP (MERIS, POLDER) | Maximum likelihood with smoothness constraints | |
IMARS (IASI) | Propagated measurement and forward model terms | |
JAXA (AHI) | Optimal estimation | |
LDA (SEVIRI) | Optimal estimation | |
LMD (AIRS, IASI) | Parametric from sensitivity studies and validation | |
MAIAC (MODIS) | Propagated from uncertainty on surface reflectance | |
MAPIR (IASI) | Optimal estimation | |
MIPAS | Maximum likelihood with smoothness constraints | |
MISR dark water | Width of cost function distribution vs. AOD | |
MISR heterogeneous land | Standard deviation of well-fitting aerosol models | |
MODACA (MODIS) | Maximum likelihood | |
NOAA EDR (VIIRS) | Empirical expression from AERONET validation results | |
OMPS LP | Confidence envelope based on aerosol signal strength | |
ORAC (ATSR2, AATSR, SEVIRI, SLSTR) | Optimal estimation | |
PMAp | Standard deviation of aerosol models | |
OSIRIS | Optimal estimation | |
SAGE | Propagated measurement plus interfering species error | |
SU (ATSR2, AATSR, MERIS+AATSR) | Second derivative of error function | |
ULB (IASI) | Propagated measurement and forward model terms |
Examples of existing prognostic uncertainty estimates for AOD or aerosol extinction data sets are given in Table 4. These fall into two broad camps: formal error propagation techniques accounting for individual terms thought to be relevant to the overall error budget and more empirical methods. The term "error budget" (not defined in the GUM but in common colloquial use) here refers, dependent on context, to the overall collection of contributions to input or output uncertainty. Strictly, one might refer instead to "uncertainty budget" and "uncertainty propagation", but for reader ease the commonly used terms are adopted here.
2.2.1 Formal error propagation
The formal methods which have been applied to date are in general Bayesian approaches, which can be expressed in a common formalism and are often referred to as optimal estimation (OE). OE approaches provide the maximum a posteriori (MAP) solution to the retrieval problem: maximization of the conditional probability P(x | y, x_a) of the retrieved state vector x, where y and x_a represent the satellite measurements and any a priori information on x, respectively. The MAP solution is achieved by minimization of a cost function J, and the formalism allows for the calculation of various contributions to the total uncertainty on the retrieved state. OE accounts for uncertainty on the satellite measurements, retrieval forward model (e.g. atmospheric and surface structure assumptions, ancillary data), a priori information, and smoothness constraints (on e.g. spatial, temporal, or spectral variation of parameters). While notation differs between authors, the cost function can be written as
$J = \left[\mathbf{y} - F(\mathbf{x})\right]^{T} \mathbf{S}_{y}^{-1} \left[\mathbf{y} - F(\mathbf{x})\right] + \left(\mathbf{x} - \mathbf{x}_{a}\right)^{T} \mathbf{S}_{a}^{-1} \left(\mathbf{x} - \mathbf{x}_{a}\right) + \left(\mathbf{H}\mathbf{x}\right)^{T} \mathbf{S}_{s}^{-1} \left(\mathbf{H}\mathbf{x}\right) + \ldots, \quad (1)$

where S_y and S_a are covariance matrices: S_y describes the measurement and forward model uncertainty, S_a describes the a priori uncertainty, and F(x) denotes the forward-modelled measurements. The third term represents a generic smoothness constraint on the state vector (which might be spatial, temporal, spectral, or otherwise), where H is a block diagonal matrix and S_s its associated uncertainty; the ellipsis in Eq. (1) indicates the potential for the expansion of J to include additional smoothness terms. These smoothness constraints were first introduced in the context of aerosol remote sensing for AERONET sky-scan inversions. In recent years they have become more widespread in satellite aerosol remote sensing as more capable sensors (e.g. POLDER) and/or algorithms with increased (spatiotemporal, spectral, or directional) dimensionality of measured or retrieved quantities have been developed. Candidate algorithms for aerosol retrieval from information-rich future sensors also tend to use smoothness constraints.
Note that here S_y represents the total of measurement uncertainty, forward model uncertainty (due to approximations made in the radiative transfer), and the contribution of uncertainties in forward model parameters to the simulated signal at the top of the atmosphere (TOA). These model parameters are factors which affect the TOA signal, but typically too weakly to be retrieved themselves. For example, many AOD retrieval algorithms ingest meteorological reanalysis to correct for the impact of absorbing trace gases (such as ozone) on the satellite signal at TOA and to provide wind speed to calculate glint and whitecap contributions to sea surface reflectance. Sometimes these contributions are represented instead by a "model parameter error" covariance matrix, denoted e.g. S_b, and similar squared deviations; mathematically, since the terms in Eq. (1) are additive, the two formalisms are equivalent if the model parameter uncertainty is transformed into measurement space and included in S_y (as is typically the case).
As S_y, S_a, and S_s are square matrices, correlations between wavelengths or parameters can (and, where practical, should) be accounted for. These terms often affect several satellite bands, such that an error in e.g. reanalysis data ingested as part of an AOD retrieval would manifest in a correlated way between those bands. However, due to the difficulty in estimating these off-diagonal elements, in practice they are frequently neglected and the covariance matrices are often assumed to be diagonal (which does not, however, mean that the retrieved state covariance is diagonal). Dependent on the magnitude and sign of these correlations, their neglect can lead to overestimates or underestimates in the level of confidence in the solution. When the cost function has been minimized, the uncertainty on the retrieved state is given by

$\hat{\mathbf{S}}_{x} = \left( \mathbf{K}^{T} \mathbf{S}_{y}^{-1} \mathbf{K} + \mathbf{S}_{a}^{-1} + \mathbf{H}^{T} \mathbf{S}_{s}^{-1} \mathbf{H} \right)^{-1}, \quad (2)$

where K = ∂F/∂x, known as the weighting function or Jacobian matrix, is the sensitivity of the forward model to the state vector x, typically calculated numerically. The 1σ uncertainty on the retrieved AOD is then the square root of the relevant element on the diagonal of Ŝ_x (dependent on the contents of the state vector). Many current approaches in Table 4 omit a priori and/or smoothness constraints, in which case the corresponding terms in Eqs. (1) and (2) vanish. Only BAR and CISAR include both a priori and smoothness constraints. AerGOM, GRASP, and the MIPAS stratospheric aerosol data set use smoothness constraints without a priori constraints on the aerosol state. Others (LDA, JAXA AHI, MAPIR, ORAC) use a priori but no smoothness constraints. Smoothness constraints are attractive for algorithms such as the GRASP application to POLDER, which includes the retrieval of binned aerosol size distribution and spectral refractive index (which are expected to be smooth for physical reasons), as well as those (e.g. BAR, CISAR, GRASP) moving beyond the independent pixel approximation to take advantage of the fact that certain atmospheric and/or surface parameters can be expected to be spatially and/or temporally smooth on relevant scales.
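To make Eq. (2) concrete, the following is a minimal sketch in Python; the toy Jacobian, band count, and covariance values are illustrative assumptions, not any particular algorithm's implementation:

```python
import numpy as np

def posterior_covariance(K, S_y, S_a=None, H=None, S_s=None):
    """Posterior state covariance following Eq. (2).

    K   : (m, n) Jacobian of the forward model at the solution.
    S_y : (m, m) measurement plus forward model covariance.
    S_a : (n, n) a priori covariance, or None if no prior is used.
    H, S_s : smoothness operator and its covariance, or None if unused.
    """
    fisher = K.T @ np.linalg.inv(S_y) @ K
    if S_a is not None:
        fisher += np.linalg.inv(S_a)
    if H is not None and S_s is not None:
        fisher += H.T @ np.linalg.inv(S_s) @ H
    return np.linalg.inv(fisher)

# Toy example: 4 bands, state = [AOD, surface reflectance].
K = np.array([[0.8, 0.3],
              [0.6, 0.4],
              [0.4, 0.5],
              [0.2, 0.6]])
S_y = np.diag([0.01**2] * 4)        # assumed 0.01 reflectance noise per band
S_a = np.diag([2.0**2, 0.05**2])    # weak AOD prior, tighter surface prior
S_hat = posterior_covariance(K, S_y, S_a)
print("1-sigma AOD uncertainty:", np.sqrt(S_hat[0, 0]))
```

The 1σ AOD uncertainty is read off the square root of the corresponding diagonal element, exactly as described above.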
These smoothness and a priori constraints provide a regularization mechanism to suppress "noise-like" variations in the retrieved parameters when they are not well constrained by the measurements alone, although there is a danger that overly strong constraints can suppress real variability. As a result, a priori constraints on AOD itself are often intentionally weak compared to those on other retrieved parameters. Strictly, the MAP solution is a maximum likelihood estimate (MLE) only if the retrieval does not use a priori information, although it is often referred to as an MLE regardless.
The rest of the error propagation methods in Table 4, whether formulated as OE or not, essentially propagate only measurement (and sometimes forward model) uncertainty through to the retrieval solution via Jacobians. MAIAC is a special case here because, rather than using the measurement uncertainty directly, it propagates the uncertainty of surface reflectance in the 470 nm band, which is thought to be the leading contribution to the total error budget. It is important to note that the cost function and uncertainty estimate calculations in Eqs. (1) and (2) are conditional on several factors.
- The forward model must be appropriate to the problem at hand and capable of providing unbiased estimates of the observations. Typically, if the forward model is fundamentally incorrect, and/or any a priori constraints are strongly inappropriate, the retrieval will frequently fail to converge to a solution or will have an unexpectedly large cost J. For this reason, high cost values are often used in post-processing to remove problematic pixels (e.g. undetected cloud or snow) or candidate aerosol optical models from the provided data sets.
- The covariance matrices S_y, S_a, and S_s (on measurements, a priori, and smoothness) must be appropriate; if they are systematically too large or small, uncertainty estimates will likewise be too large or small. These can be tested, to an extent, by examining the distributions of residuals (on measurements and a priori) and the cost function and comparing to theoretical expectations.
- The forward model must be approximately linear with Gaussian errors near the solution. This assumption sometimes breaks down if the measurements are uninformative on a parameter and a priori constraints are weak or absent, and the resulting state uncertainty estimates will then be invalid. This can be tested by performing retrievals using simulated data, perturbing their inputs according to their assumed uncertainties, and assessing whether the dispersion in the results is consistent with the retrieval uncertainty estimates (a sketch of such a test follows this list).
- The retrieval must have converged to the neighbourhood of the correct solution (i.e. near the global, not a local, minimum of the cost function), which can be a problem if there are degenerate solutions. In practice algorithms try to use reasonable a priori constraints and first guesses and make a careful selection of which quantities to retrieve vs. which to assume. Note that the iterative method of convergence to the solution is not important in itself.
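As an illustration of the consistency check in the third point, the following hypothetical Monte Carlo test perturbs synthetic measurements for a trivial linear forward model and compares the dispersion of the retrieved states with the predicted 1σ uncertainty; all values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Trivial linear forward model y = K x + noise; retrieval by weighted least squares.
K = np.array([[0.8], [0.6], [0.4]])      # Jacobian: 3 bands, 1 state element (AOD)
sigma_y = 0.01                           # assumed measurement noise
x_true = np.array([0.3])                 # "true" AOD

S_y_inv = np.eye(3) / sigma_y**2
S_hat = np.linalg.inv(K.T @ S_y_inv @ K)  # Eq. (2) without prior or smoothness

retrievals = []
for _ in range(5000):
    y = K @ x_true + rng.normal(0.0, sigma_y, size=3)
    retrievals.append((S_hat @ K.T @ S_y_inv @ y)[0])

# For a consistent error budget these two numbers should agree closely.
print("dispersion of retrieved AOD:  ", np.std(retrievals))
print("predicted 1-sigma uncertainty:", np.sqrt(S_hat[0, 0]))
```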
A detailed further discussion of these conditions, from the perspective of temperature and trace gas retrievals (which share some conceptual challenges with aerosol remote sensing), is available in the retrieval theory literature.
2.2.2 Other approaches
A particular challenge for the formal error propagation techniques is the second point above: how to quantify the individual contributions to the error budget necessary to calculate the above covariance matrices? This difficulty has motivated some of the empirical approaches in Table 4.
One approach used the results of validation analyses against AERONET to construct an empirical relationship (discussed in more detail later) expressing the uncertainty in MODIS DB AOD retrievals as a function of various factors. This basic approach was later adopted for other data sets, including GOCI and NOAA VIIRS EDR aerosol retrievals. It has some similarity to diagnostic EE envelopes, although importantly these prognostic estimates are framed in terms of retrieved rather than reference AOD. An advantage of this method is that, if AERONET can be taken as a truth and collocation-related uncertainty is small, it empirically accounts for the important contributions to the overall error budget without having to know their individual magnitudes. However, there are some disadvantages: if validation data are sparse or do not cover a representative range of conditions, there is a danger of overfitting the expression, and for an ongoing data set there is no guarantee that past performance is indicative of future results as sensors age and the world changes. For a quantity without available representative validation data, the method cannot be performed. Further, programmatically, it requires processing data twice: once to perform the retrievals and do the validation analysis to derive the expression and a second time to add the resulting uncertainty estimates into the data files. The LMD IASI retrieval takes a similar parametric approach, although as validation data are sparse, the parametrization draws on the results from retrieval simulations as well.
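The following sketch illustrates the general idea of fitting such an empirical uncertainty expression; the synthetic matchups, the linear-in-AOD functional form, and the coefficient names are assumptions for illustration rather than any product's actual parametrization:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic validation matchups standing in for satellite-AERONET collocations.
n = 2000
tau_s = rng.gamma(2.0, 0.1, n)                 # retrieved AOD
true_sigma = 0.02 + 0.10 * tau_s               # hidden "true" uncertainty
tau_r = tau_s - rng.normal(0.0, true_sigma)    # reference (AERONET) AOD

# Fit |error| ~ c0 + c1 * tau_s by least squares. For Gaussian errors,
# E|error| = sqrt(2/pi) * sigma, so rescale the fit to a 1-sigma estimate.
A = np.column_stack([np.ones(n), tau_s])
coef, *_ = np.linalg.lstsq(A, np.abs(tau_s - tau_r), rcond=None)
c0, c1 = coef / np.sqrt(2.0 / np.pi)
print(f"estimated sigma(tau) ~ {c0:.3f} + {c1:.3f} * tau")   # ~0.02 + 0.10 tau
```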
The MISR algorithms use different approaches. Both the land and water AOD retrieval algorithms perform retrievals using each of 74 distinct aerosol optical models (known as "mixtures") and calculate a cost function for each. In earlier algorithm versions the uncertainty was taken as the standard deviation of AOD retrieved from mixtures which fit with a cost below some threshold. This is equivalent to assuming that aerosol optical models are the dominant source of uncertainty in the retrieval and that the 74 mixtures provide a representative sampling of microphysical and optical properties.
This approach was refined (for retrievals over water pixels) by considering the variation of retrieval cost with AOD for each model and transforming this to give a probability distribution of AOD, with the uncertainty taken as the width of this distribution. A similar approach has been proposed for the OMAERO retrieval, although it has not yet been implemented on a large scale. It has conceptual similarities with the propagation of measurement error in Eq. (2), except that the cost is calculated across the whole range of AOD state space rather than an envelope around the solution, and the results from multiple distinct retrievals (corresponding to the aerosol mixtures) are summed. These methods are, however, reliant on the set of available optical models being sufficient.
Table 5
AOD and extinction data sets providing sensitivity analyses and/or diagnostic uncertainty estimates, with a note on the basis of each. Where applicable, algorithm names are given first with instrument names in parentheses. See Tables 1, 2, and 3 for acronyms.
Data set | Note |
---|---|
ALADIN | Sensitivity analysis | |
DB land (AVHRR, SeaWiFS, VIIRS) | Envelope from sensitivity analysis and/or validation | |
DT land (MODIS) | Envelope from sensitivity analysis and/or validation | |
DT water (MODIS) | Envelope from sensitivity analysis and/or validation, asymmetric | |
GACP (AVHRR) | Sensitivity analysis, some AERONET validation | |
JAXA CAI | Sensitivity analysis, some AERONET validation | |
JAXA GLI | Sensitivity analysis | |
NOAA Enterprise (ABI, VIIRS) | Validation statistics stratified by AOD and surface type | |
NOAA ocean (AVHRR, VIRS) | Sensitivity analysis, some AERONET validation | |
OMACA | Sensitivity analysis, some airborne validation | |
OMAERO | Sensitivity analysis, validation over western Europe | |
OMAERUV | Envelope from sensitivity analysis and/or validation | |
JAXA SGLI | Sensitivity analysis | |
SOAR (AVHRR, SeaWiFS, VIIRS) | Envelope from sensitivity analysis and/or validation | |
SYNAER | Sensitivity analysis, some AERONET validation | |
TOMS | Envelope from sensitivity analysis and/or validation | |
xBAER (MERIS) | Sensitivity analysis, some AERONET validation |
Available AOD data sets which do not currently provide prognostic uncertainty estimates are listed in Table 5. In these cases, algorithm papers typically summarize the results of sensitivity analyses to provide a rationale for choices made in algorithm development and to provide a summary of expected performance. Sensitivity analyses often include similar aspects to those employed in error propagation approaches: namely, characterization of the expected effects of uncertainties in sensor calibration and forward model limitations (e.g. assumed aerosol optical models, surface reflectance) on the retrieval solution, singly or jointly. In most cases these are provided for a subset of geometries and atmosphere–surface conditions. Compared to formal error propagation, this has the advantage of being easier to communicate to a reader concerned about a particular assumption (provided the results of the sensitivity analysis are representative), but on the other hand the summary results are specific to only the simulations performed, and real-world uncertainties may be more complicated, particularly when multiple retrieval assumptions are confounded.
Sensitivity analyses are often complemented by dedicated validation papers which summarize the results of comparisons against AERONET, the Maritime Aerosol Network (MAN), or other ground-based networks.
2.4 Systematic and random contributions to uncertainty
Both the diagnostic and prognostic techniques typically (implicitly or explicitly) assume that the sensor and retrieval algorithm are unbiased and that the resulting uncertainty estimates are unbiased and symmetric. However, it is well known that many of the key factors governing retrieval errors are systematic globally (e.g. sensor calibration) or seasonally–regionally (e.g. aerosol optical model, surface reflection, cloud contamination) and that true random error (i.e. propagated noise) is often small. While these systematic factors may partially cancel each other out over large ensembles of data (drawn from e.g. different regions, seasons, or geometries), this is not a given.
Uncertainty propagation approaches such as OE can in principle account for systematic uncertainty sources, as they (and any spectral or parameter correlations) can be included in the required covariance matrices. This can produce estimates of total uncertainty which are reasonable for an individual retrieval, but the true (large-scale) error distributions would then not be symmetric, lessening their value. Likewise, systematically biased priors can lead to systematically biased retrievals. As a result, it would be desirable to remove systematic contributions to the retrieval system uncertainty as far as possible. In practice this is often done through validation exercises, whereby diagnostic comparisons can provide clues as to the source of biases, which are then (hopefully) lessened in the next version of the algorithm. Distributions of the residuals of predicted measurements at the retrieval solution can also be indicative of calibration and forward model biases at the wavelength in question.
A possible solution to this is to perform a vicarious calibration, calculating a correction factor to the sensor gain as a function of time and band by matching observed and modelled reflectances at sites where atmospheric and surface conditions are thought to be well known (e.g. thick anvil clouds, Sun glint, and AERONET sites). The derived correction factor then accounts for the systematic uncertainty on calibration and the radiative transfer forward model, although if this latter term is non-negligible then the vicariously calibrated gains will still be systematically biased (albeit less so for the application at hand). This has the advantage of transforming the calibration uncertainty from a systematic to a more random error source at the expense of creating dependence on the calibration source and radiative transfer model. There is therefore a danger in creating a circular dependence between the vicarious calibration and validation sources, as it can hinder understanding of the physics behind observed biases. Further, this has the side effect of potentially increasing the level of systematic error in other quantities, or in conditions significantly different from those found at the vicarious calibration location, if the forward model contribution to systematic uncertainty is significant. Vicarious calibration is common within the ocean colour community, in which retrieval algorithms are in some cases more empirical and amenable to tuning than physically driven aerosol retrieval algorithms. It has also been used for on-orbit calibration of instruments lacking on-board capabilities to track absolute calibration and degradation.
3 Statistical framework to evaluate pixel-level AOD uncertainty estimates
3.1 Background and methodology
The notation adopted herein is as follows. The AOD is denoted τ; unless specified otherwise, references to AOD indicate that at 550 nm. The reference (here AERONET) AOD is τ_r, and the satellite-retrieved AOD is τ_s. The estimated uncertainties on these are denoted σ_r and σ_s, respectively. If the reference AOD is assumed to be the truth, then the error on the satellite AOD is given by ε = τ_s − τ_r.
Figure 2
Scatter density joint histogram (on a logarithmic scale) of the simulated expected uncertainties and retrieval errors in Fig. 1b. The 1:1 line is shown in black. Bins containing no data are shown in white.
An important nuance which bears repeating is that the distributions of estimated uncertainty and actual error in Fig. 1 are quite different in shape. This is because the estimated uncertainty distribution is one of expectations (given the AOD distribution), while the distribution of errors is one of realizations of (draws from) those expectations. Recall again the distinction between the expectation of rolling an unbiased die (i.e. a result of 3.5) and the actual realization (result) of rolling a die (1, 2, 3, 4, 5, or 6). The latter distribution is broader. This illustrates why comparing errors and uncertainties on a one-to-one basis, or comparing distribution magnitudes, is not expected to yield agreement, and an evaluation of consistency requires a statistical approach. Figure 2 shows this more directly: there is little correspondence between the two on an individual basis.
When comparing satellite and reference data, the total expected discrepancy (ED) between the two for a single matchup should account for uncertainties on both the satellite and reference (here AERONET) data,

$\mathrm{ED} = \sqrt{\sigma_s^2 + \sigma_r^2}, \quad (3)$

adding in quadrature under the assumption that the uncertainties on satellite and AERONET AOD are independent of one another. One can then define a normalized error ν as the ratio of the actual error to the ED, i.e.

$\nu = \frac{\varepsilon}{\mathrm{ED}} = \frac{\tau_s - \tau_r}{\sqrt{\sigma_s^2 + \sigma_r^2}}. \quad (4)$
Figure 3
(a) PDF and (b) CDF of normalized error distributions drawn from the numerical simulations in Fig. 1; theoretical (grey shading) and simulation (red) results lie on top of one another. Note that the CDF is of absolute normalized error. Dashed lines indicate various well-known percentage points of Gaussian distributions.
In the ideal case σ_r ≪ σ_s, so the shape of the normalized error distribution is dominated by the uncertainties and errors on the satellite-retrieved AOD. If the uncertainties on satellite and reference AOD have been calculated appropriately and the sample is sufficiently large, then the normalized error ν should approximate a Gaussian distribution with mean 0 and variance 1. Thus, the distribution of ν can be checked in several ways against expected shapes for Gaussian distributions, for example via the probability distribution function (PDF) and cumulative distribution function (CDF) as shown in Fig. 3.
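The core of this check can be written compactly; in the following sketch, synthetic matchups (all values invented) stand in for real collocations, and the observed fractions are compared with the Gaussian expectations used throughout this section:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Synthetic matchups standing in for satellite/AERONET collocations.
n = 10000
sigma_r = np.full(n, 0.01)               # reference (AERONET) uncertainty
sigma_s = rng.uniform(0.02, 0.10, n)     # predicted retrieval uncertainty
tau_r = rng.gamma(2.0, 0.1, n)           # reference AOD
ed = np.sqrt(sigma_s**2 + sigma_r**2)    # Eq. (3): expected discrepancy
tau_s = tau_r + rng.normal(0.0, ed)      # errors drawn consistently with ED

nu = (tau_s - tau_r) / ed                # Eq. (4): normalized error

# For well-calibrated uncertainties, |nu| < 1 for ~68 % of cases and < 2 for ~95 %.
for k in (1, 2):
    print(f"|nu| < {k}: observed {np.mean(np.abs(nu) < k):.3f}, "
          f"expected {2.0 * norm.cdf(k) - 1.0:.3f}")
```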
The above distribution analyses are informative on the overall magnitude of retrieval errors compared to expectations (as well as, in the case of the PDF analysis, whether there is an overall bias on the retrieved AOD). However, alone they say little about the skill in assessing variations in uncertainty across the population. Taking things a step further, the data can be stratified in terms of ED and a quantile analysis performed to assess consistency with expectations. This is equivalent to taking a single location along the x axis in Fig. 2 and assessing the distribution of retrieval errors found for the points from that histogram. These, too, should follow Gaussian statistics.
Figure 4
Expected AOD discrepancy against percentiles of absolute AOD retrieval error. Symbols indicate binned results from the numerical simulation; within each bin, paler to darker tones indicate the 38th, 68th, and 95th percentiles (approximate 0.5σ, 1σ, and 2σ points) of absolute retrieval error. Dashed lines (0.5 ED, 1 ED, and 2 ED, respectively) show theoretical values for the percentiles of the same colour.
An example of this is shown in Fig. 4. The data are divided by expected discrepancy into 10 equally populated bins, and within each bin the 38th, 68th, and 95th percentiles (i.e. approximate 0.5σ, 1σ, and 2σ points, following Gaussian statistics) of absolute retrieval error are plotted. If the uncertainties are appropriate, these should lie along the 0.5 ED, 1 ED, and 2 ED lines. This analysis provides a way of checking the validity of the uncertainty estimates across the spectrum from low to high estimated uncertainties as opposed to population-average behaviour (i.e. do the distributions of retrieval error change in the expected way as the estimated uncertainty varies?). The 68th percentile is of the most direct interest as it corresponds most directly to the expectation of the retrieval error, but examining other percentiles provides a way to assess whether the distribution is broader or narrower than expected (due to, perhaps, the presence of more or fewer outliers than expected).
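Continuing with the synthetic arrays (ed, tau_s, tau_r) from the previous sketch, the binned quantile analysis of Fig. 4 reduces to a few lines; the bin count and percentile choices follow the description above:

```python
import numpy as np

def binned_percentile_check(ed, abs_err, n_bins=10):
    """Bin by expected discrepancy and compare error percentiles with the
    Gaussian expectations 0.5*ED, 1*ED, and 2*ED (cf. Fig. 4)."""
    order = np.argsort(ed)
    for chunk in np.array_split(order, n_bins):
        med_ed = np.median(ed[chunk])
        p38, p68, p95 = np.percentile(abs_err[chunk], [38, 68, 95])
        print(f"ED={med_ed:.3f}  p38/ED={p38 / med_ed:.2f}  "
              f"p68/ED={p68 / med_ed:.2f}  p95/ED={p95 / med_ed:.2f}")

# Ratios near 0.5, 1, and 2 indicate well-calibrated uncertainty estimates.
binned_percentile_check(ed, np.abs(tau_s - tau_r))
```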
The binned analysis is similar to the assessment of forecast calibration in meteorology. Note that in a forecast sense the term calibration refers to a comparison of forecast vs. observed frequencies or magnitudes, distinct from the common use of calibration in remote sensing to refer to radiometric accuracy. By further analogy to the forecast community, a calibration skill score can be defined to summarize how consistent the magnitudes of the predicted uncertainties are with the observed distribution of errors; such scores are reported for each site later (Table 8).
Figures 3 and 4 provide the basis for the framework proposed in this study. An earlier version of this method was designed during the development and assessment of prognostic uncertainty estimates for MODIS DB retrievals. It has been further advanced through discussions at annual AeroSat meetings. These ideas have since been applied in practice to NOAA VIIRS AOD data, to GOCI data, to retrievals of absorbing aerosols above clouds against airborne measurements, and to the latest MISR product over ocean. The idea of looking at normalized retrieval error distributions was also explored for AOD in evaluations of ESA Climate Change Initiative (CCI) aerosol products and in a more general sense (with cloud-top height as an example). Indeed, the method is not restricted to AOD, although AOD has the advantage of comparatively readily available, high-quality reference data from AERONET and other networks.
3.2 Practical application to satellite data products
3.2.1 AERONET data used and matchup criteria
Here, the reference AOD is provided using level 2.0 (cloud-screened and quality-assured) direct-Sun data from the latest AERONET version 3. As AERONET Sun photometers do not measure at 550 nm, the AOD is interpolated using a second-order polynomial fit to determine the coefficients a_0, a_1, and a_2 for each measurement,

$\ln \tau(\lambda) = a_0 + a_1 \ln\lambda + a_2 \left(\ln\lambda\right)^2, \quad (6)$

where λ is the wavelength. All available (typically four) AOD measurements in the 440–870 nm wavelength range are used in the fit, which is more robust to calibration problems in individual channels than a bispectral approach and accounts for spectral curvature in τ(λ). The uncertainty on mid-visible AOD is dominated by sensor calibration and is of order 0.01. The sampling cadence is typically once per 10 min in cloud-free, daytime conditions but is more frequent at some sites.
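A minimal sketch of this interpolation (Eq. 6) follows; the four channel AODs are hypothetical values for illustration:

```python
import numpy as np

def aod_at_550(wavelengths_nm, aods):
    """Quadratic fit of ln(AOD) vs ln(wavelength), evaluated at 550 nm (Eq. 6)."""
    x = np.log(np.asarray(wavelengths_nm, dtype=float))
    a2, a1, a0 = np.polyfit(x, np.log(np.asarray(aods, dtype=float)), 2)
    lx = np.log(550.0)
    return np.exp(a0 + a1 * lx + a2 * lx**2)

# Hypothetical direct-Sun AODs at the four standard channels used in the fit.
print(aod_at_550([440.0, 500.0, 675.0, 870.0], [0.32, 0.28, 0.18, 0.12]))
```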
Table 6
AERONET sites used and their categorization.
Site | Latitude (° N) | Longitude (° E) | Complexity |
---|---|---|---|
For land algorithm evaluation | | |
Avignon | 43.93 | 4.88 | Straightforward
Goddard Space Flight Center (GSFC) | 38.99 | -76.84 | Straightforward
Palencia | 41.99 | -4.52 | Straightforward
Ilorin | 8.48 | 4.67 | Complex
Kanpur | 26.51 | 80.23 | Complex
Pickle Lake | 51.45 | -90.22 | Complex
For water algorithm evaluation | | |
Ascension Island | -7.98 | -14.41 | Straightforward
Midway Island | 28.21 | -177.38 | Straightforward
University of California Santa Barbara (UCSB) | 34.42 | -119.85 | Straightforward
Cape Verde | 16.73 | -22.94 | Complex
International Centre of Insect Physiology and Ecology (ICIPE) Mbita | -0.42 | 34.21 | Complex
Venice | 45.31 | 12.51 | Complex
Data from a total of 12 AERONET sites, listed in Table 6, are used here to assess the AOD uncertainty estimates in various satellite data sets. This is evenly split to provide six sites to evaluate AOD retrievals from algorithms over land and six over water. Each category is further split; three sites are described as "straightforward", for which the AOD retrieval problem is comparatively uncomplicated (i.e. likely no significant deviations from retrieval assumptions) and so the uncertainty estimates might be expected to be reasonable, and three sites are "complex". These complex sites were chosen as they have complicating factors which are not well captured by existing retrieval forward models and might be expected to lead to breakdowns in the techniques used by the retrieval algorithms to provide uncertainty estimates.
The reasons for identifying a particular site as complex are as follows. Over land, Ilorin (Nigeria) and Kanpur (India) can exhibit complicated mixtures of aerosols with distinct optical properties and vertical structure. Many AOD retrieval algorithms, in contrast, assume a single aerosol layer of homogeneous optical properties. Pickle Lake (Canada) is in an area studded by lakes of sizes similar to or smaller than the satellite pixel size. This might be expected to interfere with data set land masking (which often determines algorithm choice) and surface reflectance modelling in a non-linear way. Over water, Cape Verde (on Sal Island, officially the Republic of Cabo Verde) is characterized by frequent episodes of Saharan dust outflow; these particles have complex shapes, which are often approximated in AOD retrieval algorithms by spheres or spheroids. This assumption leads to additional uncertainties in modelling the aerosol phase matrix and absorption cross section, which are larger than for many other aerosol types and may not be accounted for fully in the retrieval error budget. ICIPE Mbita (hereafter Mbita, on the shore of Lake Victoria in Kenya) is similar to Pickle Lake but for water retrievals; i.e. it allows for the sampling of nominal water pixels which may be influenced by partial misflagging of coastlines, 3-D effects from the comparatively bright land, and outflow into the water affecting surface brightness. Finally, Venice (Italy) is in the northern Adriatic Sea, slightly beyond the outflow of the Venetian lagoon, and its water colour diverges strongly from the open-ocean Case 1 (chlorophyll-driven) waters assumed by many over-water retrieval algorithms.
This breakdown is inherently subjective as all retrievals involve approximations; the dozen sites chosen are illustrative of different aerosol and surface regimes but not necessarily indicative of global performance. The purpose of this study is to define and demonstrate the framework for evaluating pixel-level uncertainties and provide some recommendations for their provision and improvement. It is hoped that, with growing acceptance of the need to evaluate pixel-level uncertainties, this approach can be applied on a larger scale. The sites were chosen as they are fairly well understood and have multi-year data sets (data from all available years were considered in the analysis). Note that some of the satellite data sets considered here do not provide data at some sites for various reasons (discussed later).
Figure 5
Example results of matchup and filtering criteria for MISR data at Ascension Island. Red points indicate matchups included for further analysis on the basis of filters described in the text, and grey indicates those excluded from analysis. Horizontal and vertical error bars indicate the uncertainty on AERONET and MISR data, respectively. The 1:1 line is dashed black.
The matchup protocol is as follows. AERONET data are averaged within a fixed time window around each satellite overpass (providing τ_r) and compared with the closest successful satellite retrieval which has a pixel centre within 10 km of the AERONET site; this provides τ_s and σ_s. Each satellite data set's recommended quality assurance (QA) filtering criteria are applied as provided in the data products. The AERONET uncertainty, σ_r, is taken as the quadrature sum of the AERONET measurement uncertainty and the variability (standard deviation) of the AERONET AOD within the averaging window.
These matchup criteria are stricter than what is commonly applied for AOD validation.
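A sketch of the temporal side of this protocol is below; the ±30 min half window and the 0.01 AERONET measurement uncertainty are illustrative assumptions rather than prescriptive choices:

```python
import numpy as np

def collocate_aeronet(sat_time_s, aero_times_s, aero_aod,
                      half_window_s=1800.0, sigma_meas=0.01):
    """Average AERONET AOD within a time window of the overpass, returning
    tau_r and sigma_r (measurement uncertainty plus temporal variability)."""
    sel = np.abs(np.asarray(aero_times_s) - sat_time_s) <= half_window_s
    if not np.any(sel):
        return None                              # no valid matchup
    aod = np.asarray(aero_aod)[sel]
    tau_r = aod.mean()
    sigma_r = np.hypot(sigma_meas, aod.std())    # quadrature sum
    return tau_r, sigma_r
```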
This work considers satellite AOD products from seven algorithm teams; five of these contain both land and water retrievals (albeit sometimes with different algorithms), while two only cover land retrievals. Only pixels retrieved as land are used for comparison with AERONET data from land sites in Table 6, and vice versa for water sites. These data sets are briefly described below, and the reader is referred to the references cited here and in Tables 4 and 5 for additional information. Note in the discussion that the term "pixel" refers to individual L2 retrievals, sometimes referred to as "superpixels" in the literature as they are often coarser than the source L1 data.
3.2.2 MODIS data sets
Four of the data sets (three land, one water) are derived from MODIS measurements; there are two MODIS sensors, providing data since 2000 and 2002 on the Terra and Aqua satellites, respectively. The sensors have a 2330 km swath width, which is advantageous in providing a large data volume for analysis. Since launch, the MODIS aerosol data products have included AOD from the DT algorithm family, which has separate algorithms for water and vegetated land pixels. These data sets provide only diagnostic uncertainty estimates of the form ±(a + b τ_r); in practice (and here) these are often treated as if they were framed instead in terms of τ_s with the same coefficients a and b when a prognostic estimate is needed. For retrievals over land, the envelope is ±(0.05 + 0.15 τ), which is consistent with the expected performance of the algorithms at launch. Over water, the estimate has been revised since launch. Limited validation based on Collection 6 data suggested that there might be an asymmetry to the envelope, with the 1σ range over water being from −(0.02 + 0.10 τ) to +(0.04 + 0.10 τ). This has not yet been corroborated by a global validation of C6 or the latest Collection 6.1 (C6.1), and it is also plausible that calibration updates in C6.1 may have ameliorated some of this bias. As a result the symmetric envelope (±(0.03 + 0.10 τ), the midpoint of the above bounds) is used here.
The DB algorithm retrieves AOD only over land and was introduced to fill gaps in DT coverage due to bright surfaces such as deserts (although it has since been expanded to include vegetated land surfaces as well). Prognostic AOD retrieval uncertainties are estimated empirically from AERONET validation results as

$\sigma_s = \left(c_0 + c_1 \tau_s\right)\left(\frac{1}{\mu_0} + \frac{1}{\mu}\right), \quad (7)$

where μ_0 and μ are the cosines of the solar and view zenith angles, respectively, and c_0 and c_1 are coefficients depending on the QA flag value, sensor, and (since C6.1) surface type. The latest values of c_0 and c_1 are given in the algorithm's key references.
BAR also performs retrievals only over land; it uses the same radiative transfer forward model as DT but reformulates the problem to retrieve the MAP solution of aerosol properties and surface reflectance simultaneously for all vegetated pixels in a single granule. This includes both a priori information and spatial smoothness constraints. Uncertainty estimates are provided organically by the MAP technique (Eq. 2). Note that BAR data are only available at present for 2006–2017.
For all MODIS products, data from the latest C6.1 are used. All products are provided at 10 km nominal (at-nadir) horizontal pixel size. Identical algorithms (and approaches for estimating uncertainty) are applied to both Terra and Aqua measurements, and the results of the evaluation were not distinguishable between Terra and Aqua. For conciseness and to increase data volume, Terra and Aqua data are not separated in the discussion going forward.
3.2.3 MISR data sets
The MISR sensor also flies on the Terra platform and consists of nine cameras viewing the Earth at different angles, with a fully overlapped swath width of around 380 km. The latest version 23, used here, provides AOD retrievals at 4.4 km horizontal pixel size. Both land and water retrievals attempt retrieval using each of 74 candidate aerosol mixtures, although they differ in their surface reflectance models and uncertainty estimates. The over-land "heterogeneous surface" retrieval estimates uncertainty as the standard deviation of AOD retrieved using those aerosol mixtures which provide a sufficiently close match to TOA measurements. The "dark water" approach looks at the variation of a cost function across the range of potential AOD and aerosol mixtures,

$p(\tau) \propto \sum_{m} \exp\left[-\chi_m^2(\tau)/2\right], \quad (8)$

where the sum is over aerosol mixtures m and χ_m² is a cost function similar to the first term of Eq. (1). The uncertainty is then taken as the full-width at half maximum of p(τ), which is often found to be monomodal and close to Gaussian. Note that MISR does not provide retrievals over Mbita or Venice as the dark water algorithm logic excludes pixels within the matchup radius used here as too bright and unsuitable; thus, the approach cannot be evaluated at those sites.
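The transformation of per-mixture costs into an AOD probability distribution (Eq. 8) can be sketched as follows, with invented quadratic cost curves standing in for the real per-mixture costs:

```python
import numpy as np

def aod_pdf_and_fwhm(tau_grid, chi2):
    """Sum exp(-chi2/2) over mixtures (Eq. 8), normalize over the AOD grid,
    and return the distribution and its full-width at half maximum."""
    p = np.exp(-0.5 * chi2).sum(axis=0)
    p /= p.sum() * (tau_grid[1] - tau_grid[0])
    above = tau_grid[p >= 0.5 * p.max()]
    return p, above[-1] - above[0]

# Toy example: three mixtures, each with a quadratic cost in AOD.
tau = np.linspace(0.0, 1.0, 501)
chi2 = np.array([((tau - c) / w) ** 2
                 for c, w in [(0.28, 0.05), (0.30, 0.07), (0.33, 0.06)]])
p, width = aod_pdf_and_fwhm(tau, chi2)
print("AOD at peak:", tau[np.argmax(p)], " FWHM:", width)
```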
3.2.4 ATSR data sets
The ATSRs were dual-view instruments measuring near-simultaneously at nadir and near 55° forward. ATSR2 (1995–2003) and AATSR (2002–2012) had four solar and three infrared bands, with approximately 1 km pixel sizes and a 550 km swath (although ATSR2 operated in a narrow-swath mode over oceans). Their predecessor ATSR1 lacked three of the solar bands and so has not been used widely for AOD retrieval. In 2016 the first of a new generation of successor instruments (the SLSTRs) was launched; SLSTR has several additional bands and a rearward rather than forward view, the native spatial resolution of its solar bands is finer, and its swath is broader. This study uses two data sets derived from this family of sensors.
ORAC is a generalized OE retrieval scheme which has been applied to multiple satellite instruments. Here, version 4.01 of the ATSR2 and AATSR data sets from the ESA CCI is used, along with an initial version 1.00 of data from SLSTR. ORAC provides AOD retrievals over both land and ocean surfaces; the retrieval approaches are the same except for the surface reflectance models, which also inform the a priori and covariance matrices. Over water, surface reflectance is modelled using a sea surface reflectance model with fairly strong a priori constraints. Over land, two approaches have been implemented in ORAC; the one used here is a model developed initially for the SU (A)ATSR retrieval algorithm, which assumes that the ratio between forward and nadir surface reflectance is spectrally invariant and has very weak a priori constraints. Note that AOD and aerosol effective radius have weak and strong a priori constraints, respectively. Retrievals are performed at native resolution, and cost functions and uncertainty estimates are as in Eqs. (1) and (2) without smoothness constraints. ORAC simultaneously retrieves aerosol and surface properties, performing an AOD retrieval for each of a number (here, 10) of candidate aerosol optical models; the solution with the lowest cost is taken as the output.
ADV uses the ATSR dual view over land to retrieve the contribution to total AOD from each of three aerosol CCI components (with the fraction of the fourth, dust, component prescribed from a climatology) by assuming that the ratio of surface reflectance between the sensor's two views is spectrally flat. This has some similarity with the SU approach, except that for ADV the ratio is estimated from observations in the 1600 nm band, at which the atmosphere is typically most transparent, rather than being a freely retrieved parameter. Over water, the algorithm only uses the instruments' forward view as this has a longer atmospheric path length and is less strongly affected by Sun glint. Because of this, the water implementation is often called ASV rather than ADV (Table 3), although for convenience here the term ADV is used throughout. Water surface reflectance is modelled as a combination of Fresnel reflectance and a chlorophyll-driven water-leaving reflectance model. The land and water algorithms treat other factors (e.g. aerosol optical models) in the same way. Unlike ORAC, ADV aggregates to a coarser grid before performing the retrievals. ADV uncertainty estimates are calculated using Jacobians at the retrieval solution, i.e. the first component of Eq. (2), with S_y assumed diagonal. The uncertainty on the TOA measurements is taken as 5 %, which is somewhat larger than that assumed by ORAC, so ADV is implicitly adding some forward model uncertainty into this calculation. Version 3.11 of the data sets, also from the ESA aerosol CCI, is used here.
Aside from pixel size and/or swath differences, for both ADV and ORAC the implementation of the algorithms is the same for the three sensors. Matchups from the two (for ADV) or all three (for ORAC) sensors are combined here in the analysis to increase data volume, given the similarity in sensor characteristics and algorithm implementation. Note, however, that the difference in viewing directions between (A)ATSR and SLSTR (i.e. forward vs. rearward) means that different scattering-angle ranges are probed over the two hemispheres, which influences the geographic distributions of retrieval uncertainties. For both of these data sets, a large majority (75 % or more) of the matchups obtained are with AATSR, as the ATSR2 mission ended before the AERONET network became as extensive as it is at present, and the SLSTR record to date is short. The results do not change significantly if only AATSR data are considered.
3.2.5 CISAR SEVIRI
Unlike the other data sets considered here, the SEVIRI sensors fly on geostationary rather than polar-orbiting platforms. This analysis uses data from the first version of the CISAR algorithm applied to SEVIRI aboard Meteosat-9; due to computational constraints, only SEVIRI data for 2008–2009 have been processed and included here. This sensor has a sampling cadence of 15 min and observes a disc centred over North Africa, covering primarily Africa, Europe, and the surrounding oceans. The horizontal sampling distance is 3 km at nadir, increasing to around 10 km near the limits of useful coverage. This sampling means that several of the AERONET sites (GSFC, Kanpur, Midway Island, Pickle Lake, UCSB) are not seen by the sensor and cannot be analysed.
Table 7
Number of matchups obtained for each AERONET site and data set, together with climatological cloud fraction.
Figure 8
As Fig. 7, except for AERONET sites used for over-water retrieval evaluation.
Graphical evaluations of the pixel-level uncertainties are shown in Figs. 7 and 8 for land and water retrievals, respectively. In both of these the left-hand column shows CDFs of absolute normalized error against theoretical expectations (see Fig. 3b), and the middle and right columns show the ED and twice the ED binned against the 1σ (68th percentile) and 2σ (95th percentile) points of absolute retrieval error, respectively (see Fig. 4). Due to the very different sampling between data sets and sites (Table 7), the number of bins is scaled with the number of matchups (rounded to the nearest integer). This choice is a balance between well-populated bins to obtain robust statistics and the desire to examine behaviour across a broad range of ED. These figures also include an estimate of the digitization uncertainty on the binned values: for example, in a bin containing 100 matchups, the uncertainty on the 68th percentile (1σ point) binned value shown is taken as the range from the 67th to the 69th matchup in the bin. For the MODIS-based records (which have the highest sampling) this digitization uncertainty is often negligible, but for others (ADV, MISR, ORAC) it is sometimes not.
Figure 9
Mean and standard deviation of normalized error ν obtained for each AERONET site and satellite data set for (a) land and (b) water sites. Horizontal and vertical bars indicate the standard errors on the estimates of the mean and standard deviation, respectively. Diamonds and triangles indicate straightforward and complex AERONET sites (Table 6). Note that the x axis is truncated and the y axis is logarithmic.
A further way to look at the data is provided by Fig. 9, which shows the mean and standard deviation of ν for each data set and AERONET site; for unbiased retrievals with perfectly characterized errors (see Fig. 3a) the results should fall at coordinates (0, 1). This is a complement to the previously shown CDFs as it also provides measures of systematic bias in the AOD retrieval and systematic problems in estimating error magnitude: horizontal displacement from the origin indicates the relative magnitude and direction of systematic error, and vertical displacement indicates a general underestimation or overestimation of the typical level of error. Further, it shows how closely (or not) results from the different sites cluster together. For a larger-scale analysis of hundreds of AERONET sites, this type of plot could be expanded to a heat map. The CDFs in Figs. 7 and 8 assess the overall magnitude of normalized errors and the shape of the distribution, while the binned ED assesses the overall skill in the specificity of the estimates. In these figures, the top and bottom three rows show sites expected to be straightforward or complicated test cases for the uncertainty estimate techniques (Table 6). Table 8 provides the overall calibration skill scores for error at each site (Sect. 3.1), plus the coefficient of determination (where at least three bins were available) between binned uncertainty and error from the middle columns of Figs. 7 and 8. Together, these facilitate a visual and quantitative evaluation of the pixel-level uncertainty estimates.
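The per-site summary plotted in Fig. 9 amounts to the following computation; the standard error of the standard deviation uses a Gaussian approximation, and the sample values are invented:

```python
import numpy as np

def site_summary(nu):
    """Mean and standard deviation of normalized error for one site, with
    standard errors on each (cf. Fig. 9); the ideal result is (0, 1)."""
    nu = np.asarray(nu)
    n = len(nu)
    mean, std = nu.mean(), nu.std(ddof=1)
    se_mean = std / np.sqrt(n)
    se_std = std / np.sqrt(2.0 * (n - 1))   # Gaussian approximation
    return mean, se_mean, std, se_std

rng = np.random.default_rng(3)
print(site_summary(rng.normal(0.1, 1.2, 400)))   # a biased, over-dispersed site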
Table 8
Calibration skill scores and coefficient of determination R² from the binned uncertainties in Figs. 7 and 8.
Data set | AERONET site | | | | | | | | | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---
Land calibration skill scores (left value in each pair) and R² (right value) | | | | | | | | | | | |
| Straightforward sites | | | | | | Complex sites | | | | | |
| Avignon | | GSFC | | Palencia | | Ilorin | | Kanpur | | Pickle Lake | |
ADV | 0.58 | 0.87 | 0.72 | 0.29 | 0.99 | 0.99 | 0.08 | 0.99 | ||||
BAR | 0.55 | 0.94 | 0.57 | 0.37 | 0.84 | 0.11 | 0.88 | |||||
CISAR | 0.67 | – | – | 0.075 | 0.94 | – | – | – | – | |||
DB | 0.98 | 0.57 | 0.98 | 0.65 | 0.96 | 0.99 | 0.97 | 0.61 | 0.86 | |||
DT | 0.57 | 0.89 | 0.47 | 0.89 | 0.53 | 0.92 | 0.04 | 0.69 | 0.91 | 0.50 | 0.99 | |
MISR | 0.84 | 0.85 | 0.97 | 0.99 | 0.93 | 0.98 | 0.62 | 0.75 | 0.38 | 0.87 | 0.96 | |
ORAC | 0.82 | 0.70 | 0.36 | 0.82 | 0.95 | 0.88 | 0.84 | |||||
Water calibration skill scores (left value in each pair) and R² (right value) | | | | | | | | | | | |
| Straightforward sites | | | | | | Complex sites | | | | | |
| Ascension Island | | Midway Island | | UCSB | | Cape Verde | | Mbita | | Venice | |
ADV | – | – | 0.40 | 0.79 | 0.91 | 0.35 | 0.11 | 0.70 | ||||
CISAR | 0.11 | – | – | – | – | 0.42 | 0.28 | 0.33 | ||||
DT | 0.72 | 0.87 | 0.38 | 0.95 | 0.45 | 0.99 | 0.73 | 0.93 | 0.62 | 0.92 | 0.63 | 0.98 |
MISR | 0.52 | 0.94 | 0.45 | 0.80 | 0.97 | 0.78 | 0.94 | – | – | – | – | |
ORAC | 0.84 | 0.48 | 0.06 | 0.92 | 0.047 | 0.063 |
Turning to the land sites (Fig. 7), all the techniques show some skill in that the ED generally increases with retrieval error. There is, however, considerable variation between sites (which points to the utility of considering results site by site for this demonstration analysis) and data sets. For the straightforward sites, there is an overall tendency for the uncertainty estimates to be too large. This may indicate that the retrieval error budgets are a little too pessimistic; since overall errors and uncertainties also tend to be small at these sites, it is also possible that the uncertainty on the AERONET data (which can be a non-negligible contribution to the ED here) is overestimated. A notable exception here is MISR, for which uncertainty estimates are very close to theoretical expectations. This implies that the overall assumptions made by this technique (that the principal contribution to error lies in the aerosol optical model assumptions, and that the 74 mixtures provide a representative enough set that the standard deviation of retrieved AOD between well-fitting mixtures is a good proxy for uncertainty) are valid. A second exception is CISAR, which more significantly overestimates the uncertainty, indicating that the retrieval is more robust than expected. For these sites the binned plots of 1σ and 2σ retrieval error vs. ED look similar, suggesting that, within each bin, the retrieval errors are roughly Gaussian (even if the magnitudes of uncertainty are not perfectly estimated). MODIS DT tends to overestimate uncertainty on the low end and underestimate it on the high end, suggesting (at least for these sites) that the first and second coefficients in its ±(0.05 + 0.15τ) expression may need to be decreased and increased, respectively.
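For reference, ±(0.05 + 0.15τ) is the commonly quoted DT land expected-error envelope; the snippet below treats its half-width as a 1σ prognostic estimate (an interpretation adopted for this evaluation, not the product's formal definition) and shows where the two coefficients enter.

```python
def dt_land_uncertainty(aod, c0=0.05, c1=0.15):
    # Half-width of the envelope +/-(c0 + c1 * AOD). Decreasing c0 and
    # increasing c1 is the adjustment suggested in the text above.
    return c0 + c1 * aod
```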
For the complex land sites, the picture is different. At Ilorin, MODIS DB and ADV tend to overestimate uncertainty, while the others underestimate it. This site was chosen as a test case because of the complexity of its aerosol optical properties, which are more absorbing than assumed by many retrieval algorithms and can show large spatiotemporal heterogeneity due to a complex mix of sources. Aircraft measurements have found mid-visible single-scattering albedo (SSA) from smoke-dominated cases between 0.73 and 0.93, with a central estimate for the smoke component of 0.81. DB has a regional SSA map with more granularity, while the other algorithms do not contain sufficiently absorbing particles, leading to a breakdown in their uncertainty estimates when strong absorption is present.
The most absorbing component in the MISR aerosol mixtures has an SSA of 0.80 at 558 nm; mixtures including this component have SSA from 0.81 to 0.96, and all other MISR mixtures have higher SSA.
The case at Pickle Lake is more diverse: similar to the straightforward sites, MODIS DT, DB, and BAR all overestimate uncertainty. ADV and MISR are fairly close to theoretical values; despite this, their skill scores are fairly low (Table 8), as the magnitudes of their uncertainties are not perfect and the range of 1σ retrieval errors is fairly small. All these algorithms provide retrievals significantly less often than would be expected from the site's cloud cover, latitude, and AERONET availability. This implies that the algorithms may be coping with a potential violation of assumptions (i.e. land mask issues from numerous small lakes) by simply not providing a retrieval at all. ORAC underestimates uncertainties at this site but provides retrievals relatively more frequently than the other data sets. As the land–sea mask is determined at full (1 km) resolution and used to set the surface model, it is likely that some of the pixels within the 10 km grid are affected by misflagging and/or mixed-surface issues, contributing additional errors which are not caught by these quality checks. Which behaviour is more desirable (no data vs. more uncertain data than expected) is a philosophical and application-dependent matter. As it lies outside the SEVIRI disc, CISAR provides no retrievals at this site.
Aside from DB, DT, and MISR, skill scores (Table 8) are in most cases negative; for the former two the uncertainty estimates are somewhat empirical and not independent of the AERONET data, so the fact that they are fairly well-calibrated is not surprising. Despite this, R² is typically not negligible (although the small number of bins means the estimates of R² are somewhat uncertain). This implies that, while the absolute magnitudes of estimated uncertainty are often too small or too large, the techniques do show some skill at predicting which retrievals are comparatively less or more uncertain at a variety of locations. Neither the skill score nor R² should be overinterpreted in terms of site-to-site variations, as these depend strongly on the number of bins, the range in estimated uncertainties, and the range in actual retrieval errors at a given site. The main points of note are whether the skill score is positive and whether there is a positive association between binned uncertainty and error.
3.3.2 Water sites
For the water sites (Fig. 8), only five satellite data sets are available; recall also that the MODIS DT uncertainty envelope is narrower than over land and that the MISR uncertainty is a PDF based on a cost function composited over AOD and aerosol mixtures rather than (as over land) a simple standard deviation. At the straightforward sites there is some commonality with the land sites. Specifically, the MISR approach works fairly well, CISAR overestimates uncertainty (although of the three, only Ascension Island is within the SEVIRI disc), and MODIS DT slightly overestimates uncertainty overall, with a tendency to overestimate on the low end and underestimate on the high end. In general a similar picture is also seen in terms of the skill scores and R²: most data sets are not well-calibrated, although there is skill at assessing variations in uncertainty at individual sites.
ADV and ORAC are more systematic in their underestimation of uncertainty over water than over land, although as the over-water errors are often fairly small in absolute terms, they appear fairly large in relative terms. This difference in the ATSR-based records between land and ocean sites is intriguing. ADV assumes a 5 % uncertainty in the TOA signal, while ORAC includes separate measurement and forward model terms for a slightly lower total uncertainty overall (typically 3 %–4 %, dependent on band and view), which in part explains ORAC's larger normalized errors. The common behaviour implies either (1) that the calibration of the sensors may be biased or more uncertain than expected for these fairly dark ocean scenes or (2) that the over-water surface reflectance models or (for ORAC) their uncertainties (either in their contribution to the forward model error covariance Sy or the strength of the a priori constraint Sa) might be less reliable than assumed. Figure 9 implies that there is a significant systematic error source in ORAC contributing to a positive bias over water. A thorough comparison between the two data sets using the matchups collected here is difficult due to the fairly low data volumes involved, especially for ADV. ADV provides significantly fewer retrievals overall than ORAC (for both land and water), implying stricter pixel selection and/or retention criteria; this is consistent with ESA CCI validation analyses of earlier versions of these data sets.
Despite the expected complexities at Cape Verde from mixtures of low-level sea spray and higher-altitude nonspherical mineral dust, the error characterization at this complex site does not appear different from that obtained at the more straightforward sites. Interestingly, these algorithms seem more selective about when to provide retrievals at the three straightforward sites than they are at Cape Verde. The reasons for this are unclear, unless the expected-matchup estimate is not a good approximation for these sites; each is close to the coast, and all should be roughly equally affected by Sun-glint sampling-related losses.
Mbita is in some sense the inverse of the land site Pickle Lake, and similar comments apply. MODIS DT uncertainties are reasonable, although the data volume is fairly low relative to expectations. ADV and ORAC retrieve more frequently and perform well but with more high-error outliers than expected, likely due to mixed or misflagged land–water pixels. CISAR retrieves with a similar frequency at Mbita as at Ascension Island (that is, less often than expected but no less so than at the straightforward site). Looking at the binned ED vs. error, the errors for the 1σ points (Fig. 8n) are slightly overestimated and those for the 2σ points (Fig. 8o) underestimated, implying more extreme outliers than expected and indicating possible surface contamination issues. Note that MISR does not provide retrievals at this site, as the algorithm does not consider Lake Victoria to be dark water.
Venice is sampled close to the expected rates by ADV, CISAR, MODIS DT, and ORAC, and it is again excluded by MISR due to the bright, turbid water. Here the 1σ CISAR retrieval error is roughly constant and the 2σ error is about double that, regardless of the ED; the uncertainty estimates do not show skill overall. As SEVIRI's wavelengths (640, 810, and 1640 nm) are less strongly affected by water turbidity than those of the other sensors, the issues causing complexity here may not apply, and the overall tendency for CISAR to report too large an uncertainty may dominate. ADV and DT results are reasonably in line with expectations, implying either that the turbid water is not a hindrance for these algorithms or that the additional uncertainty from this factor is compensated for by lower uncertainties in some other aspect of the algorithm. ORAC tends to more strongly underestimate the retrieval uncertainty. Its water surface reflectance model is based on low-turbidity Case 1 water, so it likely provides a low-biased a priori with too strong a constraint, leading to a high bias in AOD retrievals with overly high confidence in the solution, which becomes large when expressed in normalized terms.
4 Conclusions and path forward
Pixel-level uncertainty estimates in AOD products are an important complement to the retrievals themselves to allow users to make informed decisions about data use for data assimilation and other applications. Ideal estimates are prognostic (predictive), and these are increasingly being provided within data sets; when they are absent, diagnostic estimates can be used as a stopgap. This study has reviewed existing diagnostic and prognostic approaches, provided a framework for their evaluation against AERONET data, and demonstrated this framework using a variety of satellite data products and AERONET sites. It is hoped that this methodology can be adopted by the broader community as an additional component of data product validation efforts. Several conclusions about the performance of these existing estimates follow.
-
All tested techniques show skill in some situations (in that the association between estimated uncertainty and observed error is positive, and on average magnitudes are reasonable), although none are perfect, and there is no clear single best technique. Small data volumes for some sensors and locations limit the extent to which performance in the high-uncertainty regime can be probed.
-
The points in Fig. 9 tend to cluster by data set more strongly than by site. This implies that some of the quantitative limitations in the uncertainty estimates provided within the current data sets are large-scale issues (e.g. a persistent underestimate or overestimate of some aspect of the retrieval error budget). Further, as the performance at nominally straightforward vs. complex AERONET sites was not always distinct, these limitations (or other unknown factors) may at present be more significant error sources than the issues associated with the ground sites.
-
While skilful, the uncertainties are not always well-calibrated; i.e. they are often systematically too large or too small. If characterization of the error budgets of the retrievals cannot be significantly improved, it is plausible that a simple scaling (using e.g. averages of the standard deviations on the y axis in Fig. 9) could be developed to bring the magnitudes more into line with the expected values; a sketch of this scaling follows the list.
-
The formal error propagation techniques (employed here by BAR, CISAR, and ORAC) are very powerful. Their differing behaviour and performance illustrate the difficulties in appropriately quantifying the forward model and a priori covariance matrices and appropriate smoothness constraints. For these sites, CISAR tends to overestimate the uncertainty most strongly, BAR to overestimate slightly, and ORAC to underestimate (more strongly over water than land). The simpler approach taken by ADV (Jacobians from a flat 5 % error on TOA reflectance) tends to be about right over land but also underestimates the true uncertainty over water.
-
The empirical validation-based MODIS DB approach works well: it shows little bias at these sites, although on average it somewhat overestimates the total uncertainty. That may indicate that the sites used here coincidentally perform better than the global population used to fit the expression. It also points to the fact that the expression (which draws on AOD, geometry, quality flag, and surface type) captures many, but not all, of the factors relevant for quantifying total uncertainty.
-
The diagnostic MODIS DT approaches perform reasonably well if used instead as prognostic uncertainty estimates; they have a tendency to be insufficiently confident (overestimate uncertainty) on the low end and overconfident (underestimate uncertainty) on the high end. Despite the possibility for unphysical negative AOD retrievals in the DT land product, both land and ocean results indicate a systematic positive bias in the retrievals.
-
MISR's two approaches (applied for land and water surfaces) are both based on diversity between different candidate aerosol optical models. They both perform well at most sites, although they have a tendency to underestimate the total uncertainty slightly. The implication from this is that the diversity in AOD retrievals from different candidate optical models does capture the leading cause of uncertainty in the MISR retrievals. The fact that they are underestimates does imply at least one remaining important factor which is not captured by this diversity, which could perhaps be a systematic error source such as a calibration or retrieval forward model bias.
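As a concrete illustration of the scaling suggested in the third point above, the following hypothetical helper rescales reported uncertainties by the observed spread of historical normalized errors so that the rescaled errors would have unit standard deviation. It assumes the satellite term dominates the ED and is a sketch, not part of any of the assessed algorithms.

```python
import numpy as np

def rescale_uncertainties(u_sat, z_history):
    # z_history: normalized errors from past validation (e.g. averaged
    # over sites, as read from the y axis of Fig. 9).
    s = np.std(np.asarray(z_history, dtype=float), ddof=1)
    return np.asarray(u_sat) * s   # s > 1 inflates, s < 1 deflates
```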
More broadly, these results suggest paths for the development and refinement of pixel-level AOD uncertainty estimates for existing and new data sets. For algorithms attempting AOD retrievals from multiple candidate aerosol optical models, the diversity in retrieved AOD between these different models could be a good proxy for part of the retrieval uncertainty. The MODIS DT ocean and ORAC algorithms both perform retrievals for multiple optical models. As ORAC is already an OE retrieval, this aerosol-model-related uncertainty is one of the few components not directly included in the existing error budget, so it could perhaps be added in quadrature to the existing uncertainty estimate. MODIS DT provides only a diagnostic AOD uncertainty estimate; diversity between possible solutions (which draw from 20 possible combinations of four fine modes and five coarse modes) could be explored as a first-order prognostic extension or replacement of that. One caveat is that this metric is only useful when the candidate set of optical models is representative; results at Ilorin, where aerosol absorption is often stronger than assumed in retrieval algorithms and the MISR approach does not perform well, illustrate that this is not always the case.
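A minimal sketch of the quadrature combination suggested above for an OE retrieval such as ORAC; the names are illustrative, and this is not part of the existing ORAC error budget.

```python
import numpy as np

def total_uncertainty(u_oe, candidate_aods):
    # Spread of AOD across well-fitting candidate aerosol models, added in
    # quadrature to the propagated OE uncertainty for the chosen model.
    u_model = np.std(np.asarray(candidate_aods, dtype=float), ddof=1)
    return np.hypot(u_oe, u_model)   # sqrt(u_oe**2 + u_model**2)
```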
A general principle behind the error propagation techniques is the assumption of Gaussian departures from some underlying forward model. When this is not true, the techniques tend to fail. The Ilorin case is one such example. Another is the higher-level issue of coastal or lake areas, as most algorithms make binary retrieval decisions with non-linear implications (e.g. treating a pixel as land or water for surface reflectance modelling), which cause problems if pixels are either misflagged or "contaminated" and contain mixed water and land. The algorithms tested here tend to deal with this in one of two ways. The first is simply to fail to provide a valid retrieval at all; in this case, the uncertainty estimates for the available retrievals tend to be reasonable, although the data volume is significantly less than expected. The second is to provide a retrieval but consequently a poor estimate (typically an underestimate) of the associated uncertainty. Neither is entirely satisfactory. Performing retrievals at a higher spatial resolution with strict filtering might ameliorate these issues, as a smaller fraction of pixels might then be contaminated or misflagged; however, the resolutions of the sensor measurements and land mask (and its quality) place hard constraints on what could be achieved. Another option might be to attempt retrievals using both land and water algorithms for these pixels and report either both or an average (including the difference between them as an additional contribution to the uncertainty estimate). This would provide some measure of the potential effect of surface misclassification and at the least provide a larger uncertainty estimate to alert the data user to problematic retrieval conditions. A deeper understanding of the representativity of AERONET sites on satellite retrieval scales would be useful to better understand the distributions of retrieval success rates and errors; this is a topic of current research.
A further difficulty with the assumption of Gaussian random errors is that sensor calibration uncertainty tends to be dominated by systematic effects rather than random noise. While in practice it is often (as in the algorithms assessed here) treated as a random error source, when it is a dominant contribution to the retrieval error budget it will tend to skew the retrievals toward one end of the notional uncertainty envelope. This may explain some of the systematic behaviour along the x axis in Fig. 9 within individual data sets (although the position along this axis is determined not only by the actual error but also by the estimated uncertainty). As discussed earlier, a pragmatic method to ameliorate this (if the forward model contribution to the systematic uncertainty cannot be significantly reduced by improvements to retrieval physics) would be to perform a vicarious calibration. Ship-borne AOD observations were also used as one part of the MISR calibration strategy for low-light scenes; if this removes the bulk of the systematic calibration error, it may help explain why MISR's uncertainty estimation technique (dispersion in possible solutions under different aerosol optical model assumptions) generally works so well.
The framework for evaluating uncertainties here is general and not restricted to AOD. In practice, however, it is difficult to extend it to other aerosol-related quantities at the present time. For profiling data sets (such as lidar), uncertainties in extinction profiles are often strongly vertically correlated as the effects of assumptions propagate down the profile. An assessment would also have to account for the vertical resolution of the sensors and compute appropriate averaging kernels; this is by no means intractable and has been done using ground-based lidar systems for aerosol properties.
For the total column, other key quantities of interest include the Ångström exponent (AE), fine-mode fraction (FMF) of AOD, and aerosol SSA. The AE can easily be assessed using this framework, although AERONET AE itself can be quite uncertain in the low-AOD conditions which predominate in many locations around the globe. In that case the expected discrepancy would include significant contributions from AERONET uncertainty, so the comparison would be less informative about the quality of the satellite uncertainty estimate; these issues are somewhat lessened in high-AOD conditions. Similar comments apply to AERONET FMF, which has a moderate uncertainty in moderate- to high-AOD conditions and a larger one when AOD is low. The framework presented here would not become invalid in these cases (although it becomes statistically problematic for locations where FMF is close to the bounds of 0 or 1) but would become a measure of the joint consistency of the satellite and AERONET uncertainties, rather than a test primarily of the satellite uncertainty estimates. Some of these issues are lessened if, instead of FMF, fine-mode AOD (i.e. the product of FMF and AOD) and coarse-mode AOD are used. While AOD is also positive definite, numerical issues associated with AOD near 0 can be removed if retrievals are performed in log space, reflecting the closer-to-lognormal distributions of AOD found in nature; ORAC, for example, retrieves AOD in log space.
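The joint-consistency interpretation discussed above corresponds to including the ground-based uncertainty in the ED; a hypothetical sketch, with illustrative names:

```python
import numpy as np

def joint_normalized_error(x_sat, x_ref, u_sat, u_ref):
    # When the ground-based (e.g. AERONET AE or FMF) uncertainty is not
    # negligible, it enters the expected discrepancy in quadrature
    # alongside the satellite term, so the statistic tests both at once.
    return (x_sat - x_ref) / np.hypot(u_sat, u_ref)
```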
Issues with SSA are somewhat more difficult; AERONET almucantar inversions have an uncertainty in SSA of around 0.03 under favourable conditions (moderate to high AOD and large solar zenith angle), but uncertainties can be significantly larger otherwise. Given that SSA (like FMF) is inherently bounded in the range 0–1 and most aerosol types have SSA in the visible spectral region of around 0.8–1, an evaluation of SSA uncertainty estimates would likewise test the joint consistency of the satellite and AERONET estimates rather than the satellite estimates alone.
Data availability
AERONET data are available from the AERONET web portal (https://aeronet.gsfc.nasa.gov).
Author contributions
AMS conceptualized the study, provided MODIS DB data, performed the analysis, and led the writing of the paper. ACP provided ORAC data. PK, AL, and TM provided ADV and BAR data. FP provided MODIS DT data. MW provided MISR data. YG and ML provided CISAR data. TP and KS provided general guidance and insight through ESA aerosol CCI and AeroSat validation and uncertainty characterization activities; TP also contributed significantly to the table outlining and referencing approaches to uncertainty characterization. All authors contributed to editing the paper.
Competing interests
The authors declare that they have no conflict of interest.
Acknowledgements
The work of lead author Andrew M. Sayer was performed as part of development for the forthcoming NASA Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) mission.
Financial support
This research has been supported by NASA.
Review statement
This paper was edited by Alexander Kokhanovsky and reviewed by three anonymous referees.
Abstract
Recent years have seen the increasing inclusion of per-retrieval prognostic (predictive) uncertainty estimates within satellite aerosol optical depth (AOD) data sets, providing users with quantitative tools to assist in the optimal use of these data. Prognostic estimates contrast with diagnostic (i.e. relative to some external truth) ones, which are typically obtained using sensitivity and/or validation analyses. Up to now, however, the quality of these uncertainty estimates has not been routinely assessed. This study presents a review of existing prognostic and diagnostic approaches for quantifying uncertainty in satellite AOD retrievals, and it presents a general framework to evaluate them based on the expected statistical properties of ensembles of estimated uncertainties and actual retrieval errors. It is hoped that this framework will be adopted as a complement to existing AOD validation exercises; it is not restricted to AOD and can in principle be applied to other quantities for which a reference validation data set is available. This framework is then applied to assess the uncertainties provided by several satellite data sets (seven over land, five over water), which draw on methods ranging from empirical relationships to sensitivity analyses and formal error propagation, at 12 Aerosol Robotic Network (AERONET) sites. The AERONET sites are divided into those for which it is expected that the techniques will perform well and those for which some complexity about the site may provide a more severe test. Overall, all techniques show some skill in that larger estimated uncertainties are generally associated with larger observed errors, although they are sometimes poorly calibrated (i.e. too small or too large in magnitude). No technique uniformly performs best. For powerful formal uncertainty propagation approaches such as optimal estimation, the results illustrate some of the difficulties in appropriate population of the covariance matrices required by the technique. When the data sets are confronted by a situation strongly counter to the retrieval forward model (e.g. potentially mixed land–water surfaces or aerosol optical properties outside the family of assumptions), some algorithms fail to provide a retrieval, while others do but with a quantitatively unreliable uncertainty estimate. The discussion suggests paths forward for the refinement of these techniques.
1 GESTAR, Universities Space Research Association, Columbia, MD, USA; NASA Goddard Space Flight Center, Greenbelt, MD, USA
2 Rayference, 1030 Brussels, Belgium
3 Finnish Meteorological Institute, Atmospheric Research Centre of Eastern Finland, Kuopio, Finland
4 Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR), Deutsches Fernerkundungsdatenzentrum (DFD), 82234 Oberpfaffenhofen, Germany
5 National Centre for Earth Observation, University of Oxford, Oxford, OX1 3PU, UK
6 Atmosphere and Climate Department, NILU – Norwegian Institute for Air Research, 2007 Kjeller, Norway
7 Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA