1 Introduction
Carbon monoxide (CO) is an atmospheric pollutant compromising air quality. It is a colourless, odourless, and tasteless gas that can disrupt the transport of oxygen by haemoglobin in the red blood cells after inhalation of high doses, thus having the ability to cause severe health problems. Its lifetime of about 1 to 2 months allows it to be used as a tracer for the long-range transport of pollution. CO plays a central role in tropospheric chemistry by acting as a precursor to tropospheric ozone, which is another pollutant considered harmful to public health and a greenhouse gas. Moreover, CO is the largest direct sink of the hydroxyl radical (OH), affecting the self-cleansing capacity of the atmosphere, as the consumed OH can no longer deplete other atmospheric constituents such as methane. Hence, CO can be interpreted as an indirect agent of climate change because it affects the concentrations of direct greenhouse gases.
Methane (CH₄) is an important long-lived anthropogenically released greenhouse gas. It is second only to carbon dioxide (CO₂), which accounts for the largest share of radiative forcing caused by human activities since 1750. CH₄ is less abundant in the atmosphere than CO₂, but it has a considerably higher global warming potential per unit mass. An accurate understanding of the sources and sinks of CH₄ is indispensable to reliably predict future climate. Due to its relatively long atmospheric perturbation lifetime (budget lifetime multiplied by feedback factor) of about 12 years, CH₄ is well-mixed in the atmosphere and the signals in question are typically only small variations on top of large background concentrations. Therefore, requirements for the precision and accuracy of atmospheric CH₄ measurements are demanding.
Detailed and continuous observations with global coverage of both gases are needed to improve our understanding of the climate system, tropospheric chemistry, and atmospheric transport processes. This objective can only be achieved using satellite instruments. Several spaceborne instruments have been measuring CO and CH₄ on a global scale up to now, including the Atmospheric Infrared Sounder (AIRS), the Tropospheric Emission Spectrometer (TES), and the Infrared Atmospheric Sounding Interferometer (IASI), which observe emissions in the thermal infrared (TIR) and are mainly sensitive to mid- to upper-tropospheric abundances. For CO, this category is expanded by the Measurements of Pollution in the Troposphere (MOPITT) instrument, which combines observations of spectral features in the TIR and in the shortwave infrared (SWIR), increasing surface-level sensitivity in some scenes.
Nearly equal sensitivity to all altitude levels, including the boundary layer, can be achieved from radiance measurements of reflected solar radiation in the SWIR part of the spectrum. This was first demonstrated by retrievals from the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) instrument onboard ENVISAT for CO and CH₄ in the or spectral range. ENVISAT was launched in 2002 and the end of the mission was declared after 10 years in orbit due to unexpected loss of contact with the satellite in 2012. The Thermal And Near infrared Sensor for carbon Observations Fourier-Transform Spectrometer (TANSO-FTS) onboard the Greenhouse gases Observing SATellite (GOSAT), which was launched in 2009, also yields atmospheric CH₄ with high near-surface sensitivity but with a fairly sparse spatial sampling interval of about in five-point across-track mode between its diameter circular footprints. Its successor GOSAT-2 (launched in 2018) has an extended spectral range and is designed to additionally measure CO.
The launch of the Sentinel-5 Precursor (Sentinel-5P) satellite in October 2017 with the TROPOspheric Monitoring Instrument (TROPOMI) onboard can be considered a game changer for the determination of atmospheric composition from space. TROPOMI is a spaceborne nadir-viewing imaging spectrometer measuring solar radiation reflected by the Earth in a push-broom configuration. It has a swath width of and allows for the analysis of several atmospheric species with an unprecedented level of detail by combining high precision and spatial resolution with daily global coverage. TROPOMI measures radiances between the ultraviolet (UV) and the shortwave infrared (SWIR) in eight bands. The characteristics of the TROPOMI NIR and SWIR bands are summarised in Table 1.
Table 1
Summary of the TROPOMI NIR and SWIR spectral bands and their key features.
Spectrometer | NIR | NIR | SWIR | SWIR |
---|---|---|---|---|
Band ID | 5 | 6 | 7 | 8 |
Spectral range (nm) | 661–725 | 725–786 | 2300–2343 | 2343–2389 |
Spectral resolution FWHM (nm) | 0.34–0.35 | 0.34–0.35 | ||
Spectral sampling (nm) | ||||
Spatial sampling (km) | ||||
Detector binning factor | 1 |
CO and CH₄ can be retrieved from radiance measurements in TROPOMI's SWIR bands. For these bands, the spatial resolution of the nadir measurements is typically 7 km × 7 km, which is almost 40 times finer than for SCIAMACHY. In contrast to TANSO, the imaging capabilities of TROPOMI provide 3 orders of magnitude more measurements without gaps, thus facilitating truly global maps of CO and CH₄ in a short time. The unique combination of high precision, spatiotemporal resolution, and coverage enables new fields of application. As large sources are readily detected in a single overpass, emission monitoring and air quality assessments are only two examples of the new prospects TROPOMI offers. The first applications concerning CO and CH₄ have already been highlighted and demonstrated in recent publications.
As in the fields of weather and climate modelling, ensemble approaches have recently gained importance in the context of satellite observations. They aim to benefit from a larger range of possible realisations of different physical aspects or to analyse to what extent specific geophysical findings depend on the particular characteristics of an algorithm or instrument. Along these lines, it is worthwhile to have a set of distinct retrieval algorithms at hand for each analysed atmospheric constituent.
Here we introduce a scientific algorithm to retrieve CO and CH₄ simultaneously from TROPOMI. Its objective is to complement the operational algorithms in the sense described above and to provide new geophysical insights, whilst performing within the mission requirements concerning random and systematic errors. The presented scientific algorithm differs from the operational algorithms in several respects (see also Sect. for a summary of the differences), and the corresponding products are thus well suited to be used together with the operational products in an ensemble approach. After a thorough description of the algorithm, including error characteristics based on synthetic data and validation with independent reference data, we present the first results of our new algorithm for both trace gases, demonstrating broad consistency with the operational products for example cases and the potential to advance the new application fields, for which TROPOMI's groundbreaking features pave the way.
2 WFM-DOAS retrieval algorithm
The Weighting Function Modified Differential Optical Absorption Spectroscopy (WFM-DOAS) algorithm is a linear least-squares method based on scaling (or shifting) preselected atmospheric vertical profiles. The vertical columns of the desired gases are determined from the measured sun-normalised radiance by fitting a linearised radiative transfer model to it. A concise mathematical algorithm description and the key settings and adjustments for the simultaneous CO and CH₄ retrieval from TROPOMI's radiance measurements are summarised in the following subsections. The data products are based on TROPOMI Level 1b V01.00.00 files comprising spectra from the nominal operational mode, which started at the end of April 2018, and reprocessed spectra from the previous 6-month commissioning phase. The corresponding version is referred to as TROPOMI/WFMD (or WFMD in abbreviated form) v1.2.
2.1 Forward model
The forward model is derived from the radiative transfer model SCIATRAN in pseudo-spherical atmosphere mode. To enable a fast retrieval, a lookup table scheme for the radiances and their derivatives has been implemented, containing 17 280 reference spectra for varying solar zenith angle, altitude, albedo, water vapour, and temperature. The reference spectra are computed with high spectral resolution in line-by-line mode and subsequently convolved to the TROPOMI spectral resolution of the SWIR bands using an instrument-specific fixed spectral response function extracted from the TROPOMI ISRF Calibration Key Data v1.0.0 for nadir at . The auxiliary input data include US Standard Atmosphere profiles with methane scaled to , the SCIATRAN aerosol model using the background scenario described in , and HITRAN 2016 spectroscopic parameters .
2.2 Inversion procedure
The linearised radiative transfer model (appropriately chosen from the lookup table according to the relevant parameters) plus a low-order polynomial is fitted by linear least squares to the logarithm of the measured sun-normalised radiance. The trace gas vertical profiles (CO, CH₄, H₂O) are scaled for the fit (i.e. the profile shape is not varied). Additional fit parameters are the shift of a preselected temperature profile, a scaling factor for the pressure profile, and parameters for a 2nd-order polynomial.
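Let $m$ be the number of spectral points in the fitting window and $n$ the number of state vector elements (fit parameters) with $m > n$. The modelled radiance at wavelength $\lambda_i$ is given by

$$\ln I_i^{\mathrm{mod}}(\mathbf{V}, \mathbf{a}) = \ln I_i^{\mathrm{mod}}(\bar{\mathbf{V}}) + \sum_j \left.\frac{\partial \ln I_i^{\mathrm{mod}}}{\partial V_j}\right|_{\bar{V}_j} (V_j - \bar{V}_j) + P_i(\mathbf{a}) \tag{1}$$

with state vector $\mathbf{V}$, linearisation point $\bar{\mathbf{V}}$, and polynomial coefficients $\mathbf{a}$ of 2nd-order polynomial $P$. A derivative with respect to a vertical column thereby refers to the change in the top-of-atmosphere radiance caused by a scaling of a preselected absorber concentration vertical profile. There are $m$ equations of this type, one for each detector pixel in the fitting window. The objective is to find the optimal state so that the linear model best fits the observed radiance. This problem can be rewritten as

$$\mathbf{y} = \mathbf{A}\mathbf{x} + \boldsymbol{\varepsilon} \tag{2}$$

with (log-)radiance difference $\mathbf{y} \in \mathbb{R}^{m}$ of the measurement and linearised model due to a deviation $\mathbf{x} \in \mathbb{R}^{n}$ of the state vector from the multidimensional linearisation point, weighting function (Jacobian) matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ (with derivatives at the linearisation point and polynomial basis functions as columns), and the sum $\boldsymbol{\varepsilon}$ of forward model error and (normally distributed log-transformed) instrument noise.

The covariance matrix associated with measurement noise is given by $\mathbf{S}_{\varepsilon} = \mathrm{diag}(\sigma_1^2, \dots, \sigma_m^2)$. To give larger weight to spectral points with smaller error variances and to obtain error estimates of the retrieval parameters via error propagation from the uncorrelated measurement errors $\sigma_i$, a weighted least-squares approach is applied with a matrix of weights defined by $\mathbf{W} = \mathbf{S}_{\varepsilon}^{-1}$. With the posterior probability $P(\mathbf{x} \mid \mathbf{y})$ of $\mathbf{x}$ given $\mathbf{y}$, the most probable inference of the inversion is obtained by minimising

$$\chi^2 = (\mathbf{y} - \mathbf{A}\mathbf{x})^{T}\, \mathbf{W}\, (\mathbf{y} - \mathbf{A}\mathbf{x}) \tag{3}$$

with respect to $\mathbf{x}$, where $^{T}$ is the matrix transpose. Hence,

$$\hat{\mathbf{x}} = \mathbf{S}_{x} \mathbf{A}^{T} \mathbf{W} \mathbf{y}, \qquad \mathbf{S}_{x} = (\mathbf{A}^{T} \mathbf{W} \mathbf{A})^{-1} \tag{4}$$

provides the solution of the inverse problem, where $\mathbf{S}_{x}$ is the covariance matrix of solution $\hat{\mathbf{x}}$. The errors of the retrieval parameters are estimated by

$$\sigma_{\hat{x}_j} = \sqrt{(\mathbf{S}_{x})_{jj}} \tag{5}$$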
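The weighted least-squares inversion described above can be illustrated with a minimal, dependency-free sketch. It is not the WFM-DOAS implementation: the Jacobian here is a hypothetical toy with one Gaussian-shaped absorber column and three polynomial basis columns, the wavelength grid is normalised to [0, 1], and the noise level is an assumed constant. The names `solve` and `weighted_least_squares` are illustrative only.

```python
import math
import random


def solve(matrix, rhs):
    """Solve matrix @ sol = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def weighted_least_squares(A, y, sigma):
    """Return (x_hat, errors) for diagonal weights W = diag(1 / sigma_i^2).

    x_hat solves (A^T W A) x = A^T W y; errors are the square roots of the
    diagonal of the solution covariance (A^T W A)^-1.
    """
    m, n = len(A), len(A[0])
    w = [1.0 / s**2 for s in sigma]
    atwa = [[sum(w[i] * A[i][j] * A[i][k] for i in range(m)) for k in range(n)]
            for j in range(n)]
    atwy = [sum(w[i] * A[i][j] * y[i] for i in range(m)) for j in range(n)]
    x_hat = solve(atwa, atwy)
    errors = []
    for j in range(n):
        unit = [1.0 if k == j else 0.0 for k in range(n)]
        # Entry j of the solution for unit vector e_j is the j-th diagonal
        # element of the inverse matrix.
        errors.append(math.sqrt(solve(atwa, unit)[j]))
    return x_hat, errors


# Hypothetical toy problem: one absorber column plus a 2nd-order polynomial.
rng = random.Random(0)
m = 200
lam = [i / (m - 1) for i in range(m)]  # normalised wavelength grid (assumed)
A = [[-math.exp(-((l - 0.5) / 0.05) ** 2), 1.0, l, l * l] for l in lam]
x_true = [0.05, 0.01, -0.02, 0.005]
sigma = [0.001] * m  # assumed uncorrelated log-radiance noise
y = [sum(a * x for a, x in zip(row, x_true)) + rng.gauss(0.0, s)
     for row, s in zip(A, sigma)]

x_hat, errors = weighted_least_squares(A, y, sigma)
```

Solving the normal equations directly keeps the sketch short; for larger or poorly conditioned Jacobians a QR- or SVD-based solver would usually be preferred.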
Due to the potential non-linear dependencies of the radiances with respect to water vapour and temperature within their natural variability, the algorithm treats both parameters iteratively. The algorithm starts with lookup table elements representing US Standard Atmosphere water vapour amount and temperature. If the retrieved parameter pair after the fit is closer to another lookup table element, the process is repeated with the corresponding reference spectrum. Usually convergence is achieved after one iteration step.
As the lookup table only covers direct nadir conditions to limit its dimension to a reasonable size, a geometric path length correction has been implemented to remove the path extension and associated enhancement of the retrieved vertical columns for off-nadir conditions with a non-vanishing viewing zenith angle.
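A geometric path length correction of this kind can be sketched as follows. The text does not give the formula, so the plane-parallel geometric air mass factor used here (1/cos of the solar zenith angle plus 1/cos of the viewing zenith angle) is an assumption, and the function names are hypothetical.

```python
import math


def geometric_amf(sza_deg, vza_deg):
    """Plane-parallel geometric air mass factor for the two-way light path."""
    return (1.0 / math.cos(math.radians(sza_deg))
            + 1.0 / math.cos(math.radians(vza_deg)))


def correct_column_for_viewing_angle(column_from_nadir_lut, sza_deg, vza_deg):
    """Remove the off-nadir path extension from a column retrieved with
    nadir-only reference spectra (assumed correction, not from the source)."""
    nadir_amf = geometric_amf(sza_deg, 0.0)
    actual_amf = geometric_amf(sza_deg, vza_deg)
    return column_from_nadir_lut * nadir_amf / actual_amf
```

For a nadir observation the factor is exactly 1; it decreases as the viewing zenith angle grows, which removes the enhancement caused by the longer slant path.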
The spectral fitting windows in TROPOMI band 7 were optimised to retrieve CO and CH₄ simultaneously as accurately as possible (determined by an error analysis based on simulated measurements). They are shown in Fig. 1 together with the absorption features of the relevant trace gases. Note that CO is a much weaker absorber compared to CH₄ and H₂O. The apparent albedo is retrieved in the preprocessing by comparison of the measured continuum radiance with precalculated values from a lookup table. Cloud information is obtained from strong absorption lines in band 8 (see Fig. 1) by comparing the measured radiances to reference radiances for cloud-free conditions. As the absorption in these lines is strong, the measured radiance is small in the clear-sky case. In the presence of clouds, most of the atmospheric is shielded and the measured backscattered radiance coherently increases. The corresponding ratio of measured to reference radiance for the selected strong absorption lines is thus an indicator of cloud contamination.
Figure 1
Fitting windows (grey, – and – ) and trace gas transmittances for the SWIR bands of TROPOMI for US Standard Atmosphere concentrations. The strong absorption lines between and used to obtain cloud information are shown in light blue . The apparent albedo is retrieved in the continuum at (dashed line).
[Figure omitted. See PDF]
2.3 Sensitivity and error analysis using synthetic data
The sensitivity of the retrievals to different atmospheric layers is demonstrated by the vertical column averaging kernels (Fig. 2). Compared to measurements in the thermal infrared spectral region, which are primarily sensitive to mid- or upper-tropospheric gas abundances in the absence of high thermal contrast, the advantage of the shortwave infrared spectral region is the sensitivity to all altitude levels, including the boundary layer, which is important to analyse emissions originating from the Earth's surface.
Figure 2
CO and CH₄ column averaging kernels reflecting the altitude sensitivity of the retrievals.
[Figure omitted. See PDF]
As described in the previous subsection, the retrieval noise is determined via error propagation from the measurement noise. To assess the theoretical precision performance, we assume a simple shot-noise-limited noise model, which is defined in the following way: the reference signal-to-noise ratio is in the continuum (radiance phot s⁻¹ cm⁻² nm⁻¹ sr⁻¹) for a dark scene () with low sun (solar zenith angle of 70°) and is scaled according to

$$\mathrm{SNR} = \mathrm{SNR}_{\mathrm{ref}} \sqrt{I / I_{\mathrm{ref}}} \tag{6}$$

for other radiances $I$. The resulting absolute precision is largely independent of the current concentrations. For US Standard Atmosphere values, the corresponding relative retrieval noise for different albedos and solar zenith angles is shown in Fig. 3. It is below % for solar zenith angles smaller than 75° and albedos larger than in the case of CH₄. As the CO absorption is considerably weaker than the CH₄ absorption, the CO retrieval exhibits larger relative noise, which is below % for albedos larger than .
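For a shot-noise-limited detector, the signal-to-noise ratio grows with the square root of the radiance, so a single reference point fixes the noise for any other scene. The sketch below illustrates this scaling; the reference values are passed in as arguments because they are not reproduced here, and the function names are hypothetical.

```python
import math


def shot_noise_snr(radiance, snr_ref, radiance_ref):
    """Scale a reference signal-to-noise ratio to another radiance level,
    assuming pure shot noise (SNR proportional to sqrt(radiance))."""
    if radiance < 0 or radiance_ref <= 0:
        raise ValueError("radiances must be non-negative, reference positive")
    return snr_ref * math.sqrt(radiance / radiance_ref)


def log_radiance_noise(radiance, snr_ref, radiance_ref):
    """Approximate 1-sigma noise of ln(radiance): sigma_I / I = 1 / SNR."""
    return 1.0 / shot_noise_snr(radiance, snr_ref, radiance_ref)
```

With this model, a scene four times brighter than the reference has twice the signal-to-noise ratio and half the noise in the logarithm of the radiance, which is the quantity the fit operates on.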
Figure 3
TROPOMI/WFM-DOAS CO and CH₄ relative retrieval noise for US Standard Atmosphere conditions.
[Figure omitted. See PDF]
The analysis of systematic errors is performed using simulated measurements: for different scenarios defined by specific atmospheric conditions, radiances and irradiances are calculated with the radiative transfer model and subsequently used as measurement input in the retrieval. The errors are then defined as the deviation of the retrieved from the true quantities. The corresponding results for several scenarios are summarised in Table 2. All scenarios already include interpolation between different wavelength grids (for measured and reference spectra) unless otherwise stated.
The analysis includes basic scenarios testing if perturbations of the state vector elements can be retrieved, quantifying lookup table interpolation errors, and analysing errors caused by off-nadir conditions. In order to examine the sensitivity to vertical profile variations, the scenario class of profiles includes several realistic model atmospheres based on measurements and theoretical predictions , with all methane profiles scaled to have surface values of in each case to better represent current atmospheric conditions. The respective atmospheres differ from the US Standard Atmosphere with respect to temperature, pressure, water vapour, carbon monoxide, and methane profiles (see Appendix A of for a visualisation of the different vertical profiles). These scenarios are more difficult to deal with than the basic ones because the perturbations are not consistent with the scaling assumption; i.e. they include proper variations of the profile shape.
Also examined is the sensitivity to the spectral albedo of the natural surface types shown in Fig. 4, taken from the Advanced Spaceborne Thermal Emission Reflection Radiometer (ASTER) and United States Geological Survey (USGS) spectral libraries. The analysed aerosol scenarios are largely described in , with aerosol type definitions in the different atmospheric layers based on Optical Properties of Aerosols and Clouds (OPAC). The retrieval errors due to undetected subvisual clouds are also investigated for different ice and water clouds.
This gives an impression of the magnitude of errors one can expect, assuming that thick clouds can be filtered out by cloud screening in the preprocessing or post-processing: typical systematic retrieval errors are below % for methane and below % for carbon monoxide, even for challenging scenarios.
Table 2
Error analysis for different scenarios. Standard settings are direct nadir, sea level, solar zenith angle 50°, albedo , and US Standard Atmosphere. Scenarios with include scaling of the and profiles by 10 %; for scenarios with the sensor zenith angle is set to 30° (relative azimuth 60°). Standard cirrus clouds are located between 11 and 12 km (cloud optical thickness ), consisting of fractal ice crystals with an edge length of 100 µm. Standard cumulus clouds are located between 3 and 4 km (), consisting of water droplets with an effective radius of 10 µm.
error | error | ||
---|---|---|---|
Scenario | (%) | (%) | |
Basic | dry run (no interpolation) | 0.00 | 0.00 |
dry run | 0.00 | ||
dry run | |||
dry run | |||
temperature | 0.25 | ||
temperature | 0.06 | ||
pressure % | |||
pressure % | |||
albedo 0.2 | |||
Profiles | midlatitude summer | 0.12 | 0.35 |
midlatitude winter | 0.68 | ||
subarctic summer | 0.09 | 0.60 | |
subarctic winter | 0.63 | ||
tropical | 0.15 | ||
Spectral albedo | sand | ||
soil | 0.01 | ||
rangeland | 0.02 | ||
deciduous | 0.01 | ||
conifers | 0.01 | ||
snow | |||
ocean | 0.00 | ||
Aerosols | no aerosol | 0.01 | 0.10 |
urban | 0.11 | 0.04 | |
desert (sand albedo) | 0.41 | 0.40 | |
arctic (snow albedo) | |||
extreme in boundary layer | 0.34 | ||
extreme in boundary layer | 0.24 | ||
extreme in boundary layer | |||
Subvisual clouds | cirrus | ||
cirrus | |||
cirrus | |||
cirrus (fractal ) | |||
cirrus (hexagonal ) | 0.11 | ||
cirrus () | |||
cumulus | |||
cumulus ( ) | |||
cumulus ( ) | |||
cumulus () |
Figure 4
Spectral albedos of different natural surface types. Reproduced from the ASTER Spectral Library through the courtesy of the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California (© 1999, California Institute of Technology) and the Digital Spectral Library 06 of the United States Geological Survey.
[Figure omitted. See PDF]
Larger systematic errors in the case of thick clouds are expected because clouds are not explicitly considered in the forward model of the retrieval algorithm to retain the high processing speed. Therefore, the systematic biases due to clouds are analysed in more detail. The results for water and ice clouds at different heights are summarised in Fig. 5. Thereby, clouds are modelled as a layer of vertical extent consisting of water droplets with an effective radius of or fractal ice crystals with an edge length of . The analysis is performed for three different cloud types: two water clouds with cloud-top heights (CTH) of and and an ice cloud with CTH of .
Figure 5
Systematic retrieval errors for different water clouds (cum.) and ice clouds (cir.) under various conditions.
[Figure omitted. See PDF]
As expected, the absolute value of systematic errors typically increases with increasing cloud optical thickness, increasing cloud-top height, increasing solar zenith angle, and decreasing albedo. In most cases, there is a considerable underestimation of the vertical column in the case of thick clouds. However, there are also conditions under which the absolute value of the error is small even at a cloud optical thickness of or occasionally turns to an overestimation for measurements over bright surfaces. Overall the systematic errors due to clouds are qualitatively similar for CO and CH₄.
The error analysis based on synthetic data has shown that perturbations of the state vector elements can be clearly retrieved and that the algorithm is theoretically suitable to successfully retrieve carbon monoxide and methane from real TROPOMI data for cloud-free scenes. In the case of thick clouds, systematic errors can become rather large, confirming that an efficient cloud-screening algorithm is necessary, in particular to meet the demanding requirements for the precision and accuracy of atmospheric CH₄ measurements. An appropriate quality filter is implemented in the post-processing and described in Sect. 2.5.2. For CO it may be possible to relax the filter due to the less stringent requirements, but for now we employ a joint quality filter for both simultaneously retrieved trace gases.
2.4 High-resolution auxiliary data
As a consequence of the high spatial resolution of the TROPOMI SWIR measurements, a digital elevation model required for the selection and interpolation of suitable precalculated reference spectra and a land cover characterisation data set necessary to provide land fraction and surface type as additional information have to be implemented in high resolution. For this purpose, the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) and the Global Land Cover Characterization (GLCC) of the United States Geological Survey (USGS) are used with a resampled resolution of 0.05° (about at the Equator) to compute surface elevation, land fraction, and dominating surface type (Biosphere Atmosphere Transfer Scheme Legend) for every sounding of the satellite.
Some incorrect values of zero elevation in the GMTED2010 data set over the Caspian Sea and Lake Superior have been replaced with corresponding Global 30 Arc-Second Elevation (GTOPO30) values. Figure 6 demonstrates the resolution of the implemented elevation and surface type data sets using the example of Europe.
Figure 6
USGS GMTED2010 elevation and GLCC surface type (Biosphere Atmosphere Transfer Scheme Legend) with 0.05° resolution using the example of Europe. The Biosphere Atmosphere Transfer Scheme Legend comprises the following surface types: 1 crops, mixed farming; 2 short grass; 3 evergreen needleleaf trees; 4 deciduous needleleaf trees; 5 deciduous broadleaf trees; 6 evergreen broadleaf trees; 7 tall grass; 8 desert; 9 tundra; 10 irrigated crops; 11 semidesert; 12 ice caps and glaciers; 13 bogs and marshes; 14 inland water; 15 ocean; 16 evergreen shrubs; 17 deciduous shrubs; 18 mixed forest; 19 forest–field mosaic; 20 water and land mixtures. In the GMTED2010 data set, ocean areas have been assigned a value of 0 (shown in dark blue).
[Figure omitted. See PDF]
2.5 Post-processing
2.5.1 Column-averaged dry air mole fractions
In order to convert the retrieved vertical columns into column-averaged dry air mole fractions (denoted XCO and XCH₄), the columns are divided by the dry air column obtained from the European Centre for Medium-Range Weather Forecasts (ECMWF) analysis. Thereby, the ECMWF dry columns are corrected for the actual surface elevation of the individual TROPOMI measurements (based on the deviation from the mean altitude of the coarser model grid), inheriting the high spatial resolution of the satellite data.
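The conversion from a vertical column to a column-averaged dry air mole fraction can be sketched as below. This is an illustration under stated assumptions, not the processing chain itself: the dry air column is derived hydrostatically from surface pressure, the elevation correction uses a simple barometric formula with an assumed constant scale height of 8 km, and all function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23   # molecules per mole
M_DRY_AIR = 0.0289644      # kg per mole
M_WATER = 0.01801528       # kg per mole
GRAVITY = 9.80665          # m s^-2 (assumed constant with height)
SCALE_HEIGHT_M = 8000.0    # assumed pressure scale height


def pressure_at_scene(model_pressure_pa, model_height_m, scene_height_m):
    """Shift the model surface pressure to the actual scene elevation."""
    return model_pressure_pa * math.exp(
        -(scene_height_m - model_height_m) / SCALE_HEIGHT_M)


def dry_air_column(surface_pressure_pa, water_column_cm2=0.0):
    """Dry air column in molecules per cm^2 from hydrostatic balance."""
    total_mass = surface_pressure_pa / GRAVITY                 # kg m^-2
    water_mass = water_column_cm2 * 1.0e4 * M_WATER / AVOGADRO  # kg m^-2
    return (total_mass - water_mass) * AVOGADRO / M_DRY_AIR / 1.0e4


def mole_fraction_ppb(gas_column_cm2, dry_column_cm2):
    """Column-averaged dry air mole fraction in parts per billion."""
    return gas_column_cm2 / dry_column_cm2 * 1.0e9
```

At standard sea-level pressure this gives a dry air column of roughly 2.1e25 molecules per cm², so a methane column of about 3.9e19 molecules per cm² corresponds to a mole fraction near 1800 ppb.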
An analysis based on simulated measurements has indicated that this approach is superior to a normalisation by simultaneously retrieved oxygen (O₂ A band) from TROPOMI band 6 for off-nadir conditions and/or in the presence of strong scatterers in the atmosphere (aerosol, clouds) as a consequence of the spectral distance in combination with the albedo differences of natural surface types between NIR band 6 and SWIR band 7 (see Fig. 4). For these reasons, O₂ is a barely sufficient proxy for the light path in the spectral range in a scattering atmosphere. For example, the O₂ errors for the scattering scenarios aerosols (extreme in boundary layer) and clouds (cirrus) from Table 2 are % and %, respectively. Hence, the underestimations are considerably larger than the corresponding errors for CO and CH₄, which would lead to distinct overestimations of mole fractions obtained from the O₂-proxy approach in the presence of strong scatterers.
In addition to the better accuracy of the ECMWF-based mole fraction computation, this approach is also faster because the oxygen fit and the interband coregistration mapping can be omitted. As a consequence, the fitting procedure is about twice as fast without the normalisation by O₂. The out-of-spectral-band stray light issue of the TROPOMI band 6 would potentially further hamper the O₂-proxy approach.
2.5.2 Quality filter
To enable a fast processing speed to handle the huge amount of TROPOMI data, the lookup table is limited to rather simple physical conditions (e.g. cloud-free scenes). Thus, a quality-screening algorithm excluding measurements not sufficiently characterised by the forward model had to be implemented. First of all, challenging conditions with solar zenith angles larger than 75°, which are increasingly prone to scattering and saturation-related issues due to the weakening signal and lengthening of the light path, are cut off. To be independent of other data sets and their ongoing availability, the filter is designed to rely only on parameters directly included in the retrieval output. This was achieved by using a machine-learning approach based on a random forest classifier, which is a meta estimator that grows many independent decision trees on different subsamples of the data set and uses averaging to improve the predictive accuracy and prevent overfitting. Each tree of the ensemble is grown in the following way:
- Randomly draw $N$ samples from the training set of size $N$ with replacement (bootstrap sample). For large $N$, a fraction of about 63 % ($1 - 1/e$) unique samples is expected, the remainder being duplicates.
- From the $M$ input variables, $m$ features are randomly chosen, and the best split according to the minimisation of Gini impurity on these features is used to split the node. The value of $m$ is held constant during the forest growing.
- There is no pruning of the decision trees; i.e. each tree is grown to the largest possible extent.
To classify a new previously unseen measurement after growing the forest with the training data, each decision tree gives a classification according to the input features of the measurement, and the forest chooses the majority vote over all trees in the forest. The combination of the tree results, each based on different bootstrap replicates of the learning set, is called bootstrap aggregating or bagging. The forest error rate depends on the correlation between the trees in the forest and the strength of the individual trees: it decreases with decreasing correlation and increasing strength of the trees. Reducing the number $m$ of features considered at each split reduces both the correlation and the strength, while increasing it increases both. Hence, there is an optimal range of $m$ that minimises the forest error rate.
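The building blocks of the tree-growing and voting procedure described above can be illustrated with a few lines of standard-library Python. This is a didactic sketch, not the classifier used for the data product; the sample size and random seed are arbitrary choices, and the helper names are hypothetical.

```python
import math
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices from range(n) with replacement (one bootstrap sample)."""
    return [rng.randrange(n) for _ in range(n)]


def unique_fraction(indices):
    """Fraction of distinct samples contained in a bootstrap sample."""
    return len(set(indices)) / len(indices)


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 of the class labels in a node."""
    counts = Counter(labels)
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def majority_vote(tree_predictions):
    """Class predicted by most trees of the forest."""
    return Counter(tree_predictions).most_common(1)[0][0]


# For large samples the unique fraction approaches 1 - 1/e (about 63 %).
fraction = unique_fraction(bootstrap_indices(100_000, random.Random(42)))
expected = 1.0 - 1.0 / math.e
```

A pure node has zero Gini impurity, while a node split evenly between two classes has an impurity of 0.5, so minimising the impurity of the child nodes favours splits that separate good from excluded measurements.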
We use a forest size of trees and the well-recognised standard choice of . The training data set comprises 16 randomly chosen days, namely 25 November 2017 and 25 February, 21 April, 27 April, 22 May, 28 May, 16 June, 26 June, 16 July, 24 July, 3 August, 22 August, 18 September, 4 October, 21 October, and 16 November 2018. For each day, million measurements are randomly selected. Thus, the training subset consists of million measurements, which are classified based on cloud information from the Visible Infrared Imaging Radiometer Suite (VIIRS) onboard Suomi NPP, which flies in loose formation configuration with Sentinel-5 Precursor (S5P trails behind by 3.5 min). This classification is augmented by additionally flagging distinct deviations relative to a climatology consisting of averages on a 6° × 4° grid for the years 2003–2005 based on the MACC-II flux inversion system and adjusted by an accumulated increase until the time of the measurement based on globally averaged marine NOAA surface data. This identifies scenes obviously not well characterised by the forward model, in particular conspicuously decreased methane abundances in the presence of clouds due to shielding of the underlying atmosphere or in the case of very low surface reflectances.
Figure 7
Cross-validated predictive accuracy of the quality classification random forest obtained by recursive feature elimination. The confusion matrix when using all 25 features is shown on the right-hand side for the test data set, denoting good observations with 0 and measurements to be excluded with 1. The green diagonal cells correspond to correct classifications and the red off-diagonal cells to incorrect classifications. The number of scenes and the percentage of the total number of scenes are given in each cell. Important key parameters are summarised in the grey cells along the edge. The right column shows the percentages of all the elements belonging to each class that are correctly (recall) and incorrectly (false negative rate) classified and the bottom row those that are predicted to belong to each class that are correctly (precision) and incorrectly (false discovery rate) classified. The dark grey cell in the bottom right corner displays the overall accuracy.
[Figure omitted. See PDF]
To train the forest, a set of feature variables is selected by feature ranking with recursive feature elimination and cross-validated selection of the best features. As is common practice, % of the training data is randomly drawn and retained as test data.
A more detailed analysis of the predictive power of the random forest can be obtained from the confusion matrix of the test data set (also shown in Fig. 7). As can be seen, the data set is unbalanced, with barely % belonging to class 0, denoting the good measurements. This is primarily due to the large amount of cloudy scenes in combination with the issue that mainly sun glint and glitter scenes are classified as good over the ocean and inland waters as a consequence of the weak signal ascribed to the low reflectances of these dark surfaces (see Fig. 4). For land scenes the fraction of good observations accordingly increases to about %. The percentage of all good measurements that are incorrectly excluded (false negative rate of class 0) amounts to about % ( % for land scenes). For these cases the filter is too strict, but the quality of the data passing the filter is not compromised. The percentage of all the measurements predicted to be good that are incorrectly classified and should actually be excluded (false discovery rate of class 0) amounts to about %. For these cases the filter appears not stringent enough. However, as the training classification is quite strict, that does not necessarily mean that all these measurements are actually of low quality. The rate can rather be interpreted as an upper bound of potentially remaining challenging retrievals on the verge of sufficient characterisation by the forward model, e.g. observations near cloud edges. The effective diagnostic performance of the quality filter will emerge from the validation.
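The rates discussed above follow directly from the four cells of a two-class confusion matrix. The sketch below computes them for class 0 (good measurements) treated as the class of interest; the function name and the counts used in the usage note are hypothetical and are not the values of the actual test data set.

```python
def class0_rates(true0_pred0, true0_pred1, true1_pred0, true1_pred1):
    """Diagnostic rates for class 0 from a 2x2 confusion matrix.

    true0_pred0: good measurements correctly kept
    true0_pred1: good measurements incorrectly excluded
    true1_pred0: bad measurements incorrectly kept
    true1_pred1: bad measurements correctly excluded
    """
    actual0 = true0_pred0 + true0_pred1
    predicted0 = true0_pred0 + true1_pred0
    total = actual0 + true1_pred0 + true1_pred1
    return {
        "recall": true0_pred0 / actual0,
        "false_negative_rate": true0_pred1 / actual0,
        "precision": true0_pred0 / predicted0,
        "false_discovery_rate": true1_pred0 / predicted0,
        "accuracy": (true0_pred0 + true1_pred1) / total,
    }
```

For example, `class0_rates(80, 20, 10, 890)` describes a filter that discards 20 % of the good scenes (false negative rate) while about 11 % of the scenes it lets through should have been excluded (false discovery rate). Because the classes are unbalanced, the overall accuracy of 97 % in this example is dominated by the large number of correctly excluded scenes.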
Adding further parameters to does not significantly improve the predictive accuracy. It is important to note that the resulting classification is independent of the absolute abundances of the primary retrieval parameters CO and CH₄. The performance of the classification algorithm is demonstrated in Fig. 8, confirming that cloudy scenes are reliably excluded in general and that the quality filter is usually stricter than the VIIRS classification, in particular over the weakly reflecting ocean. Measurements classified as cloudy by VIIRS but still passing the quality filter are rare and not associated with conspicuous methane abundances.
Figure 8
(a, b) Quality-filtered over Europe overlaid on true colour reflectances from the Visible Infrared Imaging Radiometer Suite (VIIRS) taken from the NASA Worldview application for two example days not included in the training data set, demonstrating the performance of the machine-learning classification algorithm. Evidently, cloudy scenes are typically identified and excluded. (c, d) Comparison of the implemented quality filter (QUAL, 1: excluded) with the VIIRS cloud classification (1: cloudy). Matching classifications are shown in white and green. By definition the quality filter is generally stricter than the VIIRS cloud flag and the blue areas are additionally excluded. The rare instances of measurements classified as cloudy by VIIRS but still passing the quality filter are shown in cyan.
[Figure omitted. See PDF]
2.5.3 Shallow learning calibration for methane
The implemented machine-learning-based quality filter described in the previous subsection removes observations not sufficiently characterised by the forward model. Although this procedure typically excludes scenes exhibiting large systematic errors, smaller systematic errors may remain in the residual data set. In particular, there seems to be a systematic albedo dependence of unknown origin of retrieved methane abundances with an underestimation over dark surfaces. As a consequence of the fairly stringent quality requirements for methane, a random forest regressor algorithm was implemented to reduce the remaining systematic methane errors after the retrieval by calibrating against an assumed standard defined below, which is deemed insensitive to surface reflectance variations.
Like the classification algorithm described in the previous subsection, the random forest regressor grows an ensemble of decision trees, training each tree on a different data sample by applying the bootstrap aggregating technique. From the randomly chosen parameters the optimal split maximising the variance reduction in the child nodes is used to split the nodes. To focus on the most prominent features (shallow learning of systematic errors caused by surface albedo variations), the tree growing is limited to 500 leaf nodes. Again, a forest size of trees and is used, where consists of five feature variables, which are in the following order of importance: retrieved apparent albedo, solar zenith angle, cloud parameter , strong absorption radiance, and across-track dimension index.
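The split criterion can be illustrated with a minimal sketch for a single feature. It searches the threshold that maximises the variance reduction between the parent node and the size-weighted child nodes. The function names and the toy data (an apparent albedo and a bias in arbitrary units) are assumptions for illustration and do not reproduce the actual implementation.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(feature, target):
    """Return (threshold, variance_reduction) of the best binary split on one feature."""
    n = len(target)
    parent = variance(target)
    best = (None, 0.0)
    # Candidate thresholds: every distinct feature value except the largest,
    # so that both child nodes are guaranteed to be non-empty.
    for threshold in sorted(set(feature))[:-1]:
        left = [t for f, t in zip(feature, target) if f <= threshold]
        right = [t for f, t in zip(feature, target) if f > threshold]
        weighted = (len(left) * variance(left) + len(right) * variance(right)) / n
        reduction = parent - weighted
        if reduction > best[1]:
            best = (threshold, reduction)
    return best


# Toy data: the target is systematically lower over dark surfaces.
albedo = [0.05, 0.1, 0.3, 0.4]
bias = [-10.0, -9.0, 0.0, 1.0]
print(best_split(albedo, bias))
```

In a full tree this search is repeated over the randomly chosen subset of features at every node until the leaf-node limit is reached.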
To compute the correction for a new measurement after growing the forest with the training data, each decision tree provides a regression according to the input features of the measurement, and the random forest uses the average over all tree regressions as a final calibration value for this observation. In other words, the random forest regressor uses averaging in the bagging procedure to combine the individual tree results (in contrast to voting used in the classification case).
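The difference between the two aggregation rules can be sketched as follows. The "trees" are reduced to hypothetical one-line stumps on the apparent albedo purely for illustration; real trees would be grown from bootstrap samples of the training data.

```python
from collections import Counter


def forest_regress(trees, features):
    """Regression: average the predictions of all trees."""
    predictions = [tree(features) for tree in trees]
    return sum(predictions) / len(predictions)


def forest_classify(trees, features):
    """Classification: majority vote over the class labels of all trees."""
    votes = Counter(tree(features) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stumps standing in for fully grown decision trees.
regression_trees = [
    lambda f: -8.0 if f["albedo"] < 0.10 else 0.0,
    lambda f: -6.0 if f["albedo"] < 0.15 else 1.0,
    lambda f: -7.0 if f["albedo"] < 0.20 else -1.0,
]
classification_trees = [
    lambda f: 0 if f["albedo"] > 0.03 else 1,
    lambda f: 0 if f["albedo"] > 0.04 else 1,
    lambda f: 0 if f["albedo"] > 0.08 else 1,
]

scene = {"albedo": 0.05}
print(forest_regress(regression_trees, scene))       # mean of -8, -6, -7
print(forest_classify(classification_trees, scene))  # two of three trees vote 0
```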
The calibration data set consists of the climatology introduced in the previous subsection evaluated for selected regions spanning a wide range of albedos and solar zenith angles. For individual regions, the climatology is roughly corrected for potential systematic overall biases by adding a single region-specific correction value based on a comparison to nearby sites of the Total Carbon Column Observing Network (TCCON) for the year 2017. In any case, the seasonal and intra-regional spatial variations are solely determined by the climatology. The training regions and corresponding climatology correction values are shown in Fig. .
Figure 9
Regions used to train the machine-learning regressor comprising the Arctic (ARC), the western United States (WUS), central Europe (CEU), Japan (JAP), the Sahara (SAH), the South Atlantic (ATL), and Australia (AUS). The corresponding numbers specify the regional corrections applied to the methane climatology before learning in parts per billion, which are also colour-coded in the borders and backgrounds of the regions (blue for negative and red for positive corrections). The yellow circles highlight the TCCON sites used in the validation.
[Figure omitted. See PDF]
The standard deviation of the resulting correction when considering global yearly averages of gridded data (on a 0.1° × 0.1° grid) amounts to , which is well below the natural variability. The data set is not corrected.
3 Validation
TCCON is a network of ground-based Fourier transform spectrometers recording direct solar spectra in the NIR–SWIR spectral region to retrieve accurate and precise column-averaged abundances of several atmospheric constituents, including and , thus providing a validation resource for satellite data . To ensure comparability, all TCCON sites use similar instrumentation (Bruker IFS 125HR) and a common retrieval algorithm. The TCCON data are tied to the WMO trace gas scale using airborne in situ measurements by applying individual scaling factors for each species. The estimated accuracy () is about for and for .
To compare the satellite data with TCCON quantitatively, it has to be taken into account that the sensitivities of the instruments differ from each other and that individual a priori profiles are used to determine the best estimate of the true atmospheric state, respectively. The first step is to correct for the a priori contribution to the smoothing equation by adjusting the measurements for a common a priori profile . Here we use the TCCON prior as the common a priori profile for all measurements:
7 In this equation, represents the originally retrieved TROPOMI column-averaged dry air mole fraction, is the index of the vertical layer, the corresponding column averaging kernel of the TROPOMI algorithm, and and the TROPOMI and TCCON a priori dry air mole fraction profiles; is the mass of dry air determined from the dry air pressure difference between the upper and lower boundary of layer via with gravitational acceleration and is the total mass of dry air. To minimise the smoothing error introduced by the averaging kernels we do not compare directly with the retrieved TCCON mole fractions but rather with the adjusted expression : 8 Thereby, represents the TCCON a priori column-averaged dry air mole fraction associated with the a priori profile . However, using instead of has only a marginal impact on the validation results presented here because the satellite averaging kernels are close to in the lower atmosphere (see Fig. ), implying .
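The a priori adjustment can be sketched numerically as follows. The sketch assumes the commonly used form of this correction, in which the mass-weighted difference between the common and the original a priori profile is scaled by one minus the column averaging kernel in each layer; whether this matches the notation of Eq. (7) exactly is an assumption, and all function and variable names are hypothetical.

```python
def layer_masses(p_lower, p_upper, g=9.81):
    """Dry-air mass per unit area of each layer (kg m^-2) from the dry-air
    pressure difference (Pa) between its lower and upper boundary."""
    return [(pl - pu) / g for pl, pu in zip(p_lower, p_upper)]


def adjust_to_common_prior(x_ret, ak, prior_sat, prior_common, masses):
    """Adjust a retrieved column-averaged mole fraction to a common a priori.

    Assumed form:
        x_adj = x_ret + (1 / m_total) * sum_l m_l * (1 - A_l) * (xc_l - xs_l)
    with averaging kernel A_l, common prior xc_l, and original prior xs_l.
    """
    m_total = sum(masses)
    correction = sum(
        m * (1.0 - a) * (xc - xs)
        for m, a, xc, xs in zip(masses, ak, prior_common, prior_sat)
    )
    return x_ret + correction / m_total


# Two-layer toy atmosphere with illustrative numbers only.
masses = layer_masses(p_lower=[100000.0, 50000.0], p_upper=[50000.0, 0.0])
x_adj = adjust_to_common_prior(
    x_ret=1850.0,
    ak=[1.0, 0.8],
    prior_sat=[1800.0, 1700.0],
    prior_common=[1810.0, 1720.0],
    masses=masses,
)
print(x_adj)
```

For averaging kernels equal to one in all layers the correction vanishes, which illustrates why the adjustment is small when the kernels are close to unity.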
The validation is performed at the TCCON sites listed in Table (see also Fig. ). For the comparison a set of collocation criteria has to be specified. Ideally, the representativity is maximised by criteria that are as strict as possible while concurrently ensuring sufficient data for a sound and stable comparison. This trade-off is resolved by the following selection. The spatial collocation criterion requires the satellite measurements to lie within a radius of around the TCCON site and the altitude difference to be smaller than . The temporal collocation criterion is set to h. As a consequence of the altitude representativity criterion, there are not enough collocations for a robust comparison at the mountain sites Zugspitze and Izaña .
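A collocation test combining the three criteria could be sketched as follows. The thresholds are deliberately left as function arguments, and the values in the usage example are placeholders rather than the criteria adopted here; the dictionary keys and function names are likewise hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_collocated(sat, ref, max_km, max_alt_diff_m, max_hours):
    """True if a satellite scene and a ground-based measurement satisfy the
    spatial, altitude, and temporal collocation criteria."""
    close = great_circle_km(sat["lat"], sat["lon"], ref["lat"], ref["lon"]) <= max_km
    same_level = abs(sat["alt_m"] - ref["alt_m"]) <= max_alt_diff_m
    same_time = abs(sat["time_h"] - ref["time_h"]) <= max_hours
    return close and same_level and same_time


# Placeholder scene, site, and thresholds for illustration only.
scene = {"lat": 0.0, "lon": 0.5, "alt_m": 100.0, "time_h": 12.0}
site = {"lat": 0.0, "lon": 0.0, "alt_m": 150.0, "time_h": 11.5}
print(is_collocated(scene, site, max_km=100.0, max_alt_diff_m=250.0, max_hours=2.0))
```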
Table 3
TCCON sites used in the validation ordered according to latitude from north to south.
Station | Latitude | Longitude | Altitude | Reference |
---|---|---|---|---|
() | () | () | ||
Eureka | ||||
Ny-Ålesund | ||||
Sodankylä | ||||
East Trout Lake | ||||
Białystok | , | |||
Karlsruhe | ||||
Orléans | ||||
Garmisch | ||||
Park Falls | , | |||
Lamont | ||||
Tsukuba | ||||
Edwards | ||||
JPL | ||||
Caltech | ||||
Saga | ||||
Burgos | , | |||
Ascension Island | ||||
Darwin | , | |||
Réunion | ||||
Wollongong | ||||
Lauder | , |
The validation results are summarised in Figs. and , including the mean bias and the scatter relative to TCCON for each site. The parameter is estimated from Huber's Proposal-2 M-estimator , which is a well-established estimator of location and scale that is robust against outliers of a normal distribution. This is an appropriate choice and preferred over the standard deviation because one is interested in the actual single-measurement precision without distortion of the results by a few outliers, which are rather attributed to systematic errors, e.g. due to residual clouds. As a consequence, outliers are fully included in the computation of the systematic error but get lower weight in the robust determination of the random error, which is interpreted as a measure of the repeatability of measurements.
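For illustration, Huber's Proposal-2 estimator can be computed by a simple fixed-point iteration in which the data are clipped at a multiple of the current scale around the current location. The sketch below is modelled on common textbook implementations; the tuning constant k = 1.5, the starting values (median and scaled median absolute deviation), and the convergence settings are assumptions and not necessarily those used for the results presented here.

```python
import math
import statistics


def huber_proposal2(data, k=1.5, tol=1e-6, max_iter=200):
    """Joint robust estimate of location and scale (Huber's Proposal 2)."""
    n = len(data)
    mu = statistics.median(data)
    scale = 1.4826 * statistics.median([abs(x - mu) for x in data])
    if scale == 0.0:
        raise ValueError("initial scale estimate is zero")
    # Consistency factor: expectation of the squared clipped standard normal,
    # so that the scale equals the standard deviation for Gaussian data.
    pdf_k = math.exp(-0.5 * k * k) / math.sqrt(2.0 * math.pi)
    theta = math.erf(k / math.sqrt(2.0))  # equals 2 * Phi(k) - 1
    beta = theta + k * k * (1.0 - theta) - 2.0 * k * pdf_k
    for _ in range(max_iter):
        lo, hi = mu - k * scale, mu + k * scale
        clipped = [min(max(x, lo), hi) for x in data]
        mu_new = sum(clipped) / n
        ss = sum((c - mu_new) ** 2 for c in clipped) / (n - 1)
        scale_new = math.sqrt(ss / beta)
        converged = abs(mu - mu_new) < tol * scale and abs(scale - scale_new) < tol * scale
        mu, scale = mu_new, scale_new
        if converged:
            break
    return mu, scale


# Toy differences to a reference with one gross outlier.
differences = [-2.0, -1.0, 0.0, 1.0, 2.0, 50.0]
location, scatter = huber_proposal2(differences)
print(location, scatter, statistics.stdev(differences))
```

In this toy example the single outlier inflates the standard deviation considerably, whereas the robust scale remains close to the spread of the bulk of the data.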
Figure 10
Comparison of the TROPOMI/WFMD v1.2 time series (green) with ground-based measurements from the TCCON (red). For each site, is the number of collocations, corresponds to the mean bias, and to the scatter of the satellite data relative to TCCON in parts per billion; is estimated from Huber's Proposal-2 M-estimator. The global offset is defined as the mean of the local offsets at the individual sites, the random error is the global scatter of the differences to TCCON after subtraction of the respective regional biases, and the systematic error is the standard deviation of the at the individual sites.
[Figure omitted. See PDF]
Figure 11
As Fig. but for .
[Figure omitted. See PDF]
It is also checked whether the respective site biases are sensitive to the selection of the spatial collocation radius, which is an indication of sources within the satellite collocation area with only a marginal influence on the TCCON measurements themselves. A considerable sensitivity was found for at Edwards. The collocation region intersects oil production areas in California's Central Valley (in contrast to Caltech and JPL; see also the results in Sect. and Fig. ) and the South Coast Air Basin (SoCAB), which has a well-known methane enhancement . As such nearby sources limit the representativity of affected satellite measurements, the collocation radius is reduced to for Edwards.
The altitude representativity criterion separates the well-isolated air masses of the SoCAB, where Caltech and JPL are located, from the Mojave desert with the Edwards site to the north. Hence, different air masses are analysed in the validation at Caltech/JPL and Edwards, although the corresponding collocation circles overlap. This also explains the insensitivity to the spatial collocation radius at Caltech/JPL and why no additional constraints on the coincidence criteria are necessary for these sites to ensure representativity. As Caltech and JPL are both exposed to SoCAB air masses, the permissible altitude collocation tolerance of Caltech is equally assumed for JPL despite slightly differing surface elevation.
The results for the individual sites are condensed to the following parameters for the overall quality assessment of the satellite data: the global offset is defined as the mean of the local offsets at the individual sites, the random error is the global scatter (analogously estimated to the single-site case) of the differences to TCCON after subtraction of the respective regional biases, and the systematic error is the standard deviation of the local offsets relative to TCCON at the individual sites as a measure of the station-to-station biases. For the global offset amounts to , the random error is estimated to be ( when using the standard deviation instead of Huber's Proposal-2 M-estimator), and the systematic error is , which is on the order of the estimated (station-to-station) accuracy of the TCCON of about . For the global offset aggregates to , the random error is ( when using the standard deviation), and the systematic error is given by , which is again similar to the TCCON accuracy of about .
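The three summary parameters can be sketched as follows. For brevity the random error is computed here with the ordinary standard deviation, corresponding to the values given in parentheses above, rather than with the robust estimator; the function name, site labels, and numbers are hypothetical.

```python
import statistics


def summarise_validation(differences_by_site):
    """differences_by_site maps a site name to a list of
    (satellite minus reference) differences at that site."""
    site_bias = {site: statistics.mean(d) for site, d in differences_by_site.items()}
    biases = list(site_bias.values())
    global_offset = statistics.mean(biases)       # mean of the local offsets
    systematic_error = statistics.stdev(biases)   # station-to-station scatter
    # Remove the respective site bias before pooling all differences.
    residuals = [x - site_bias[site] for site, d in differences_by_site.items() for x in d]
    random_error = statistics.stdev(residuals)
    return global_offset, random_error, systematic_error


# Toy input with two sites (illustrative numbers only).
toy = {"A": [1.0, 3.0], "B": [5.0, 7.0]}
print(summarise_validation(toy))
```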
Figure 12
Comparison of the TROPOMI/WFMD data to the TCCON based on daily means. Specified are the linear regression results and the correlation of the data sets, as well as the mean and standard deviation of the difference. To analyse the impact of outliers, the regression is also performed for the Huber linear regression model, which is robust to outliers.
[Figure omitted. See PDF]
To further analyse how well the real temporal and spatial variations are captured by the TROPOMI data, Fig. shows a comparison to TCCON based on daily means for days with more than three collocations. The obvious linear relationship with a high correlation for both gases ( for and for ) underlines the typically good agreement of the satellite and validation data. The linear regression yields a fit close to the line for both gases.
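The daily-mean comparison can be sketched in two steps: averaging the collocated pairs per day while keeping only days with more than three collocations, followed by an ordinary least-squares fit with the Pearson correlation. The record layout and function names are assumptions, and the choice of the ground-based data as the independent variable is made here for illustration only.

```python
import math
from collections import defaultdict


def daily_means(records, more_than=3):
    """records: iterable of (day, satellite_value, reference_value).
    Returns (reference_mean, satellite_mean) pairs for days with more
    than `more_than` collocations."""
    per_day = defaultdict(list)
    for day, sat, ref in records:
        per_day[day].append((sat, ref))
    pairs = []
    for day in sorted(per_day):
        values = per_day[day]
        if len(values) > more_than:
            n = len(values)
            pairs.append((sum(r for _, r in values) / n, sum(s for s, _ in values) / n))
    return pairs


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept plus Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / math.sqrt(sxx * syy)


# Toy records: day 1 has four collocations, day 2 only three and is dropped.
records = [(1, 10.0, 11.0)] * 4 + [(2, 20.0, 21.0)] * 3
print(daily_means(records))
print(linear_fit([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```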
In the case of , there are a few outliers for which the satellite values are considerably lower than the TCCON values. These occasional instances are not site-specific and can probably be ascribed to days with residual or partial cloud cover interfering with the satellite retrievals. Outliers with higher values compared to TCCON are rarer and dominated by a handful of collocations at East Trout Lake. This exceptional lack of agreement occurs on four days in the time period 10–21 February as well as on 29 March and may be attributable to Arctic polar vortex air above East Trout Lake potentially causing the following related issues: associated fronts of different air masses may complicate the identification of collocations near the vortex edge, and/or the stratospheric part of the methane profile may be largely affected by the polar vortex, leading to a considerable deviation from the assumed a priori profile shapes . It is verified that the impact of outliers on the regression is marginal by repeating the fit with the Huber linear regression model , which is robust to outliers and provides similar results to the standard linear regression here.
In summary, the natural and variations are well-captured by the satellite data. We find a single-measurement precision of the TROPOMI data of about % for and % for , while the station-to-station accuracy of the satellite data is comparable to the TCCON.
4 Initial results using real TROPOMI data
In this section we present the first results from the mission start until the end of 2018. For temporally averaged data we grid the data on a 0.1° × 0.1° grid instead of showing swath data, which are used for daily data and single satellite overpass detection. Before analysing the data more regionally, we want to provide a global overview.
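A simple way to obtain such temporally averaged maps is to assign each measurement to the grid cell containing its scene centre and to average all values per cell. This is only one possible gridding scheme, shown as a minimal sketch with hypothetical names and values; the actual procedure may, for example, weight by the overlap of the scene footprint with each cell.

```python
import math
from collections import defaultdict


def grid_average(samples, cell_deg=0.1):
    """samples: iterable of (lat, lon, value).
    Returns {(lat_index, lon_index): mean value} on a regular grid,
    with indices counted from 90 degrees south and 180 degrees west."""
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in samples:
        key = (math.floor((lat + 90.0) / cell_deg), math.floor((lon + 180.0) / cell_deg))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Two scenes fall into the same cell, the third into a different one.
samples = [(52.34, 7.21, 1800.0), (52.36, 7.25, 1820.0), (10.05, 20.05, 1900.0)]
for cell, mean in grid_average(samples).items():
    print(cell, mean)
```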
The global distribution of retrieved and for the year 2018 is shown in Figs. and , respectively. Clearly visible is the interhemispheric gradient with larger values on the Northern Hemisphere, where the majority of sources is located, for both data sets superimposed by enhancements over prominent source regions like anthropogenic emissions in China, India, and Southeast Asia.
Other visible source regions include human-initiated biomass burning in Africa and South America for land clearing and land use change, as well as wildfire emissions in North America, which were exceptionally pronounced in 2018. The anthropogenic emissions of congested urban areas like Mexico City and Tehran are already unambiguously detected on such a global map without zooming in.
Figure 13
Global yearly average of TROPOMI/WFMD for 2018.
[Figure omitted. See PDF]
Figure 14
Global yearly average of TROPOMI/WFMD for 2018.
[Figure omitted. See PDF]
In the case of additional visible source regions apart from the anthropogenic sources in Asia, like fossil fuels or rice cultivation, include tropical wetlands and anthropogenic emissions from California and the Padan Plain in Italy. There is also a distinct signal from Etosha National Park in the north of Namibia containing significant areas of wetland like the Etosha pan, an endorheic salt pan that exhibits intermittent shallow inundation.
4.1 Comparison to operational products
The operational TROPOMI product is retrieved using the Shortwave Infrared Retrieval (SICOR) algorithm , and the operational product is based on RemoTeC , which is a physics-based approach originally developed for and retrievals from OCO and GOSAT. Although the operational algorithms and the scientific algorithm presented here use similar spectral bands, there are many differences concerning the details of each approach. For example, TROPOMI/WFMD is a weighted least-squares approach, whereas SICOR and RemoTeC are based on a Phillips–Tikhonov regularisation scheme. There are also differences in the radiative transfer model, the quality filter, the spectroscopy used, and the state vector elements, in particular in the treatment of aerosols and clouds.
While WFMD and the operational algorithm are mainly applicable to cloud-free scenes, the operational algorithm is designed to also handle cloudy observations under specific conditions. Both methane algorithms include a post-processing correction to reduce the systematic albedo dependence. However, the details of this correction are again quite different: while the correction for the operational algorithm is based on linear regression relative to GOSAT retrievals, which are in turn bias-corrected against TCCON, the scientific WFMD algorithm uses a random forest regressor relative to a climatology as described in Sect. .
The comparisons are performed on a monthly basis with the latest version (V01.02.02) of the operational products. Figure shows the corresponding results for December 2018. The comparison of the global distribution of all quality-filtered data for the respective algorithm illustrates that the spatial patterns are very similar for both algorithms. The operational algorithm exhibits better coverage as it can handle a larger amount of cloudiness. For common scenes passing the quality filters of both algorithms the data sets are highly correlated, with a correlation coefficient of , and the regression slope is also close to the line, confirming the good agreement. The mean bias between the two data sets is about %, and the standard deviation of the difference is comparable to the noise level. For the occasionally very high CO abundances the results of the operational algorithm are a few percent larger than for WFMD, which is also reflected in a regression slope somewhat smaller than .
Figure 15
Comparison of TROPOMI/WFMD with the operational TROPOMI data for December 2018. Panel (a) depicts the global distribution of all quality-filtered data for the respective algorithm. Panel (b) shows a bivariate histogram of all common scenes passing the quality filters of both algorithms, summarising the linear regression results and the correlation of the data sets, as well as the mean and standard deviation of the difference. The number of points per bin is shown as a decadic logarithm .
[Figure omitted. See PDF]
The corresponding comparison of the results is shown in Figs. and for December and June 2018, respectively. WFMD exhibits somewhat better coverage and also includes some retrievals over the ocean in contrast to the operational algorithm. Although the prominent features, like the interhemispheric gradient and source regions in Asia, are similar and the correlation coefficients are close to , the differences down to the last detail are more pronounced than for . This is reflected in both the global maps and the scatter plots for common scenes. The global offset between the two data sets amounts to a few parts per billion, and the standard deviation of the difference is again comparable to the noise level.
In December 2018 the methane abundances over the Bohai Economic Rim, including the cities of Beijing and Tianjin, are larger than in southern China to the west of the Pearl River Delta for WFMD, whereas it is the other way round for the operational product, but this may be due to the different sampling. The distribution over the Sahara is more uniform for WFMD, and the corresponding patterns of the operational product seem to vaguely resemble some albedo features, with higher values over brighter parts of the Sahara. There is an obvious clustering of the common measurements around the line, even for the largest values. Nevertheless, the linear regression line is somewhat distorted from the line due to a slight shift of the two dominating densely populated sub-clusters.
Figure 16
As Fig. but for methane.
[Figure omitted. See PDF]
Figure 17
As Fig. but for June 2018.
[Figure omitted. See PDF]
In June 2018 there is a sharper gradient in the operational product when transitioning from the temperate into the low albedo boreal zone, and the values over the boreal ecosystem are lower than for WFMD. In addition, there are enhanced methane abundances in the operational product over the Canadian province Nunavut in contrast to WFMD. Some occasionally high values in the WFMD methane data over South America possibly attributable to surface roughness contribute to the rare outliers in the comparison scatter plot, which exhibits a clear linear relation close to the line apart from that.
Overall, we find good agreement of our scientific product with the operational product based on the presented concise comparison. For we find good agreement of the prominent features with some interesting differences in detail, including potential indications of residual albedo issues in the operational product. Further analysis and understanding of the differences are expected to advance greenhouse gas retrievals from wide-swath imaging satellites like TROPOMI under challenging conditions such as scenes with low surface reflectance or residual cloudiness.
4.2 Detection of emission sources
4.2.1 Carbon monoxide
Intense emissions from agglomeration areas, cities, and industrial facilities are clearly detected by TROPOMI. This is demonstrated using the example of China, India, and Southeast Asia in Fig. . The 2-month average was chosen to get an overview of the complete region. Typically, larger emissions can even be detected in a single satellite overpass. The tracked facilities mainly belong to the Chinese and Indian iron and steel industry.
Figure 18
Carbon monoxide distribution for November and December 2018 over China, India, and Southeast Asia highlighting emissions from congested urban areas and industrial facilities. Panel (a) shows the TROPOMI/WFMD product and panel (b) the operational product.
[Figure omitted. See PDF]
For comparison, Fig. also shows the operational product in addition to the TROPOMI/WFMD results. As the operational product is available as total columns, the corresponding mole fractions were generated in the same way as for the scientific product by division of the total columns by the dry air columns obtained from the ECMWF. The comparison demonstrates that the enhancements due to the analysed emission sources can be typically identified in both data sets. However, as a consequence of the different spatiotemporal sampling, the enhancement over some point sources is somewhat more pronounced in the WFMD product. A possible reason for this is the additional utilisation of cloudy observations in the operational SICOR product, which may be associated with reduced surface sensitivity under certain conditions reflected in the averaging kernels of the corresponding measurements.
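The conversion from total columns to column-averaged dry air mole fractions amounts to a simple division, as in the following minimal sketch. The function name and the column values are illustrative assumptions; both columns must be given in the same units (here molecules per square centimetre).

```python
def column_to_mole_fraction_ppb(total_column, dry_air_column):
    """Column-averaged dry air mole fraction in parts per billion from a
    trace gas total column and the dry air column in identical units."""
    return total_column / dry_air_column * 1.0e9


# Illustrative magnitudes only.
print(column_to_mole_fraction_ppb(total_column=2.0e18, dry_air_column=2.1e25))
```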
In steelmaking is formed during two processes. Firstly, it is an essential constituent of blast furnace gas, which emerges when iron ore is reduced with coke to metallic pig iron. As the resulting pig iron has a relatively high carbon content, further processing is necessary to harden the metal. Therefore, the carbon-rich molten pig iron is converted to steel by lowering its carbon content via oxidation in the oxygen converter process (Linz–Donawitz steelmaking). The resulting converter gas predominantly consists of ( %) .
The detected factories include the following steel plants: Baotou Iron & Steel Group (Baogang), Wuhan Iron & Steel Co. (WISCO), Panzhihua Iron & Steel Group (Pangang), Fujian Sangang Group (Sangang), Liuzhou Iron & Steel Co. (Liu Steel), Jiuquan Iron & Steel Co. (JISCO), Xining Special Steel, Shaanxi Hanzhong Iron & Steel Co., Tonghua Iron & Steel Group (Tonggang), Steel Authority of India Limited (SAIL), and TATA Steel. Further emitters include other metal processing industries such as the National Aluminium Company (NALCO), the Tang Loong Industrial Park in Vietnam, and the cement production plants of the Arco Group and Khyber Industries in the Kashmir Valley, where accumulates between the mountain ranges.
emissions from the steel industry can also be observed in other regions of the world, for example in Turkey. Figure shows that two of the largest steel plants in the country are detected in a single satellite overpass. The plants are operated by the Turkish steel producers Ereğli Demir ve Çelik Fabrikaları (Erdemir) and Karabük Demir Çelik Fabrikaları (Kardemir), with yearly production capacities of about million tonnes of crude steel each. It can also be seen that the product exhibits striping in flight direction for single overpasses, similar to the operational product .
Figure 19
Carbon monoxide enhancement due to emissions from steel plants in Turkey.
[Figure omitted. See PDF]
There are also examples of detected emissions from steel works in Europe. Figure illustrates such a case and shows that emissions from the largest steel plant in Poland, operated by ArcelorMittal in the industrial city Dąbrowa Górnicza in the Upper Silesian metropolitan area, are detected in a single overpass. As can be seen, the corresponding pronounced plume coincides with the boundary layer wind direction, and the striping is observable as well.
Figure 20
Carbon monoxide enhancement due to emissions from the ArcelorMittal steel plant in Dąbrowa Górnicza in the Upper Silesian metropolitan area in Poland. Also shown is the mean wind in the boundary layer obtained from ECMWF data.
[Figure omitted. See PDF]
Another prominent source of is fire. In September 2018 a peat bog in the military training area WTD 91 in the Emsland region was accidentally set on fire by the German army and burnt for several weeks. The corresponding plume is clearly detected and aligns well with the wind direction (Fig. ). The scenes right above the origin of the fire are automatically excluded by the quality filter because of the strong formation of smoke potentially shielding the subjacent partial columns, similar to thick clouds.
Figure 21
Carbon monoxide enhancement due to the peat fire (red circle) in the Emsland region in Germany. Dotted scenes are excluded by the quality filter. Also shown is the boundary layer wind from the ECMWF. The bottom panel shows the corresponding true colour reflectances from the Visible Infrared Imaging Radiometer Suite (VIIRS).
[Figure omitted. See PDF]
The difference between smoke and clouds is the particle size distribution. While clouds consist of water droplets with an effective radius of about , the mass distribution of smoke plumes shows a prominent peak at about but is nevertheless dominated by a small number of supermicron-sized particles . The submicron particles reduce visibility and lead to an extended smoke plume over large distances in the true colour reflectances from VIIRS shown in Fig. . However, these small particles are not a major issue for the satellite measurements taken at . The satellite retrievals near the origin of the fire are rather affected by the large supermicron-sized particles, which become more and more negligible when departing from the source of the fire due to their rapid fallout. This is the reason that at a sufficient distance from the fire the corresponding measurements pass the quality filter despite efficient scattering in the visible spectral range manifesting in an extensive plume in the VIIRS image. On the other hand, even very small clouds, which are barely visible in the VIIRS image at this resolution, are rigorously filtered out. This indicates that the algorithm implicitly distinguishes between smoke and clouds according to their particle sizes and that a reliable retrieval is possible in smoke plumes in the far field of the fire origin. A thorough discussion of the sensitivity of measurements in conjunction with smoke from fires can be found in the revised version of .
The total column enhancement relative to background values allows us to roughly estimate the emitted mass flux of from the mean boundary layer wind speed and the plume width perpendicular to the wind direction only using measurements passing the quality filter: 9 The plume width is on the same order of magnitude as the instrument's spatial resolution, and the enhancement is thus calculated for the plume scene passing the quality filter which is nearest to the fire origin. As the wind direction is approximately perpendicular to one of the scene diagonals, the corresponding plume width is estimated by , assuming a quadratic scene with a side length of about . With , , and , the emission on 18 September amounts to about . According to , a : emission factor of % for boreal peat fires is assumed, implying an associated emission of approximately on that day. Compared to the German yearly total budget of about , the emissions from the Emsland peat fire are small even if one assumes that the fire burnt for several weeks at this strength.
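The estimate can be reproduced schematically as the product of column enhancement, wind speed, and plume width, with the plume width taken as the diagonal of a square scene. All numerical values below are placeholders and not the values used in the text; the function names are hypothetical, and the emission ratio is treated as a plain ratio in the same units as the emission, which is an assumption.

```python
import math


def plume_width_from_diagonal(side_length_m):
    """Diagonal of a square scene, used as plume width when the wind is
    approximately perpendicular to one of the scene diagonals."""
    return math.sqrt(2.0) * side_length_m


def mass_flux_kg_per_s(enhancement_kg_m2, wind_speed_m_s, plume_width_m):
    """Emitted mass flux from the column enhancement (mass per area),
    the mean boundary layer wind speed, and the plume width."""
    return enhancement_kg_m2 * wind_speed_m_s * plume_width_m


def co_emitted_from_ratio(emission, ratio_to_co_emitted):
    """Scale an emission by a ratio relative to a co-emitted species."""
    return emission / ratio_to_co_emitted


# Placeholder inputs for illustration only.
width = plume_width_from_diagonal(side_length_m=7000.0)
flux = mass_flux_kg_per_s(enhancement_kg_m2=1.0e-3, wind_speed_m_s=5.0, plume_width_m=width)
per_day = flux * 86400.0
print(width, flux, per_day, co_emitted_from_ratio(per_day, ratio_to_co_emitted=0.1))
```

Because the plume width is comparable to the size of a single scene, such an estimate is necessarily rough and mainly indicates the order of magnitude.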
4.2.2 Methane
Emissions from the energy sector are an integral component of anthropogenic methane sources. As an example of methane leakage from natural gas production, Fig. shows that the emissions of the world's second-largest natural gas field, Galkynysh in Turkmenistan, which is operated by Türkmengaz, can be clearly detected in a single satellite overpass. Also visible are enhancements over the productive South Caspian oil and gas basins, the oil and gas infrastructure at the Turkmen coast of the Caspian Sea, and smaller oil and gas fields south of Galkynysh.
Figure 22
Methane enhancement due to emissions from the world's second-largest natural gas field, Galkynysh in Turkmenistan.
[Figure omitted. See PDF]
Figure 23
Methane enhancement due to anthropogenic emissions from the Californian Central Valley. In contrast to the Caltech (and JPL) TCCON site (shown in cyan), the source region is intersected by the collocation radius standardly used in the validation for the Edwards site (pink). Therefore, the collocation radius is reduced to for Edwards. Due to the altitude representativity criterion, different air masses are analysed in the validation at Caltech/JPL and Edwards, although the corresponding collocation circles overlap. For a discussion of the collocation criteria at Edwards and Caltech/JPL, see also Sect. .
[Figure omitted. See PDF]
Emissions from oil and gas production are important to monitor because methane leaks offset the climate change benefits of natural gas or oil over coal if the leakage exceeds a certain threshold . There are several studies suggesting that the oil and gas industry leaks more methane than assumed in inventories, at least locally or temporally , and the potential heterogeneity among the sector complicates the specification of typical emission rates.
Table 4
Achieved product quality compared to the mission requirements extracted from .
Data product | Vertical resolution | Bias (%) | Random (%) | ||
---|---|---|---|---|---|
Required | Achieved | Required | Achieved | ||
Carbon monoxide () | Total column | ||||
Methane () | Total column |
Figure 24
Methane distribution over the Upper Silesian Coal Basin in Poland. Individual mines are highlighted by red hexagons.
[Figure omitted. See PDF]
Another source region is the Central Valley in California, with combined anthropogenic emissions from oil fields and agriculture (see Fig. ). While one main area of oil production is located in Kern County around Bakersfield, the dairy and cattle industry extends more or less over the whole valley, with the largest livestock density in the counties of San Joaquin, Stanislaus, Merced, Kings, and Tulare . A reliable disentanglement of the emissions from the oil and agriculture sectors requires exact knowledge of the meteorology and unmistaken prior knowledge of the distribution of the different source types or methane isotopologue information, which is not yet available from satellite observations. As already mentioned in Sect. , the collocation radius standardly used in the validation intersects the Kern County source region for the Edwards TCCON site. As a consequence, the collocation radius is reduced to for Edwards to ensure the representativity of the satellite measurements used in the validation.
The two hitherto presented methane source regions of Turkmenistan and the Central Valley in California, which are both detected in a single TROPOMI overpass, were already identified in yearly averages of SCIAMACHY data .
An additional source of methane from the energy sector is coal mining. The physical process of coal extraction directly releases methane that was previously trapped within the coal bed as gas adsorbed on coal grains. For safety reasons, the coal mine methane is diluted with air to concentrations below the explosive range and released through ventilation shafts to the surface.
Poland's primary energy consumption and electrical power generation rely strongly on coal, which has helped the country to achieve one of the lowest energy import dependencies in the European Union, as measured by the share of net imports in gross inland energy consumption (the sum of energy produced and net imports). The energy dependence rate of Poland was about % in 2016 compared to an EU-wide average of %, meaning that the majority of the EU's energy needs are met by net imports. Only three EU countries have a lower energy dependence rate than Poland, namely Romania, Denmark, and Estonia.
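The definition of the energy dependence rate given above can be written out as a short calculation. This is a minimal sketch following that definition; the function name and the input numbers are hypothetical and are not the statistics quoted in the text.

```python
def energy_dependence_rate(production, net_imports):
    """Share of net imports in gross inland energy consumption, in percent.

    Gross inland consumption is taken as production plus net imports,
    both given in the same energy unit.
    """
    gross_inland_consumption = production + net_imports
    if gross_inland_consumption <= 0:
        raise ValueError("gross inland consumption must be positive")
    return 100.0 * net_imports / gross_inland_consumption

# Hypothetical quantities in arbitrary but identical energy units.
rate = energy_dependence_rate(production=60.0, net_imports=40.0)
print(f"{rate:.1f} %")
```

A rate above 50 % means that net imports cover more than half of the gross inland consumption.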
Poland is the largest coal-mining country in Europe and among the top 10 coal producers in the world , with huge reserves of hard coal and lignite . The major coal basin is the Upper Silesian Coal Basin (USCB), which is larger than and hosts % of the anticipated domestic hard coal resources. All operating hard coal mines in the country are situated in the USCB except the Bogdanka Mine in the Lublin Coal Basin in the east of Poland. The USCB is shown in Fig. , highlighting individual mines. The corresponding methane plume is only faintly discernible and is aligned with the wind direction. This is an example of emissions that are evidently close to the detection limit for a single overpass.
5 Conclusions
We have introduced a scientific algorithm to retrieve carbon monoxide and methane simultaneously from shortwave infrared spectra recorded by the TROPOMI instrument onboard the Sentinel-5 Precursor satellite. The error analysis based on synthetic data and the successful validation with independent reference data from the TCCON have demonstrated that the algorithm is suitable to retrieve carbon monoxide and methane from real TROPOMI data well within the mission requirements after quality filtering (see Table ). The corresponding quality filter is based on a machine-learning approach utilising a random forest classifier. Cloud data from VIIRS onboard Suomi NPP were only used in the preceding supervised learning process; once training is complete, they are no longer needed to predict the quality of individual, previously unseen measurements. The quality filter is therefore independent of the continuous availability of external cloud information. The performance of the retrieval algorithm is expected to improve further in the future, for example with respect to striping in flight direction for single overpasses, due to a refined calibration of the TROPOMI instrument and/or dedicated algorithm advancements.
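The separation between training with external cloud information and prediction without it can be sketched as follows. This is not the classifier used for the presented products: for brevity it uses bagged decision stumps with random feature subsets as a deliberately simplified stand-in for a random forest, and all feature names, thresholds, and numbers are hypothetical.

```python
import random

def train_stump(rows, labels, features):
    """Pick the single-feature threshold with the fewest training errors."""
    best = None  # (errors, feature, threshold, label at or below threshold)
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            for below in (0, 1):
                errors = sum(
                    (below if row[f] <= threshold else 1 - below) != label
                    for row, label in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, below)
    return best[1:]

def train_ensemble(rows, labels, n_estimators=25, seed=0):
    """Train stumps on bootstrap samples, each seeing a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    stumps = []
    for _ in range(n_estimators):
        sample = [rng.randrange(len(rows)) for _ in rows]
        features = rng.sample(range(n_features), max(1, n_features // 2))
        stumps.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample], features))
    return stumps

def predict_quality(stumps, row):
    """Majority vote of all stumps: 1 = good quality, 0 = bad quality."""
    votes = sum(below if row[f] <= t else 1 - below for f, t, below in stumps)
    return 1 if 2 * votes > len(stumps) else 0

# Hypothetical training set: (fit_residual, albedo_contrast, external_cloud_fraction).
training = [
    (0.05, 0.10, 0.00), (0.10, 0.05, 0.02), (0.20, 0.15, 0.01),
    (0.15, 0.25, 0.03), (0.25, 0.20, 0.00),
    (0.70, 0.80, 0.60), (0.80, 0.65, 0.90), (0.90, 0.75, 0.75),
    (0.65, 0.95, 0.50), (0.85, 0.90, 1.00),
]
features = [row[:2] for row in training]
# The external cloud fraction is used only to derive the training labels.
labels = [1 if row[2] < 0.05 else 0 for row in training]

model = train_ensemble(features, labels)

# Predictions for unseen soundings need retrieval-derived features only.
clear_like = predict_quality(model, (0.02, 0.03))
cloudy_like = predict_quality(model, (0.95, 0.99))
print(clear_like, cloudy_like)
```

The external cloud fraction appears only when the labels are built; `predict_quality` never receives it, which mirrors the independence from external cloud data described above.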
The good global agreement of our scientific products with the operational products for the analysed example cases further underlines the quality of the presented algorithm. The differences in detail for can be regarded as a stimulus for further analysis. Understanding these differences will likely allow both retrieval algorithms to be advanced together under challenging conditions, such as scenes with low surface reflectance or residual cloudiness. Moreover, the scientific and operational products are well suited to be used together with other products in an ensemble approach, which benefits from the different ways in which the individual retrieval algorithms treat particular physical aspects.
Nevertheless, the results of the presented scientific algorithm are also valuable in their own right, as TROPOMI enables the determination of carbon monoxide and methane with an unprecedented level of detail on a global scale, introducing new areas of application. It was shown that emissions from urban agglomerations, industrial facilities (in particular from the steel industry), and fires are readily detected, often even in a single satellite overpass. The same is true for emissions from the energy sector, including leakage from oil and gas production and coal bed methane from coal mining. Future quantitative substantiation of these primarily qualitative findings will potentially enable emission monitoring and air quality assessments, ideally on a daily basis. Furthermore, improved knowledge of the methane cycle, which is essential for better prediction of future climate, can be derived by combining inverse modelling with a comprehensive monitoring system comprising complementary information from accurate ground-based in situ measurements and satellite observations with a unique combination of high precision, spatiotemporal resolution, and global coverage.
Data availability
The carbon monoxide and methane data sets presented in this publication can be accessed via
Author contributions
OS designed and operated the TROPOMI/WFMD satellite retrievals, performed the data analysis, interpreted the results, and wrote the paper. MB, MR, HB, and JPB provided significant conceptual input to the design of the TROPOMI/WFMD satellite retrievals, the interpretation, and the improvement of the paper. TB and JL designed and executed the operational TROPOMI CO and CH4 satellite retrievals and supported the interpretation of the results. NMD, DGF, DWTG, FH, CH, LTI, RK, IM, JN, CP, DFP, SR, KeS, KiS, RS, VAV, TW, and DW operated the TCCON retrievals for the various sites and supported the interpretation of the results. All authors discussed the results and commented on the paper.
Competing interests
The authors declare that they have no conflict of interest.
Special issue statement
This article is part of the special issue “TROPOMI on Sentinel-5 Precursor: first year in operation (AMT/ACP inter-journal SI)”. It is not associated with a conference.
Acknowledgements
This publication contains modified Copernicus Sentinel data (2017, 2018). Sentinel-5 Precursor is an ESA mission implemented on behalf of the European Commission. The TROPOMI payload is a joint development by the ESA and the Netherlands Space Office (NSO). The Sentinel-5 Precursor ground-segment development has been funded by the ESA and with national contributions from the Netherlands, Germany, and Belgium. The research leading to the presented results has in part been funded by the ESA projects GHG-CCI, GHG-CCI+, and S5L2PP, the Federal Ministry of Education and Research project AIRSPACE, and by the State and the University of Bremen.
TCCON data were obtained from the TCCON Data Archive, hosted by CaltechDATA, California Institute of Technology.
We acknowledge the use of VIIRS imagery from the NASA Worldview application.
Financial support
The research leading to the presented results has in part been funded by the ESA projects GHG-CCI, GHG-CCI+, and S5L2PP, the Federal Ministry of Education and Research project AIRSPACE, and by the State and the University of Bremen. The article processing charges for this open-access publication were covered by the University of Bremen.
Review statement
This paper was edited by Helen Worden and reviewed by two anonymous referees.
© 2019. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Carbon monoxide (
The TROPOspheric Monitoring Instrument (TROPOMI) onboard the Sentinel-5 Precursor satellite, which was successfully launched in October 2017, is a spaceborne nadir-viewing imaging spectrometer measuring solar radiation reflected by the Earth in a push-broom configuration. It has a wide swath on the terrestrial surface and covers wavelength bands between the ultraviolet (UV) and the shortwave infrared (SWIR), combining a high spatial resolution with daily global coverage. These characteristics enable the determination of both gases with an unprecedented level of detail on a global scale, introducing new areas of application.
Abundances of the atmospheric column-averaged dry air mole fractions
We also present selected results from the mission start until the end of 2018, including a first comparison to the operational products and examples of the detection of emission sources in a single satellite overpass, such as
1 Institute of Environmental Physics (IUP), University of Bremen FB1, Bremen, Germany
2 SRON Netherlands Institute for Space Research, Earth Science Group (ESG), Utrecht, the Netherlands
3 Centre for Atmospheric Chemistry, School of Earth, Atmosphere and Life Sciences, University of Wollongong, Wollongong, Australia
4 Ludwig-Maximilians-Universität München, Lehrstuhl für Physik der Atmosphäre, Munich, Germany; Deutsches Zentrum für Luft- und Raumfahrt, Institut für Physik der Atmosphäre, Oberpfaffenhofen, Germany; Max Planck Institute for Biogeochemistry, Jena, Germany
5 Karlsruhe Institute of Technology (KIT), Institute for Meteorology and Climate Research (IMK-ASF), Karlsruhe, Germany
6 Royal Belgian Institute for Space Aeronomy, Brussels, Belgium
7 Atmospheric Science Branch, NASA Ames Research Center, Moffett Field, USA
8 Finnish Meteorological Institute, Space and Earth Observation Centre, Sodankylä, Finland
9 Satellite Remote Sensing Section and Satellite Observation Center, Center for Global Environmental Research, National Institute for Environmental Studies (NIES), Tsukuba, Japan
10 National Institute of Water and Atmospheric Research (NIWA), Lauder, New Zealand
11 Department of Physics, University of Toronto, Toronto, Canada
12 Japan Aerospace Exploration Agency (JAXA), Tsukuba, Japan
13 Karlsruhe Institute of Technology (KIT), Institute for Meteorology and Climate Research (IMK-IFU), Garmisch-Partenkirchen, Germany