1 Introduction
The Total Carbon Column Observing Network (TCCON) has been in operation since 2004, beginning with its first dedicated instrument in Park Falls, WI, USA. Since then, the network has expanded to 29 active sites located around the world. The network provides column average dry mole fractions (DMFs) of numerous gases, including carbon dioxide (CO₂), methane (CH₄), nitrous oxide (N₂O), hydrofluoric acid (HF), and carbon monoxide (CO). These observations have been used to infer or evaluate natural and anthropogenic carbon fluxes.
The TCCON instruments are solar-viewing Bruker 125HR (high-resolution) Fourier transform infrared (FT-IR) spectrometers that record an interferogram once every few minutes. These interferograms are processed by the GGG software package to provide column average DMFs. Once the interferograms are converted to spectra, the core routine of GGG calculates the expected spectra from a forward model based on a custom linelist and a priori profiles of the absorbing gases with absorption lines in the fitting window. The retrieval calculates a posterior trace gas profile that minimizes the root-mean-square (rms) fitting residuals between the forward modeled and observed spectra.
There are two common terms used to describe different approaches towards finding the optimal posterior profile: a “scaling” retrieval or a “profile” retrieval. In a scaling retrieval, the retrieval multiplies the entire prior profile by a single value, finding the scaled version that produces the best agreement with the observed spectrum. In a profile retrieval, each level of the profile can be varied, with the allowed variation constrained by a specific covariance matrix. Compared to a profile retrieval, a scaling retrieval is faster and does not alias spectroscopic or instrument line shape errors into profile shape errors. It is more sensitive to errors in the shape of the prior profile compared to a full profile retrieval because it cannot change the shape of the posterior solution (meaning the ratio of DMFs between levels in the profile cannot change). However, it is not affected by a uniform multiplicative error in the prior DMFs at all altitudes. That is, if the entire profile underestimates or overestimates the true atmospheric DMFs by the same multiplicative factor, a scaling retrieval can – in theory – perfectly correct the retrieved profile. examines the differences between scaling and profile retrievals in the context of TCCON data in more detail.
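The behavior of a scaling retrieval under the two kinds of prior error can be illustrated with a toy example. The sketch below is not GGG code: it assumes a hypothetical linear forward model (`weights`) in which each measurement is a weighted sum of the profile, and it finds the single scale factor by one-parameter least squares.

```python
import numpy as np

def scaling_retrieval(prior, weights, observed):
    """Scale `prior` by the single factor that best reproduces `observed`.

    `weights` is a hypothetical (n_obs, n_levels) linear forward model,
    used here only for illustration.
    """
    modeled = weights @ prior                            # signal predicted by the unscaled prior
    scale = (modeled @ observed) / (modeled @ modeled)   # one-parameter least squares
    return scale * prior, scale

rng = np.random.default_rng(0)
truth = np.linspace(400.0, 390.0, 10)                    # made-up "true" profile (ppm)
weights = rng.uniform(0.5, 1.5, size=(4, 10))
observed = weights @ truth

# Prior that is 5 % too low at every level: the scale factor absorbs the error.
posterior, scale = scaling_retrieval(0.95 * truth, weights, observed)

# Prior with a shape error (tilt): no single factor can recover the truth.
tilted_prior = truth + np.linspace(-2.0, 2.0, 10)
tilted_posterior, _ = scaling_retrieval(tilted_prior, weights, observed)
```

In the first case the posterior equals the true profile; in the second a residual error remains at every level, which is why the shape of the prior matters more than its overall magnitude.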
The relationship between the shape error in the prior and the error in the retrieved column amount depends on the averaging kernels. For TCCON retrievals, testing with synthetic spectra shows that a 4 ppm error in the profile shape (defined as the error in the prior relative to the true profile changing by 4 ppm between the top and bottom levels) leads to an error of 0.025 % in XCO₂ at solar zenith angles (SZAs) below 60° and 0.125 % up to an SZA of 75°. Details of how this was quantified are given in Sect. S1 in the Supplement. This means that for typical SZAs observed by TCCON, an error of about 4 to 8 ppm in the prior results in a retrieval error well below the 0.25 % ceiling required for TCCON data.
In both GGG2014 and GGG2020, the prior profiles are derived as much as possible from meteorological variables and general correlations between these variables and trace gas DMFs in the atmosphere. GGG2014 used meteorological reanalyses from the National Centers for Environmental Prediction (NCEP). GGG2020 uses the Goddard Earth Observing System Forward Product for Instrument Teams (GEOS FP-IT) reanalysis product. The GEOS FP-IT product was chosen because it has finer temporal resolution than the NCEP product (3-hourly versus 6-hourly), is available with a lag of 1 d in normal operation, and includes diagnosed potential vorticity (PV). The PV fields are of particular importance because they allow the GGG2020 priors to better represent latitudinal transport in the stratosphere, thus improving the stratospheric trace gas profiles. However, GEOS FP-IT data are only available from the year 2000 onward, so the GGG package retains the capability to use NCEP meteorology as input data. This capability has been further developed since GGG2014, though we do not describe those changes in this paper.
Here, we describe the algorithm used to compute the prior profiles of , , , , , , and for GGG2020. The algorithm is named “ginput” and is available through GitHub . We begin in this paper by describing the core parts of the algorithm that are common across many of the gases (Sect. ). We then address elements specific to individual gases in Sect. . Finally, we compare the GGG2014 and GGG2020 priors against a wide variety of observations in Sect. .
As a final note, the priors described here are also used in the versions 10 and 11 OCO-2 and OCO-3 (hereafter OCO-2/3) retrievals. There are small differences in the OCO-2/3 priors compared to the TCCON priors which are discussed in Sect. .
2 General design
The central algorithms for the GGG2020 (, , ) priors are similar to each other. Trace gas mole fractions are tied to the monthly average measurements in whole-air flasks sampled at the Mauna Loa, HI (MLO), and American Samoa (SMO) sites operated by the United States National Oceanic and Atmospheric Administration's (NOAA's) Global Monitoring Laboratory. The fundamental underlying assumption of the GGG2020 priors algorithm is that the spatial variation in these gases can be largely captured by accounting for the transport lag between the location of the prior profile and the tropics (where MLO and SMO flask samples are made) and chemistry occurring during stratospheric transport.
The MLO and SMO data used to create the GGG2020 priors end in December 2018. In order to ensure consistent priors are created with this version of GGG, these files will not be updated until the next GGG release even as NOAA releases more data in the interim. Therefore, it is necessary to extrapolate the MLO and SMO records forward in time for retrievals of spectra taken after December 2018. This is done using the following steps:
- fitting a function, , to the last years of the MLO and SMO records, where both and are chosen for each gas to best represent that gas's behavior,
- calculating the average seasonal cycle over the last years as the anomaly relative to ,
- extending the record to the necessary date using as the baseline and applying the average seasonal cycle on top of it.
This procedure is shown graphically in Fig. .
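The three steps can be sketched as follows. This is a minimal illustration, not the ginput implementation: the function name is hypothetical, the record is assumed to be monthly with times given as fractional years, and a polynomial trend (`degree`) stands in for the gas-specific function, which is not reproduced here.

```python
import numpy as np

def extrapolate_record(t, dmf, n_years, t_new, degree=2):
    """Extend a monthly DMF record (times `t` in fractional years) to `t_new`.

    `degree` is an assumed polynomial trend, used only for illustration.
    """
    t = np.asarray(t, dtype=float)
    dmf = np.asarray(dmf, dtype=float)
    t_new = np.asarray(t_new, dtype=float)

    # Step 1: fit the trend to the last `n_years` of the record.
    recent = t > t[-1] - n_years
    t0 = t[recent].mean()                       # center times for a well-conditioned fit
    coeffs = np.polyfit(t[recent] - t0, dmf[recent], degree)
    trend = np.polyval(coeffs, t[recent] - t0)

    # Step 2: mean anomaly of each calendar month relative to the trend.
    month = np.floor((t[recent] % 1.0) * 12).astype(int)
    anomaly = dmf[recent] - trend
    cycle = np.array([anomaly[month == m].mean() for m in range(12)])

    # Step 3: extend the trend and add the mean seasonal cycle on top of it.
    new_month = np.floor((t_new % 1.0) * 12).astype(int)
    return np.polyval(coeffs, t_new - t0) + cycle[new_month]
```

Because the same function accepts any `t_new`, it can also be evaluated at dates before the start of the fitted window, under the same assumptions.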
Figure 1
Process to extrapolate the combined MLO and SMO monthly average record. (a) First, we fit the last 5 or 10 years with the best function for a given gas. (b) Second, we calculate the mean monthly anomaly relative to the trend over the same time period. (c) Third, we extend the trend in time and apply the mean monthly anomalies on top of it.
[Figure omitted. See PDF]
Details of and are provided in Table . Note that this method is also used to extrapolate back in time if data prior to the start of the combined MLO and SMO record are needed to represent the distribution of ages of air in the stratosphere (see Sect. ).
Table 1. Function forms () and number of years used to fit the combined MLO and SMO DMF record to extrapolate beyond 2018. In , the values are the fit parameters.
| Gas | (years) | |
|---|---|---|
| 10 | ||
| 5 | ||
| 10 |
Errors in extrapolating the MLO and SMO DMFs will negatively impact the TCCON retrievals if the error in extrapolation introduces an error in the profile shape, for example, due to an El Niño year. In a scaling retrieval, such as the GGG algorithm used by TCCON, the posterior optimal profile is the prior profile multiplied by a scale factor, with the same scale factor applied to all levels. At its core, the algorithm we are describing here builds the priors by calculating what date to pull the MLO and SMO DMFs from for each level in the prior. If the extrapolation error caused all the MLO and SMO DMFs to be incorrect by the same percentage, this would manifest as the prior profile being incorrect by that percentage, for which a scaling retrieval can theoretically perfectly account. However, if the error in MLO and SMO DMFs is not the same for each level in the prior, the error in the prior cannot be represented by the same scalar multiplier for every level, and thus a scaling retrieval could never completely eliminate the error in the posterior profile.
Currently, we estimate the error in the MLO and SMO DMFs due to extrapolation to be about 0.25 % for , 0.15 % for , and 0.6 % for over a 5-year extrapolation (see Sect. S2 in the Supplement for details). We deem this level of uncertainty acceptable for TCCON priors. How errors in the priors alias into the posterior state in a profile retrieval, such as that used by OCO-2 and -3, is more complex. However, the OCO-2/3 retrieval uses a relatively tight covariance matrix for levels in the stratosphere
Ingesting the MLO and SMO data as the basis for the priors effectively ties those priors to the World Meteorological Organization (WMO) scale to which the MLO and SMO data are calibrated. Table 2 describes which scale each gas is tied to for each algorithm in which these priors are used. As these priors were developed at the same time as the X2019 scale, whether the priors are tied to the X2007 or X2019 scale depends on which scale the MLO and SMO data are calibrated to.
Table 2. The WMO calibration scales to which the in situ data used in the GGG2020 and OCO-2/3 priors are tied.
| Gas | Scale (GGG2020) | Scale (OCO-2/3 v10) | Scale (OCO-2/3 v11) |
|---|---|---|---|
| CO₂ | X2007 | X2007 | X2019 |
| CH₄ | X2004 | n/a | n/a |
| N₂O | X2006 | n/a | n/a |
| CO | X2014A | n/a | n/a |

n/a stands for not applicable, as OCO-2 and OCO-3 only use CO₂ priors. Note that, unlike for CO₂, CH₄, and N₂O (for which this tie comes from the MLO and SMO data), for CO this is from scaling to ATom data in the troposphere.
Unlike the other gases in Table 2, CO is not tied to its scale through the MLO and SMO data. CO priors are created using a different approach from the other primary gases; this approach will be described in Sect. . The relevant point here is that CO is taken from the GEOS FP-IT product, and in the troposphere it is scaled to match observations from the first three Atmospheric Tomography Mission (ATom) aircraft campaigns. As the ATom quantum cascade laser spectrometer (QCLS) CO observations used were calibrated to the X2014A scale, the CO priors are considered tied to that scale.
Several gases (, , , ) are contained in the GEOS FP-IT meteorology product ingested by GGG2020. and are taken directly from GEOS FP-IT, while and are derived from GEOS FP-IT. Details are given in Sect. .
Finally, there are a large number of gases that must be accounted for as interfering absorbers during retrievals of primary TCCON target gases. These gases use priors derived from climatological profiles from the summer at 35° N. Details are given in Sect. 2.4.
2.1 Design rationale
In developing the GGG2020 priors, we had the following two guiding principles in mind.
- First, we wanted to minimize direct dependence on other measurements or models as much as possible, such that retrievals using these priors are independent measurements (in the statistical sense) that other observations or models can be compared to.
- Second, we wanted to produce an algorithm that generates reproducible prior profiles if run at different times.
The first principle is why the GGG2020 priors only ingest MLO and SMO data, rather than more surface data, and why we do not use modeled gas profiles (other than for CO). For the much shorter-lived CO, we decided that capturing the spatial variability was worth the trade-off of relying on GEOS FP-IT modeled CO (especially as GGG2020 already uses GEOS FP-IT meteorology). Other data used in generating the priors (e.g., latitudinal gradients of and from HIPPO and ATom, as well as Atmospheric Chemistry Experiment Fourier Transform Spectrometer, ACE-FTS, profiles) were likewise adopted because the improvement in the priors was deemed worth the loss of statistical independence. Since these data are used to generate static values (such as lookup tables or coefficients in functions), rather than for direct ingestion, we retain some independence from these sources.
The second principle is why the GGG2020 priors and OCO-2/3 v10 priors only use MLO and SMO flask data through the end of 2018 (rather than updating regularly). One concern raised during development was whether such regular data updates would alter previously obtained data, such as from retrospective quality control. This would introduce a situation where we could not exactly reproduce priors generated using an old version of the input data. Given time constraints, it was not possible to engineer a solution to detect or avoid this issue for GGG2020 and OCO-2/3 v10 priors. With the additional development time for OCO-2/3 v11, we were able to update the priors algorithm to safely ingest more rapidly updated MLO and SMO data.
2.2 Tropospheric prior
The GGG2020 tropospheric priors assume that the trend observed by MLO and SMO is driven by emissions in the northern midlatitudes; thus, the measured DMF at MLO and SMO will lag behind the DMFs in the Northern Hemisphere and precede the DMFs in the Southern Hemisphere. To compute the tropospheric DMFs, we average MLO and SMO data together with equal weight, deseasonalize the MLO and SMO average to get the underlying trend, approximate the offset forward or backward in time relative to MLO and SMO with an idealized distance function, apply a multiplicative and additive correction to match observed latitudinal gradients, and impose a latitudinally dependent seasonal cycle. Mathematically, this follows Eq. ():
1
Table 3. Values of and coefficients in Eq. () for the three primary well-mixed gases. Rationales for these choices are given in the gas-specific sections (Sect. ).
| Gas | ||
|---|---|---|
| 0 | ||
| 0 | ||
The variables in this function are as follows.
- is latitude. In the GGG2020 TCCON priors, this is an “effective latitude” derived from mid-tropospheric potential temperature (see Sect. ).
- is altitude with the bottom half of the troposphere stretched downward slightly to treat the bottom layer as being at the surface for the purpose of this calculation (see Sect. ).
- is the tropopause altitude.
- is the fractional year (defined as 1-based day of year 365.25).
- DMF is the reference DMF taken from a deseasonalized MLO and SMO trend.
- is the distance offset function, defined by Eq. ().
- is the seasonal cycle factor, defined by Eq. ().
- and are coefficients that scale and adjust the ideal gradients assumed by to account for differences between gases. Their values are given in Table and are discussed in detail in Sect. .
The distance function is shown in Fig. (assuming a simple latitudinal dependence for the tropopause altitude). It has the following mathematical form: 2 where 3
Figure 2
Form of the distance function assuming that emissions occur at 45° N and a latitudinally dependent tropopause height that varies smoothly from 17 km at the Equator to 8 km at the poles.
[Figure omitted. See PDF]
Although has units of years, it does not represent a physical age or time. It is effectively a basis function to impose the ideal distribution of DMFs relative to MLO and SMO as shown in Fig. . Specifically, it assumes that surface DMFs precede MLO and SMO DMFs in the Northern Hemisphere, lag MLO and SMO DMFs in the Southern Hemisphere, and have a smaller latitudinal gradient in the upper troposphere due to faster winds. The basic shape is modified for each gas via and .
Figure 3
Parameterized seasonal cycle for (a) and (b) . The left axis is the factor in Eqs. () and (). The right axis gives what the seasonal cycle amplitude would be for a DMF of 400 ppm in (a) and DMF of 1800 ppb in (b).
[Figure omitted. See PDF]
in Eq. () is the combined MLO and SMO record, deseasonalized by taking a 12-month rolling mean. This is done because the seasonal cycle at MLO and SMO is not representative of all latitudes. We impose a latitudinally dependent seasonal cycle by multiplying the DMFs by a scaling factor : for all gases but . For the parameterization is where is the fraction of year passed (defined as 1-based day of year 365.25), is latitude, is altitude (in kilometers), the tropopause altitude (in kilometers), is a reference latitude (45 N), is the function from Eq. (), and is a gas-specific constant defined in Table S5. represents the basic seasonal variation, the latitudinal variation, and the altitude variation. The form of these equations for and are shown in Fig. .
These parameterized seasonal cycles are the same as those used in GGG2014 priors. The amplitude and phase were derived from surface in situ data, and the amplitude is assumed to decay with altitude due to mixing of air masses with different ages.
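The deseasonalization step can be sketched as follows. The text specifies only a 12-month rolling mean; the centered window with half-weighted end points used below is an assumption made for this illustration, as is the function name.

```python
import numpy as np

def deseasonalize(monthly_dmf):
    """Centered 12-month rolling mean of a monthly series (needs at least 13 points).

    The first and last six months lack a full window and are returned as NaN.
    """
    x = np.asarray(monthly_dmf, dtype=float)
    # A 12-point window has no central sample, so average two windows offset
    # by one month (half weight on the end points). This choice is an assumption.
    kernel = np.r_[0.5, np.ones(11), 0.5] / 12.0
    out = np.full(x.shape, np.nan)
    out[6:-6] = np.convolve(x, kernel, mode="valid")
    return out
```

With this window, a pure 12-month sinusoid averages to zero and a linear trend passes through unchanged, which is the property needed to isolate the underlying trend.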
2.2.1 Potential temperature-based effective latitude
CO₂ profiles for locations on the edge of the tropics are sometimes more “tropical” in nature than their geographic latitudes suggest. In these cases, the observed profile would be more constant versus altitude than the prior profile, which would have some drawdown at the surface.
showed that, in the extratropics, there is a correlation between 700 hPa potential temperature and DMFs in the free troposphere, as variations in this potential temperature serve as an indicator of synoptic-scale motion and therefore the true source latitude of the air. We use dry potential temperature, i.e., the temperature a parcel of dry air would have if brought to a pressure of 1000 hPa adiabatically. This allows us to use potential temperature to derive an “effective latitude” that better predicts the shape of the prior profile. Note that while this was originally developed to improve the priors, it is used for all gases.
To calculate this effective latitude, we first build a climatology of mid-tropospheric potential temperature from the GEOS FP-IT product by averaging potential temperature between 500 and 700 hPa (henceforth termed ) versus latitude for 2-week periods in 2018 (Fig. a). A hypothetical example is shown in Fig. b. For a prior in the extratropics, we select the appropriate versus latitude curve from the climatology (Fig. b, black line) and compare the value for the prior against the tabulated mean. If the prior's is greater than the mean for that latitude, the effective latitude is moved equatorward until it matches (vice versa if the prior's is less).
Figure 4
(a) The lookup table for versus latitude and time of year. (b) A hypothetical example of how the effective latitude calculation works. The black line represents the climatological , and the black points represent hypothetical for individual profiles. The arrows indicate how the effective latitude of each profile is adjusted such that the individual matches the climatological . The red shading indicates latitudes where this method is not applied; the blue shading indicates transitional areas between the geographic and effective latitude (see Sect. 2.2.1 for additional details).
[Figure omitted. See PDF]
More specifically, the implementation searches north and south of the prior's geographic latitude for the two latitudes (one north, one south) with the smallest difference between the prior's and the mean . If the difference between the mean values at both latitudes is within 0.25 K, then the nearer latitude is used. Otherwise, the latitude with the smallest difference between its and the prior's is used.
There are two caveats to this approach. First, the effective and true (geographic) latitude must have the same sign; that is, both must be in the same hemisphere. Second, within the tropics (defined as within 20° of the Equator), the effective latitude calculation is disabled and the geographic latitude is used. This is done because mid-tropospheric temperature gradients are weak in the tropics and largely uncorrelated with zonal advection. To smoothly blend between geographic and effective latitude, a linear interpolation between them occurs in the 20° to 25° range. For example, a profile at 22° N would have a latitude calculated as $0.6\,\phi_\mathrm{geo} + 0.4\,\phi_\mathrm{eff}$, where $\phi_\mathrm{geo}$ is the geographic latitude and $\phi_\mathrm{eff}$ is the effective latitude.
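The blending between geographic and effective latitude can be written compactly. The sketch below is illustrative only: the function and argument names are hypothetical, and raising an error when the two latitudes fall in different hemispheres is an assumed way of enforcing the first caveat.

```python
def blended_latitude(geo_lat, eff_lat, lower=20.0, upper=25.0):
    """Blend geographic and effective latitude (degrees, north positive).

    The weight on the effective latitude rises linearly from 0 at `lower`
    to 1 at `upper` degrees from the Equator.
    """
    if geo_lat * eff_lat < 0:
        # Assumed handling: the two latitudes must share a hemisphere.
        raise ValueError("effective latitude must be in the same hemisphere")
    weight = (abs(geo_lat) - lower) / (upper - lower)
    weight = min(max(weight, 0.0), 1.0)   # 0 in the tropics, 1 poleward of `upper`
    return (1.0 - weight) * geo_lat + weight * eff_lat
```

For a profile at 22° N the weight on the effective latitude is (22 − 20)/(25 − 20) = 0.4, so the geographic latitude receives a weight of 0.6.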
2.2.2 Altitude grid adjustment
The seasonal cycle and distance basis function assume that the surface is at 0 km altitude. To this end, we use an adjusted altitude as in Eqs. () through (). To compute this adjusted , we stretch or squeeze the bottom of the altitude grid so that the bottom layer is at the surface altitude from the GEOS FP-IT 2D files. The adjustment is performed as follows:
6 where is the original altitude, , is the original grid altitude closest to , is the original grid altitude closest to , is the GEOS FP-IT surface altitude, is the tropopause altitude, and is 7 where , , and are the indices for , , and , respectively. Figure S7 shows an example of the adjustment. This adjustment is minor (typically 50 to 100 m) since the priors are generated on the terrain following levels from the GEOS FP-IT model.
2.3 Stratospheric prior
The design of the stratospheric priors draws heavily from . That work showed that the profiles of and in the lower stratosphere can be captured well using surface in situ data from the MLO and SMO observatories to determine the trace gas mole fraction entering the stratosphere and then accounting for mixing of air during stratospheric circulation. We extend this method by using atmospheric profile measurements between February 2004 and March 2019 from the Atmospheric Chemistry Experiment Fourier Transform Spectrometer
2.3.1 Stratospheric age of air
The age of stratospheric air parcels is calculated from a climatology simulated by the Chemical Lagrangian Model of the Stratosphere (CLaMS) and scaled to match the mean midlatitude age in the Goddard Space Flight Center 2D (GSFC2D) model , which provides age of air as a function of latitude, potential temperature, and day of year. Age of air in this context refers to the time since the air entered the stratosphere. Figure shows both latitudinal and temporal slices of the CLaMS age of air. The CLaMS model is a 2D representation of the mean dynamics of the stratosphere. To account for the zonal displacements driven by large-scale Rossby waves, we compute an equivalent latitude profile. Equivalent latitude is derived from PV following Eq. (1) in .
Figure 5
Mean age of air from the CLaMS climatology. (a) Age versus latitude and potential temperature for 1 January, (b) age versus day of year and potential temperature at 40° N. Panel (c) is the same as panel (b) but for 80° N.
[Figure omitted. See PDF]
Note that this equivalent latitude is not the same as the effective latitude used in the tropospheric part of the prior calculation. PV-derived equivalent latitude has been previously shown to predict stratospheric chemical fields well
Once the age of air is known, we can look backwards in the combined MLO and SMO record to determine the stratosphere boundary condition (SBC), i.e., the mole fraction of each gas when a parcel of air entered the stratosphere. The SBC time series is defined as the MLO and SMO average lagged by 2 months; and references therein show that this is a good proxy for the SBC. However, the mole fraction for a given level in the prior is not simply the mole fraction of, e.g., , when that air entered the stratosphere but is the result of mixing of air with different ages during convective transport. This mixing can be represented by solutions to Green's function derived from measurements , which we represent as age spectra.
Age spectra were precomputed for three regions (tropics, midlatitudes, and polar vortex) and 45 different mean ages. showed that different age spectra were necessary to capture tropical and midlatitudinal behavior; likewise, the polar vortex requires its own age spectra form due to strong wintertime descent of air. Example age spectra are shown in Fig. . Note that spectra for the youngest mean ages are not shown.
Figure 6
Example age spectra for the (a) tropics, (b) midlatitudes, and (c) polar vortex. The values represent the contribution of air from that time to the average mole fraction of the parcel as a whole. Note that age spectra for the youngest air are not shown because they are nearly delta functions.
[Figure omitted. See PDF]
For each stratospheric level in the priors, the mole fraction of a gas is computed as 8 where is the value of the age spectrum for the given mean age () and region () and is the SBC, both at time . That is, the mole fraction is a weighted average of the SBC over time with the weights set by the age spectrum. is the fraction of gas remaining after chemical loss, and is potential temperature, which we use as a vertical coordinate. For this fraction is always 1, but it varies with mean age and potential temperature for other gases, as discussed in more detail in Sect. .
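The weighted average described above can be sketched in discrete form. This is an illustration under stated assumptions rather than the ginput code: the names are hypothetical, the age spectrum and boundary condition are assumed to be sampled on the same regular time grid, and the spectrum is renormalized so that its weights sum to 1.

```python
import numpy as np

def stratospheric_dmf(sbc, age_spectrum, fraction_remaining=1.0):
    """Age-spectrum-weighted mean of the boundary condition, scaled by chemical loss.

    sbc[i] is the boundary-condition DMF i time steps before the date of the
    prior, and age_spectrum[i] is the weight given to air of that age.
    """
    sbc = np.asarray(sbc, dtype=float)
    weights = np.asarray(age_spectrum, dtype=float)
    weights = weights / weights.sum()      # assumed normalization of the discrete spectrum
    return fraction_remaining * np.sum(weights * sbc)
```

A spectrum that is nearly a delta function at zero age returns the most recent boundary-condition value, which is the limiting behavior expected for the youngest air.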
2.3.3 Middleworld treatment
The middleworld is defined as the part of the atmosphere between the tropopause pressure from GEOS FP-IT and the 380 K isentrope. Of the three tropopause pressure estimates in GEOS FP-IT, we use the blended (thermal and potential vorticity) estimate. The 380 K isentrope is the lowest potential temperature surface entirely contained within the stratosphere; therefore, the stratospheric approach described in Sect. is only applicable to levels above 380 K (the stratospheric overworld). To fill in the prior in the middleworld, we linearly interpolate mole fraction as a function of potential temperature between the tropopause and 380 K.
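A minimal sketch of the middleworld fill is given below. It assumes that potential temperature increases monotonically with level index and, as a simplification, anchors the interpolation at the nearest levels outside the middleworld rather than exactly at the tropopause and 380 K surfaces. The function name is hypothetical.

```python
import numpy as np

def fill_middleworld(theta, dmf, theta_trop, theta_top=380.0):
    """Fill levels between the tropopause and 380 K by linear interpolation in theta.

    `theta` (K) must be increasing; middleworld entries of `dmf` are ignored
    and may be NaN on input.
    """
    theta = np.asarray(theta, dtype=float)
    filled = np.array(dmf, dtype=float)            # copy so the input is not modified
    middle = (theta > theta_trop) & (theta < theta_top)
    # Interpolate only from levels outside the middleworld.
    filled[middle] = np.interp(theta[middle], theta[~middle], filled[~middle])
    return filled
```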
2.4 Secondary gases
For the purpose of this paper, “secondary gases” are defined as those which are tied directly to neither the MLO and SMO records nor the GEOS FP-IT product. This is all gases other than , , , , , , , and . and are the two most relevant to standard TCCON retrievals. Priors for these gases are based on climatological profiles for summer at 35 N derived from profiles measured by MkIV spectrometer balloon flights and the ACE-FTS instrument. These climatological profiles are modified for a given location and time in four steps:
1. stretch or compress the profile vertically so that the tropopause is at the correct altitude,
2. apply a latitudinal gradient,
3. apply a secular trend,
4. apply a seasonal cycle.
These steps require the latitude and age of air of the profiles. This approach is nearly identical to that used for all gases in the GGG2014 priors, except that for steps 2–4 the age of air and effective latitude described in Sect. are used in the troposphere and the CLaMS age and PV-derived equivalent latitude from Sect. are used in the stratosphere. The middleworld is filled in by linear interpolation in between the tropopause and 380 K, as is done for the primary gases. Details of the calculation are given in the Supplement.
2.5 Conversion to number density
All trace gas quantities shown and discussed in this paper are in dry mole fractions (DMFs, i.e., moles of trace gas per mole of dry air). However, in its forward model, GGG uses gas profiles in number density (molec. cm⁻³) for spectroscopic calculations. To convert DMF to number density, we use
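$$n_\mathrm{gas} = f_\mathrm{gas}\,\frac{1}{1 + f_\mathrm{H_2O}}\,n_\mathrm{ideal} \tag{9}$$
where $n_\mathrm{gas}$ is the number density of the gas of interest, $f_\mathrm{gas}$ is the DMF of that gas, $f_\mathrm{H_2O}$ is the DMF of water (from the prior profile), and $n_\mathrm{ideal}$ the ideal gas number density. The factor $1/(1 + f_\mathrm{H_2O})$ converts $n_\mathrm{ideal}$ into number density of dry air.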
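As a worked illustration of this conversion, the sketch below computes the ideal gas number density from pressure and temperature and then removes the water contribution. It assumes that the water value is a dry mole fraction (moles of water per mole of dry air), so that the total number density is the dry-air number density multiplied by one plus the water DMF. The function name and argument units are choices made for this example.

```python
BOLTZMANN = 1.380649e-23  # J/K (exact SI value)

def gas_number_density(dmf_gas, dmf_h2o, pressure_pa, temperature_k):
    """Number density (molec. cm^-3) of a gas from its dry mole fraction (mol/mol)."""
    # Ideal gas number density of moist air, converted from m^-3 to cm^-3.
    n_ideal = pressure_pa / (BOLTZMANN * temperature_k) * 1e-6
    # Remove water; assumes dmf_h2o is per mole of dry air.
    n_dry = n_ideal / (1.0 + dmf_h2o)
    return dmf_gas * n_dry
```

Under this assumption, the dry-air and water number densities sum to the ideal gas number density, and omitting the water term overestimates the number density most where the air is most humid.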
3 Gas-specific design
In this section, we will discuss elements of the algorithm unique to each gas. With the exception of O₂, each section will be divided into subsections for the tropospheric and stratospheric priors.
3.1 O₂
We assume a uniform DMF of 0.2095 for O₂ at all altitudes. During the retrieval, this is converted to number density following Sect. 2.5. In the GGG2014 priors, the conversion to number density did not include a correction for water. This led to a profile shape error: as water DMFs are highest near the surface, failing to include the water correction led to an overestimate of the near-surface number density for every absorbing gas.
The impact of this error in the previous priors on the final column amounts was small because in public TCCON data all gas column amounts are reported as column average mole fractions (termed Xgas, e.g., XCO₂). These are calculated as follows:
$$X_\mathrm{gas} = \frac{V_\mathrm{gas}}{V_\mathrm{O_2} / 0.2095} \tag{10}$$
where $V_\mathrm{gas}$ and $V_\mathrm{O_2}$ are the total column amounts (in molec. cm⁻²) of the target gas and O₂, respectively. The denominator represents a column of dry air inferred from the retrieved O₂ column. The advantage of this method over using a column of air derived from surface pressure is that, because primary TCCON target gases are measured on the same detector as O₂, certain types of instrumental error cancel out in this ratio, reducing their impact on the final data product. Likewise, the shape error due to the missing water correction in GGG2014 priors largely canceled out in the column-averaged Xgas DMFs. However, the GGG2020 treatment, following Eq. (9), is more physically consistent, leads to more consistent scaling factors retrieved among TCCON stations, and yields a better shape, especially under warm, humid conditions.
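The column-average calculation and the cancellation of common errors can be illustrated with a short sketch. The function name is hypothetical, and the 0.2095 value is the uniform O₂ DMF assumed in this section.

```python
O2_DMF = 0.2095  # assumed uniform dry mole fraction of O2

def column_average_dmf(column_gas, column_o2):
    """Xgas from total columns (molec. cm^-2) of the target gas and of O2."""
    dry_air_column = column_o2 / O2_DMF    # dry-air column inferred from the O2 column
    return column_gas / dry_air_column

# Made-up columns corresponding to a 400 ppm column average.
col_o2 = 4.5e24
col_gas = 400e-6 * col_o2 / O2_DMF
xgas = column_average_dmf(col_gas, col_o2)

# An error that scales both columns by the same factor cancels in the ratio.
xgas_biased = column_average_dmf(1.01 * col_gas, 1.01 * col_o2)
```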
3.2 CO₂
3.2.1 Troposphere
The value of in Eq. () for was derived by comparing the priors generated with and against profiles from the HIPPO and ATom campaigns. We used measurements from the Harvard quantum cascade laser spectrometer (QCLS) for HIPPO and measurements from the NOAA Picarro for ATom. Only data from individual vertical profiles (identified as data points where the
The result is shown in Fig. a. The red line is a York fit to the data using the inverse square of the standard deviations of the prior–observation differences and distance function values in the latitude bins as the weights. This fit indicates setting equal to 3.55 times the interannual growth rate will give a latitudinal gradient that matches observations. Figure b shows the mean differences versus latitude with set to 1 (i.e., no adjustment) and with the best fit to the data. Using the derived from Fig. a and , the priors show no latitudinal bias versus observations.
Figure 7
(a) Bias between the initial CO₂ DMFs and ATom/HIPPO profile versus the distance function (Eq. ) for profile levels below 800 hPa. Note that the axis is not in parts per million but in multiples of the interannual growth rate. See Sect. 3.2.1 for details. (b) The mean difference between priors and observations in 20 latitude bins below 800 hPa versus latitude bin center. In both panels, error bars are standard deviations of the respective variable within the 20 latitude bins.
[Figure omitted. See PDF]
3.2.2 Stratosphere
Stratospheric CO₂ follows the algorithm laid out in Sect. 2.3. No additional modifications were required. For our purposes, we assume that CO₂ DMFs are unaffected by stratospheric chemistry (e.g., oxidation) and do not include a correction for chemistry in stratospheric CO₂.
3.3
3.3.1 Troposphere
We set in Eq. () to
11 where is the output of the distance function from Eq. () and years
In the stratosphere, is more complicated than because it is removed, principally through photolysis forming nitrogen and an oxygen atom but also via a reaction with excited oxygen () . fit this loss of in the lower stratosphere versus age of air with a third-order polynomial. We examined how this polynomial compares to data from the ACE-FTS instrument and found that the polynomial's skill in predicting the fraction of remaining relative to the SBC () decreased above approximately 25 km altitude, with the polynomial overestimating the mixing ratio by up to 150 ppb. We hypothesize this is due to different chemistry in the upper stratosphere compared to the lower stratosphere. As the original polynomial was based on lower stratospheric data, it did not capture this behavior. While the fraction of the column in the upper stratosphere is small (a few percent above 20 km), our goal was to develop priors with reasonably accurate DMFs at all altitudes, not just where the bulk of the column mass is. Additionally, developing our own method to estimate allows us to be consistent when calculating the same quantity for and HF.
We use data from the ACE-FTS instrument to build a lookup table of the fraction of remaining as a function of age of air and potential temperature. validated a previous version of the ACE-FTS data and found that mean differences between ACE-FTS and other stratospheric measurements were 10 ppbv between 18 and 30 km, and mostly within to ppbv between 30 and 60 km. They note that these are large relative to the magnitude of mole fractions at these altitudes; however, for our purposes, these are acceptable, given that we are averaging a large number of ACE-FTS profiles and need only a climatological relationship between fraction of remaining, age of air, and potential temperature. compared the version 3 ACE-FTS data (used in this work) to the version 2 evaluated by and note that the main difference is a 10 % reduction in above 30 km. Thus the general results in should still hold. For ACE-FTS v3.5 data (one minor version earlier than that used in this work), found biases between ACE-FTS and MIPAS (Michelson Interferometer for Passive Atmospheric Sounding) of between 9 % and 5 % and between ACE-FTS and MLS (Microwave Limb Sounder) of between 18 % and 4 % in the altitude range of 19 to 34 km.
To build the lookup table, age of air is computed as in Sect. ; for each ACE profile, the stratospheric equivalent latitude is computed for the GEOS FP-IT files that bound it in time, and then it is interpolated to the latitude, longitude, and time of the profile. This equivalent latitude and the potential temperature calculated from ACE-FTS temperature and pressure is used as input to the CLaMS model from Sect. to look up the age of air.
is defined relative to the stratospheric boundary condition in the ACE-FTS data, not the MLO and SMO record, to ensure self-consistency and avoid introducing error from the bias between the ACE-FTS and MLO and SMO data (Fig. S9). The stratospheric boundary condition is computed from a quadratic fit in time of ACE-FTS data in the tropics (latitude within 20) and with 360 K 390 K, excluding outliers (defined as values more than 5 times the median deviation from the median). This definition of the stratospheric boundary condition assumes that most of the air entering the stratosphere does so in the tropics and that the tropical tropopause is in that range of potential temperature values.
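The boundary-condition fit described above can be sketched as follows. This is a minimal illustration, not the ginput implementation: the function names are hypothetical, the outlier test is applied to the raw values (the text does not say whether values or fit residuals are screened), and the quadratic is fit by ordinary least squares.

```python
from statistics import median


def within_n_mad(values, n_mad=5.0):
    """Flag values within n_mad median absolute deviations of the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return [abs(v - med) <= n_mad * mad for v in values]


def solve_linear(a, b):
    """Solve a small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_quadratic(t, y):
    """Least-squares coefficients (c0, c1, c2) of y = c0 + c1*t + c2*t**2."""
    sums = [sum(ti ** k for ti in t) for k in range(5)]
    a = [[sums[i + j] for j in range(3)] for i in range(3)]
    b = [sum(yi * ti ** i for ti, yi in zip(t, y)) for i in range(3)]
    return solve_linear(a, b)


def boundary_condition_fit(t, y, n_mad=5.0):
    """Quadratic fit in time after discarding outliers (assumed screening on raw values)."""
    keep = within_n_mad(y, n_mad)
    t_ok = [ti for ti, k in zip(t, keep) if k]
    y_ok = [yi for yi, k in zip(y, keep) if k]
    return fit_quadratic(t_ok, y_ok)
```

Here `t` would be time (for example, years since the start of the ACE-FTS record) and `y` the tropical mole fractions in the 360 to 390 K layer; both inputs are assumed to be already filtered by latitude and potential temperature.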
Finally, to compute the lookup table, the ACE-FTS data are binned by age of air (0.25 year increments) and potential temperature (variable increments; 50 K in the lower stratosphere to 200 K in the upper stratosphere). ACE-FTS data are excluded if
- ,
- altitude > 70.0 km (this is the top altitude in the TCCON priors),
- the profile is in the polar vortex,
- potential temperature is < 380 K (as we are only concerned with levels in the stratospheric overworld).
Additionally, values 1 are limited to 1. The resulting lookup table is shown in Fig. . As there are large gaps in age– space with no ACE-FTS data, we extrapolate to fill in these gaps. We use essentially a constant value extrapolation along age; that is, if there is no value for a given age– bin, the nearest point at the same is used. Linear extrapolation along age is done second, using the nearest two points to determine the slope. In general, points in these extrapolated regions are expected to be very infrequent, as the absence of ACE data suggests that those combinations of age and are rare in the atmosphere.
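As an illustration of the linear extrapolation step only, the sketch below fills missing entries in one row of such a table (fixed potential temperature, varying age) using the two nearest populated bins to set the slope. The function name and the use of `None` for empty bins are assumptions made for this example. The cap at 1 follows the text; the floor at 0 is an added assumption to keep extrapolated fractions physical.

```python
def fill_along_age(ages, values, cap=1.0, floor=0.0):
    """Fill None entries by a line through the two nearest populated age bins.

    ages   : bin-center ages (distinct values)
    values : fraction remaining per bin, None where no data exist
    All returned values are limited to [floor, cap].
    """
    known = [(a, v) for a, v in zip(ages, values) if v is not None]
    filled = []
    for a, v in zip(ages, values):
        if v is None and len(known) >= 2:
            # Two populated bins closest in age define the slope.
            (a1, v1), (a2, v2) = sorted(known, key=lambda p: abs(p[0] - a))[:2]
            v = v1 + (v2 - v1) / (a2 - a1) * (a - a1)
        if v is not None:
            v = min(cap, max(floor, v))
        filled.append(v)
    return filled
```

Rows with fewer than two populated bins are returned unchanged, since no slope can be defined for them.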
Figure 8
lookup table derived from ACE-FTS v3.6 data as a function of potential temperature and age of air. Unfilled circles are extrapolated points.
[Figure omitted. See PDF]
The need to capture how depends on both age and is apparent in Fig. . Consider the points in Fig. at ages of 5 years. Over the range of 1000 K, the decreases from 0.5 to almost 0. This is likely because at greater (i.e., higher altitude) the photolysis () pathway proceeds more rapidly than at lower altitudes. Age of air alone cannot capture this difference.
3.4
3.4.1 Troposphere
Similar to , the priors use Eq. () as , with a lifetime of 12.4 years
Figure 9
Differences in between ATom and HIPPO observations and priors, binned as in Fig. b, with and without ppb per degree correction in the Northern Hemisphere. Error bars are standard deviations in the 20 latitude bins.
[Figure omitted. See PDF]
3.4.2 Stratosphere
must also include a fraction remaining term, , to account for stratospheric chemistry, similarly to . Figure a shows a tight correlation between ACE-FTS and in the stratosphere; therefore, we can use the relationship between and age derived in Sect. as a basis for the lookup table.
Figure 10
(a) 2D histogram of and from ACE-FTS. (b) The lookup table that is used in the GGG2020 algorithm. The same as in Fig. , filled circles are derived directly from data, while the unfilled circles are extrapolated. values are computed as described in Sect. .
[Figure omitted. See PDF]
To compute the lookup table, we first limit the ACE-FTS data to points where and are positive, the mole fraction is 2000 ppb (points 2000 ppb are almost certainly tropospheric), the profile is outside the polar vortex, and the altitude is below 70 km. We bin the data by and . Within each bin, outliers are rejected (distance median absolute deviation) and the mean value in each and bin pair is computed. As with , we use extrapolation to fill in parts of the lookup table not covered by ACE-FTS data. We use constant value extrapolation along the dimension first, then also along the dimension if necessary.
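The per-bin averaging and gap filling described above might look like the following sketch. It is not the GGG2020 code: the function names are hypothetical, the outlier threshold `n_mad` is a placeholder because the multiplier is not reproduced here, and ties between equally distant neighbors are broken toward the lower index.

```python
from statistics import mean, median


def robust_bin_mean(samples, n_mad=5.0):
    """Mean of one bin after rejecting points far from the median.

    n_mad is a placeholder threshold (in median absolute deviations).
    """
    med = median(samples)
    mad = median(abs(s - med) for s in samples)
    kept = [s for s in samples if abs(s - med) <= n_mad * mad]
    return mean(kept) if kept else med


def fill_nearest_along_rows(table):
    """Replace None with the nearest populated value in the same row."""
    out = []
    for row in table:
        valid = [i for i, v in enumerate(row) if v is not None]
        if not valid:
            out.append(list(row))
            continue
        out.append([
            row[min(valid, key=lambda j: abs(j - i))] if v is None else v
            for i, v in enumerate(row)
        ])
    return out


def fill_table(table):
    """Constant-value fill along the first dimension, then along the second."""
    first = fill_nearest_along_rows(table)
    transposed = [list(col) for col in zip(*first)]
    second = fill_nearest_along_rows(transposed)
    return [list(col) for col in zip(*second)]
```

Which table dimension is filled first is a choice of how the table is oriented when passed in; the text specifies the order in terms of the two binning variables.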
To compute the stratospheric prior profiles, Eq. () is used with the value described above. To compute the value, the age and values are first used to compute the value as described in Sect. , and then the value is determined by linearly interpolating the lookup table in Fig. b to the required and .
3.5 HF
Measurements of HF DMFs in the troposphere are very rare; the most recent direct measurement of gaseous fluoride that we found in the literature was , which reported measurements around an aluminum refinery. Their measurements near but not downwind of the refinery reported fluoride concentrations of 1 g m, or a DMF on the order of 10 to 100 parts per trillion (ppt). Spectroscopic measurements over Antarctica and Switzerland found upper-tropospheric HF DMFs of 1 to 10 ppt were consistent with solar-viewing spectra.
For our purposes, we assume that the tropospheric DMF of HF is negligible compared to the stratospheric component, and thus we imposed a small but non-zero DMF of 0.1 ppt. This is less than the previous measurements , but the impact on HF retrievals should be small given that TCCON HF averaging kernels are usually 0.5 below 200 hPa.
In the stratosphere, we once again make use of tracer–tracer relationships. HF is produced by reaction of fluorine atoms from photolysis of and (which are the products of destruction of CFC-11, CFC-12, and HFC-22) with , , or . Thus, and mole fractions are tightly anticorrelated in the stratosphere. Previous studies
We follow a similar approach to ; we determine the : slope () and directly compute the mole fraction from the mole fraction as
12 where is the stratospheric boundary condition determined from the MLO and SMO record, as described in Sect. .
Because of the time dependence in the ratio of methane to the long-lived fluorine-containing gases in the troposphere and because of the non-uniform ratio of the lifetime of and the CFCs in the stratosphere, the slope depends on both time and latitude . Before the beginning of the ACE-FTS data set in 2004, we use : slopes reported in . From 2004 on, we bin ACE-FTS and data into the same three latitude bins (tropics, midlatitudes, and polar vortex) as for the age spectra (Sect. ). We filter for ppb and ppb and limit to altitudes 70 km. The limit on is imposed for the same reason as in Sect. ; the limit on ACE-FTS is imposed due to erroneously large values of 200 ppb found in rare cases (despite only using data with and quality flags 1). A 10 ppb upper limit was determined to only exclude these extraordinary values. The : slopes were fit as in using a robust fit with Tukey's bi-weighting function.
Finally, we combine the ACE-FTS-derived slopes with those from and fit the change over time with an exponential. This allows us to extrapolate forward or backward in time as needed. Each latitude bin has its own exponential fit that fits the bin-specific ACE-FTS slopes and the slopes (all bins used the same , data). For consistency, we always take the slope from the exponential fit. The slope values and the exponential fits are shown in Fig. .
Figure 11
(a) : slopes and the exponential fits over the entire time period with data, be it from (RW03) or ACE-FTS. (b) Similar to (a) but zoomed in on the ACE-FTS time period and colored by latitude bin.
[Figure omitted. See PDF]
Therefore, for each overworld level ( K), a mole fraction is calculated (following Sect. ), and the : slope for the year and latitude bin (based on equivalent latitude, Sect. ) is used in Eq. () to compute the mole fraction. Note that we use the slope for the year of the observation and not the year the air entered the stratosphere because the slopes are based on observations for specific years.
3.6
3.6.1 Troposphere
With a shorter tropospheric lifetime (on the order of months) than the above gases, CO requires a custom treatment in order to adequately account for its spatial variability. The GEOS FP-IT product contains a CO forecast that shows reasonable skill in comparison to QCLS measurements taken during the ATom campaigns . We therefore adopt the GEOS FP-IT CO product as the base profile for the CO priors with the following modifications.
First, our comparison against the first three ATom campaigns shows a low bias in the GEOS FP-IT CO mole fractions, as seen in Fig. a. While there is some variation with latitude, the pattern was not sufficiently clear to lend itself to a robust correction; therefore, we multiply the troposphere CO mole fractions by 1.23 () to bring them in line with ATom observations.
Figure 12
(a) Comparison of colocated ATom-measured and GEOS-FP-IT-forecasted CO mole fractions. GEOS FP-IT CO matched to ATom observations using 4D nearest-neighbor interpolation. The fit is a robust fit using a Tukey biweight function with no intercept, i.e., using the
[Figure omitted. See PDF]
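A robust no-intercept fit of the kind used to derive the 1.23 scaling factor can be sketched with iteratively reweighted least squares. This is an illustrative implementation under stated assumptions, not the fitting code used in this work: the tuning constant 4.685 and the residual scale (median absolute residual divided by 0.6745) are the conventional choices for Tukey's biweight, and the function name is hypothetical.

```python
from statistics import median


def tukey_biweight_slope(x, y, c=4.685, n_iter=50, tol=1e-10):
    """Slope m of y = m*x (no intercept) by IRLS with Tukey biweight weights."""
    # Ordinary least squares through the origin as the starting point.
    m = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    for _ in range(n_iter):
        resid = [yi - m * xi for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0.0:
            break  # at least half the points lie exactly on the line
        weights = []
        for r in resid:
            u = r / (c * scale)
            weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
        denom = sum(w * xi * xi for w, xi in zip(weights, x))
        if denom == 0.0:
            break
        m_new = sum(w * xi * yi for w, xi, yi in zip(weights, x, y)) / denom
        converged = abs(m_new - m) < tol
        m = m_new
        if converged:
            break
    return m
```

Here `x` would be the model CO matched to the observations and `y` the measured CO, so that the returned slope is the multiplicative correction applied to the model.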
3.6.2 Stratosphere
Comparison with ACE-FTS data in the lower stratosphere also demonstrates a low bias that varies with altitude. However, the general structure is consistent as a function of potential temperature relative to the tropopause, as seen in Fig. b. This can be represented by an exponential function.
Therefore, the overall CO correction has the form shown in Fig. . Below the tropopause, the 1.23 factor derived from ATom is used, while above 380 K (i.e., the stratospheric overworld) the exponential form derived from ACE-FTS is used. In the middleworld, we linearly blend between the two functions in order to provide a smooth transition.
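The piecewise form of this correction can be sketched as below. The stratospheric correction is passed in as a callable because its fitted exponential coefficients are not reproduced here; the function name, argument names, and the choice to blend linearly in potential temperature are assumptions made for illustration.

```python
def co_scale_factor(theta, theta_trop, strat_func,
                    theta_overworld=380.0, trop_factor=1.23):
    """Multiplicative CO correction at potential temperature theta (K).

    theta_trop      : potential temperature of the tropopause (K)
    strat_func      : callable giving the ACE-FTS-derived factor at theta
                      (its exponential coefficients are not specified here)
    theta_overworld : lower bound of the stratospheric overworld (K)
    """
    if theta <= theta_trop:
        return trop_factor
    if theta >= theta_overworld:
        return strat_func(theta)
    # Middleworld: weight moves linearly from the tropospheric factor at the
    # tropopause to the stratospheric function at 380 K.
    w = (theta - theta_trop) / (theta_overworld - theta_trop)
    return (1.0 - w) * trop_factor + w * strat_func(theta)
```

If the tropopause lies at or above 380 K, the middleworld branch is never reached and the factor switches directly from the tropospheric to the stratospheric form.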
Figure 13
The form of the CO bias correction scaling factor. The blue and red lines show the form derived from ATom and ACE-FTS data, respectively, while the black line shows the blending of these two corrections. Note that the ATom line is extended up to 380 K for reference and does not imply that ATom collected data into the mid-stratosphere.
[Figure omitted. See PDF]
The second correction required concerns the intrusion of mesospheric CO into the stratosphere. In the mesosphere, very large mixing ratios of CO are produced through photolysis of . As this descends (especially in the polar vortex), it can lead to very large CO mole fractions at altitudes as low as 40 km. This process is not captured in the GEOS FP-IT product but is represented in the Canadian Middle Atmosphere Model (CMAM), which compares well with ACE-FTS and MLS data . Here we use output from a version of CMAM run with dynamics specified
Comparison of GEOS FP-IT with ACE-FTS data shows the mesospheric CO impact beginning around 30 hPa and becoming dominant by 10 hPa. Therefore, we replace the GEOS FP-IT CO with CMAM CO above 10 hPa (i.e., at pressure 10 hPa) and linearly interpolate from GEOS FP-IT to CMAM in pressure–log space between 30 and 10 hPa. The CMAM CO is drawn from a monthly climatology constructed from the monthly averaged CO DMFs in the 30-year CMAM model run (available at
The third and final correction accounts for the mesospheric CO itself. While the priors used in TCCON retrievals have a 70 km ceiling, the CO above that altitude in the CMAM model can comprise up to 2.5 % of the total column, particularly in the polar regions. To account for this in the prior, we add an equivalent mass of CO to the top level of the priors. This is detailed in Sect. S4 of the Supplement.
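The transition from GEOS FP-IT to CMAM CO between 30 and 10 hPa, described above, can be sketched as a weighted average whose weight varies linearly in the logarithm of pressure. That reading of "pressure–log space" is an assumption, as are the function and argument names.

```python
import math


def blend_co(p_hpa, co_geos, co_cmam, p_bottom=30.0, p_top=10.0):
    """CO at pressure p_hpa, moving from GEOS FP-IT to CMAM with altitude.

    At p >= p_bottom the GEOS FP-IT value is used, at p <= p_top the CMAM
    value; in between, the weight is linear in ln(p) (assumed).
    """
    if p_hpa >= p_bottom:
        return co_geos
    if p_hpa <= p_top:
        return co_cmam
    w = (math.log(p_bottom) - math.log(p_hpa)) / (math.log(p_bottom) - math.log(p_top))
    return (1.0 - w) * co_geos + w * co_cmam
```

With this weighting, the two products contribute equally at the geometric mean of the two bounding pressures (about 17.3 hPa).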
3.7 and
The profile is computed directly from the GEOS FP-IT specific humidity. The HDO profile is directly computed from the profile as 13 where and are the DMFs of and , respectively. In the GGG retrieval, the line intensities of isotopologues are multiplied by the isotope abundance. This form therefore does not need to reproduce the abundance of but instead just the decrease in relative to with altitude due to Rayleigh fractionation . While reading the priors, GGG takes the absolute value of the HDO DMF to eliminate negative DMFs resulting from . In versions of ginput after 1.1.4, the absolute value of the HDO DMF is output.
4 Use of priors for OCO-2 and OCO-3
The Orbiting Carbon Observatory 2 (OCO-2) and OCO-3 retrievals use these priors starting in their respective version 10 products. The version 10 products use this algorithm exactly as described above except for one small change: in Eq. (), is geographic, rather than effective, latitude. This difference ensures a smooth latitudinal variation in . Using effective latitude introduced discontinuities near the Equator (Fig. S17a).
The specific structure of the discontinuities in Fig. S17a arises because version 10 of the OCO-2/3 algorithm uses an earlier version of the priors algorithm than GGG2020; in this earlier version, rather than transition between geographic latitude and effective latitude between 20 and 25, effective latitude was used for profiles at all latitudes but disallowed from crossing the Equator (i.e., a profile in the Northern Hemisphere could not have an effective latitude in the Southern Hemisphere and vice versa).
Switching the version 10 priors to use geographic latitude for all soundings trades some ability to capture day-to-day variation in the troposphere for guaranteed spatially smooth priors (Fig. S17b), which is well worth it for nadir-viewing instruments such as OCO-2 and OCO-3. In contrast, for discrete measurement sites such as TCCON, the ability to capture day-to-day variations is preferred.
The OCO-2/3 version 11 priors introduced an additional change to allow more frequent updating of the input in situ data. GGG2020 and OCO-2/3 version 10 use a static file of MLO and SMO data as input that contains monthly averages of flask data prepared by NOAA up through the end of 2018. These records are extended by extrapolation (see Sect. ) as needed. This has the virtue of simplicity but cannot capture anomalies in the trend of such as those caused by El Niños.
The OCO-2/3 version 11 algorithm switched to using hourly in situ data from the continuous trace gas analyzers stationed at MLO and SMO NOAA observatories that has undergone preliminary quality control but not full background selection by NOAA personnel. These hourly in situ data are preprocessed by the priors code to produce monthly averages, allowing the main algorithm to use either monthly flask or hourly in situ data as needed. The preprocessing algorithm is described in Sect. S5 of the Supplement.
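The core of such a preprocessing step, reducing hourly values to monthly means, can be sketched as follows. This shows only the averaging; the actual algorithm is the one described in the Supplement and may include additional screening. The function name and the use of `None` for missing hours are assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def monthly_means(times, values):
    """Average hourly values into {(year, month): mean}, skipping missing data."""
    buckets = defaultdict(list)
    for t, v in zip(times, values):
        if v is not None:
            buckets[(t.year, t.month)].append(v)
    return {key: mean(vals) for key, vals in sorted(buckets.items())}
```

Because the output has the same monthly structure as the flask-based input file, either source could then be passed to the same downstream code.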
5 Validation
5.1 Comparison with aircraft and AirCore observations
To directly validate the GGG2020 priors, we use aircraft data from the NOAA GLOBALVIEWplus v5.0 Obspack , NOAA GLOBALVIEWplus v2.0 ObsPack , and the Infrastructure for Measurement of the European Carbon Cycle (IMECC) campaign , as well as AirCore profiles from NOAA routine and campaign balloon flights
Figure 14
Root-mean-square error (RMSE) of (a) , (b) , and (c) priors versus combined AirCore and aircraft observations. Data sources are listed in Tables S1 and S2. In each panel, both the GGG2020 and GGG2014 priors' RMSEs are shown. The number of profiles contributing to each panel is printed above the panel. FMI/RUG Sodankylä AirCore data above 20 km altitude are not included due to anomalously high mixing ratios in CO. and data above 20 km are also excluded for consistency.
[Figure omitted. See PDF]
Figure shows the root-mean-square error (RMSE) for each vertical level of both the GGG2014 and GGG2020 priors. Mean and individual profile errors are given in Fig. S10. A breakdown of the number of profiles by gas and source is given in Table S4.
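For reference, a per-level RMSE of this kind can be computed as in the sketch below, which assumes that priors and observations have already been interpolated to a common set of vertical levels and that missing observations are marked with `None`. The function name is hypothetical.

```python
import math


def rmse_by_level(prior_profiles, obs_profiles):
    """RMSE of prior minus observation at each vertical level.

    Both inputs are lists of profiles on the same levels; None marks a level
    with no data in a given profile. Levels with no valid pairs return None.
    """
    n_levels = len(prior_profiles[0])
    result = []
    for k in range(n_levels):
        diffs = [
            p[k] - o[k]
            for p, o in zip(prior_profiles, obs_profiles)
            if p[k] is not None and o[k] is not None
        ]
        if diffs:
            result.append(math.sqrt(sum(d * d for d in diffs) / len(diffs)))
        else:
            result.append(None)
    return result
```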
For , the RMSE is noticeably smaller at all altitudes for the GGG2020 priors compared to the GGG2014 priors (Fig. a). This results from removing a small but clear negative bias throughout the troposphere arising from an underestimate of the secular growth rate in GGG2014. Using the MLO and SMO data eliminates that as a source of uncertainty for profiles before 2019 (2019 is the first year that the MLO and SMO trend is extrapolated for GGG2020 as we chose to use a static file to avoid the complications of updating the input data in a reliable, reproducible manner, as discussed in Sect. ). In the stratosphere (above 200 hPa), the improved representation of stratosphere dynamics (Sect. ) better captures the gradient of in the lower stratosphere, reducing the previous overestimate of lower stratospheric in the GGG2014 priors.
The RMSE for the GGG2020 priors is still greater near the surface than at higher altitudes. This may be due to the simplified seasonal cycle (Sect. ). Comparing the priors to ATom and HIPPO observations in different seasons (Fig. S8) shows large differences near the Northern Hemisphere surface in spring and summer. As the seasonal cycle has latitudinal dependence, revising its parameterization will require adjustment to the distance function (Eq. ) and the and coefficients (Table ). This area will be revisited in a future version of the GGG priors.
shows a small improvement in RMSE throughout most of the troposphere (Fig. b, 800 to 200 hPa). Above 200 hPa, the RMSE shows a greater improvement, again due to the improved representation of stratospheric dynamics. However, near the surface (below 800 hPa) the RMSE increases somewhat in the GGG2020 priors compared to the GGG2014 priors. This increase in RMSE is driven by near-surface emissions not accounted for in the priors. Figure a shows differences in the priors versus AirCore data (which has frequent sampling of areas with high emissions), colored by which TCCON site the prior represents. The bias in below 800 hPa is clearly due to underestimated in the Lamont, OK, profiles. The Lamont TCCON site is situated near a region of significant oil and natural gas production , and it thus experiences enhanced mole fractions of 100 to 200 ppb near the surface (Fig. S13). Neither the GGG2014 priors nor the GGG2020 priors attempt to account for local anthropogenic emissions. The increase in RMSE near the surface in the GGG2020 priors is due to the removal of a compensating error in assumed vertical gradients – introducing the tropospheric effective latitude (Sect. ) accounts for times when Lamont has a profile that varies less with altitude due to the influence of tropical air.
Figure 15
Difference plots for GGG2020 priors versus (a) and (b) AirCore data. The thinner, colored lines represent differences for individual profiles, and the thick black line indicates the mean difference across all profiles shown. The individual differences are colored by their TCCON site.
[Figure omitted. See PDF]
The GGG2020 priors' RMSE improves throughout the free troposphere (600 to 200 hPa). Unlike and , RMSE is similar between GGG2014 and GGG2020 in the stratosphere (above 200 hPa). Near the surface, the GGG2020 priors' RMSE is 20 ppb greater than GGG2014. Figure b shows that this is driven by overestimated at the Armstrong Air Force Base (AFB) TCCON site and both overestimated and underestimated CO at the Lamont TCCON site.
The cause of the overestimates and underestimates in the Lamont profiles is not clear. The GGG2020 CO profiles are based on the CO field in the GEOS FP-IT product (Sect. ). The underestimated CO DMFs could be due to changes in energy economies in the region in recent years . GEOS FP-IT uses 2008 anthropogenic CO emissions for all years after 2008 (Lesley Ott, personal communication, 2019), and thus the CO priors would have no information on changes past 2008.
The overestimated at Armstrong AFB is due to its proximity to Los Angeles. CO emissions in Los Angeles have been decreasing , a trend not captured in GEOS FP-IT as 2008 emissions are repeated for all years after 2008. Additionally, given that the GEOS FP-IT model resolution is 0.67° × 0.5° (longitude × latitude), that the topography of the Los Angeles Basin is complex, and that Armstrong AFB is only 0.8° north of Los Angeles, the model is likely not able to capture the full separation of Los Angeles and Armstrong profiles.
Outside of urban or energy-intensive locations, the agreement between the new GGG2020 priors and colocated in situ profiles is much improved. Figure S15 compares RMSEs and mean prior versus in situ differences for CO when Armstrong AFB, Lamont, and Orléans (another near-urban location) are excluded from the comparison. In that case, the RMSE reduces by about a factor of 2 or better at all levels except the surface in the new GGG2020 priors compared to the GGG2014 priors.
We compared CO profiles from the GEOS FP-IT product to the Copernicus Atmospheric Monitoring Service (CAMS) model to see if this issue of overestimated CO is common among models. The results for 2018 through 2022 are shown in Fig. S16. In general, GEOS FP-IT CO is dramatically greater than CAMS CO in Los Angeles (at the Pasadena TCCON site). This is also true at Armstrong AFB but to a lesser extent. In Paris, both models exhibit very high surface CO on some of the sampled days, though this was more frequent in the GEOS FP-IT CO profiles. At Lamont and East Trout Lake, both models had CO DMFs of similar magnitude (even with our factor of 1.23 scaling applied to the GEOS FP-IT data), with the main difference being in vertical distribution. While the factor of 1.23 applied to bring the GEOS FP-IT CO in line with ATom observations (Fig. ) definitely aggravates the GEOS FP-IT overestimate in urban areas, it improves the mean CO in more remote areas. In the future, drawing CO profiles from a model that better represents urban–rural CO gradients would improve the CO priors but requires an existing model run that also covers the full range of times needed by TCCON (from 2004 on).
Despite the increase in RMSE near the surface, overall the CO priors demonstrate important improvement. The reduction in error in the mid-troposphere will be very beneficial to TCCON retrievals, as the CO averaging kernels increase with altitude up to the tropopause. Therefore, the retrievals are more sensitive to errors in the upper troposphere than at the surface. We performed a sensitivity test where we retrieved 1 year of XCO at Armstrong using two sets of priors. We found that the sensitivity of the retrieved XCO to the surface CO in the prior was small, with only a 0.024 ppb change in XCO per 1 ppb change in surface prior CO (2.4 %, Fig. S14c).
5.2 Indirect validation through retrievals
We can also evaluate the quality of the priors indirectly using the TCCON retrievals themselves. TCCON uses a scaling retrieval, in which the prior profiles are multiplied by scalar volume mixing ratio scale factors (VSFs) until the optimal match between the forward spectroscopic model and measured spectrum is found. A VSF near 1 usually indicates that the prior profile represented the true atmospheric column abundance well (provided that the forward model spectroscopy is accurate), though compensating errors could also yield a VSF near 1. However, given that the direct validation shown in Sect. does not show compensating positive and negative biases on average, we expect such compensating errors are unlikely.
Figure shows VSFs for HF and . Figure a shows that the median HF VSF decreased from 1.25 in GGG2014 to 0.94 in GGG2020, and the distribution is substantially tighter. HF is found only in the stratosphere ; therefore, this result provides additional evidence that the stratosphere is well modeled by the GGG2020 priors.
Figure 16
Volume mixing ratio (VMR) scale factors (VSFs) of (a) HF and (b) retrieved using GGG2014 and a preliminary version of GGG2020. The vertical dashed gray line marks a VSF value of 1.
[Figure omitted. See PDF]
Figure b shows that VSFs moved slightly closer to 1 in GGG2020 with a tighter distribution. is well mixed in the troposphere with an extremely uniform mixing ratio but varies substantially in the stratosphere due to loss via photolysis. Again, this implies improvement in the stratospheric priors and is a valuable check, as we did not directly validate against aircraft or AirCore observations due to sparse profiles over TCCON stations.
Finally, we also consider the interhemispheric bias in and VSFs. For , found a 1 % bias between Northern Hemisphere and Southern Hemisphere VSFs using GGG2014 data, and determined that this was because the GGG2014 priors assumed a smooth DMF profile across the tropopause. In fact, the gradient in the lower stratosphere is driven by stratospheric circulation and entering through the tropics (Sect. ). As the priors now correctly account for this, the underlying error driving the interhemispheric bias in tropospheric in should now be eliminated, and in fact the difference between median VSFs between the Northern Hemisphere and Southern Hemisphere has reduced by nearly 50 % (Fig. S11).
For , the difference between median Northern Hemisphere and Southern Hemisphere VSFs remains nearly the same magnitude ( 0.4 %, Fig. S12) but flips with the GGG2020 priors such that the median VSF is now greater in the Southern Hemisphere. Figure S12c compares the surface DMFs from six NOAA stations against the surface DMFs in the priors for five TCCON sites. While the priors' surface in the Southern Hemisphere is approximately correct, there is a high bias in the Northern Hemisphere, possibly due to an incorrect assumed tropospheric lifetime (Sect. ) or a need for an additional correction to our distance function (Sect. ) that was not identified during development. This will be corrected in a future version of the TCCON priors.
5.3 Impact on retrieved Xgas values
Figure shows how the bias of the Xgas value retrieved by TCCON relative to in situ profiles changes between using the priors from the previous GGG2014 data version and using the new priors described in this paper. For this comparison, we used only AirCore profiles, as these profiles extend into the lower stratosphere and therefore require the least extension to produce a total column profile suitable for comparison to TCCON. We follow in applying TCCON averaging kernels and pressure-weighted integration to the AirCore profiles to produce an in situ Xgas value for comparison to TCCON.
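The smoothing and integration step can be sketched as follows, using the usual form for comparing a column retrieval with an in situ profile: each layer contributes the prior plus the averaging kernel times the in situ minus prior difference, weighted by its share of the column. The pressure weights here are a simplification (layer pressure thickness only, ignoring the gravity and water vapor terms of a full dry-air weighting), and the function names are hypothetical.

```python
def pressure_weights(p_edges):
    """Layer weights proportional to pressure thickness (simplified).

    p_edges : layer-edge pressures, surface to top. A full treatment would
    also account for gravity and water vapor in the dry-air column.
    """
    dp = [abs(p_edges[i] - p_edges[i + 1]) for i in range(len(p_edges) - 1)]
    total = sum(dp)
    return [d / total for d in dp]


def smoothed_column(prior, insitu, ak, weights):
    """Column average the retrieval would report for the in situ profile.

    prior, insitu : DMFs per layer
    ak            : column averaging kernel per layer
    weights       : pressure weights per layer (sum to 1)
    """
    return sum(
        h * (xa + a * (x - xa))
        for h, xa, x, a in zip(weights, prior, insitu, ak)
    )
```

With an averaging kernel of 1 at every level, the result reduces to the pressure-weighted mean of the in situ profile; with a kernel of 0, it reduces to the prior column.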
Figure 17
Impact of the new priors on the retrieved TCCON Xgas values compared to coincident AirCore profiles. The axis shows how the difference between the retrieved TCCON Xgas value and an averaging-kernel-smoothed and integrated in situ profile changes between using the GGG2020 priors described in this paper versus the previous GGG2014 priors. A negative value indicates a reduction in bias compared to in situ with the new priors; the percentage in the title indicates what fraction of the comparisons had reduced bias. The vertical dashed black line marks 0 on the axis. Each panel is a different TCCON Xgas product; and are experimental TCCON products added in GGG2020 that are more sensitive to the upper atmosphere and near surface, respectively, than the standard TCCON .
[Figure omitted. See PDF]
For the TCCON products, the differences are on the order of 0.05 to 0.1 ppm. Only about half of the comparisons show improvement; this is true for both the standard TCCON (Fig. , top left) and two experimental products introduced in GGG2020 with different vertical sensitivities ( and , Fig. , top middle and top right).
CO worsened on the whole (Fig. , bottom right) but by less than 1 ppb. However, this only includes three comparisons at the Armstrong site (most of the comparisons from Fig. are from aircraft profiles, and we only use the AirCore profile here as mentioned above), where the new priors have a known bias (see Sect. ) and none at the Pasadena site (as it is difficult to obtain profiles safely over urban sites), which is more strongly affected by the same issue. Thus, we consider 1 ppb a lower bound on the bias introduced at these sites by the overestimated near-surface CO in the priors.
shows the clearest improvement (Fig. , bottom left). Almost 80 % of comparisons show reductions in bias relative to the AirCore profiles of up to 13.6 ppb. This likely comes from a combination of the new priors' improved representation of the gradient around the tropopause and the general reduction in bias through the free troposphere (Fig. ).
6 Conclusions
GGG2020 introduces an improved algorithm to generate the prior profiles of , , , HF, CO, and other gases needed for TCCON retrievals. The version 10 and version 11 OCO-2 and OCO-3 retrievals also use these profiles. This approach is specifically designed to account for variations in vertical profiles due to synoptic-scale latitudinal motion of air masses. Direct validation against aircraft and AirCore observations shows consistent reduction in error in the free troposphere and lower stratosphere, and indirect validation by examining the magnitude of retrieved TCCON VSFs gives further evidence that the accuracy of the priors in the stratosphere has improved.
The column-average mole fractions retrieved by TCCON shift relative to in situ column averages by up to 0.2 ppm for , 13 ppb for , and 1 ppb for CO. For the standard TCCON , , and experimental ( with stronger sensitivity to the surface) products the new priors produce an overall improvement compared to the in situ column averages. The CO and experimental (stronger sensitivity to the upper atmosphere) products compare slightly worse overall to in situ data using the new priors. For CO, this is likely due to overestimated anthropogenic CO emissions in the source model. Finding a way to correct this, either by using a different model run or by applying a geographically varying correction, will be a high priority for the next version of the TCCON priors. The reason for the slight worsening of the retrievals is not yet clear.
An important guiding principle for the GGG2020 priors algorithm was to limit dependence on ongoing measurements or models as much as possible. Doing so means that retrievals using these priors produce data that can be treated as statistically independent of most existing and future measurements and models. Only , , and measurements from the Mauna Loa and American Samoa observatories and CO from the GEOS FP-IT model system are directly ingested, meaning that direct comparisons of TCCON GGG2020 or OCO-2/3 data with these data sources would not be fully independent. As latitudinal gradients from the HIPPO and ATom campaigns and correlations of , , and from the ACE-FTS instrument are used as well, comparisons between TCCON or OCO-2/3 and HIPPO, ATom, or ACE-FTS data should note that these specific characteristics (i.e., latitudinal gradients, correlations among , , and ) are correlated by design.
There are still clear areas for improvement. The age of air parameterization used in the troposphere is known to underestimate the age of air compared to measurements, and anthropogenic emissions are not accounted for except in the CO priors. Addressing these issues is planned for a future version of GGG; at that time, we will evaluate whether incorporating additional data from measurements or models produces worthwhile improvements in the priors' accuracy. Nevertheless, these new priors represent a significant improvement for the GGG2020 TCCON retrieval.
Code and data availability
The code to generate GGG2020 prior profiles is the “ginput” package, available from GitHub (10.22002/D1.20285; ). GGG2020 TCCON data use ginput version 1.0.6, which is scientifically identical to the publicly archived 1.0.7 version (10.22002/D1.1880; ). HIPPO data, provided by NCAR/EOL under the sponsorship of the National Science Foundation (
The supplement related to this article is available online at:
Author contributions
JLL created the priors code, carried out the validation, and led the writing of the manuscript. SR developed the code to read GEOS-FPIT meteorology and interpolate to TCCON locations. MK assisted with development of the priors. GCT developed the original GGG priors, of which the climatological profiles used for the secondary gases (Sect. ), seasonal cycle parameterization, and tropospheric distance function are retained in this work. POW guided the overall project. Other authors contributed data for validation of the priors. All authors reviewed the manuscript.
Competing interests
At least one of the (co-)authors is a member of the editorial board of Atmospheric Measurement Techniques. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.
Disclaimer
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements
The authors are deeply grateful to Arlyn Andrews from NOAA for providing the approach to generate the stratospheric , which was adapted and expanded upon for the other stratospheric priors, as well as the tables of stratospheric age of air used throughout this work. The GEOS data used in this study/project have been provided by the Global Modeling and Assimilation Office (GMAO) at NASA Goddard Space Flight Center. Joshua L. Laughner and Paul O. Wennberg acknowledge funding from NASA grant NNX17AE15G. We thank all TCCON partners for carrying out the GGG2014 and GGG2020 retrievals that provided the VSFs for Sect. . The AirCore campaign in Cyprus received support by the European Union's Horizon 2020 Research and Innovation Programme under grant agreement no. 856612 and the government of Cyprus and additional support from the Service National d'Observation ICOS-France-Atmosphere coordinated by LSCE. The TCCON Nicosia site has received additional support from the European Union's Horizon 2020 Research and Innovation Programme (EMME-CARE) under grant agreement no. 856612, the government of Cyprus, and the University of Bremen. The tower measurements at Park Falls were supported by the DOE Ameriflux Network Management project award to the ChEAS core site cluster (NSF nos. 0845166 and 1822420). NOAA/GML AirCore profiles were supported by NASA (grant no. 80NSSC18K0898). The authors are grateful to Peter Bernath for his leadership of the ACE-FTS project, which was invaluable to this work. The Atmospheric Chemistry Experiment (ACE), also known as SCISAT, is a Canadian-led mission mainly supported by the Canadian Space Agency. A portion of this research was carried out at the Jet Propulsion Laboratory (JPL), California Institute of Technology, under a contract with NASA (80NM0018D0004). Government sponsorship is also acknowledged.
Financial support
This research has been supported by the National Aeronautics and Space Administration (grant nos. 80NM0018D0004, NNX17AE15G, and 80NSSC18K0898), the National Science Foundation (grant nos. 0845166 and 1822420), and Horizon 2020 (EMME-CARE, grant no. 856612). Additional support has been provided by the government of Cyprus and the University of Bremen.
Review statement
This paper was edited by Jian Xu and reviewed by Zhao-Cheng Zeng and one anonymous referee.
© 2023. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Optimal estimation retrievals of trace gas total columns require prior vertical profiles of the retrieved gases to drive the forward model and ensure that the retrieval problem is mathematically well posed. For well-mixed gases, it is possible to derive accurate prior profiles using an algorithm that accounts for general patterns of atmospheric transport coupled with measured time series of the gases in question. Here we describe the algorithm used to generate the prior profiles for GGG2020, a new version of the GGG retrieval that is used to analyze spectra from solar-viewing Fourier transform spectrometers, including the Total Carbon Column Observing Network (TCCON). A particular focus of this work is improving the accuracy of
; Roche, Sébastien 2; Kiel, Matthäus 1; Toon, Geoffrey C 1; Wunch, Debra 3; Baier, Bianca C 4; Biraud, Sébastien 5; Chen, Huilin 6; Kivi, Rigel 7; Laemmel, Thomas 8; McKain, Kathryn 4; Quéhé, Pierre-Yves 9; Rousogenous, Constantina 9; Stephens, Britton B 10; Walker, Kaley 3; Wennberg, Paul O 11
1 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
2 Department of Physics, University of Toronto, Toronto, Canada; now at: School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
3 Department of Physics, University of Toronto, Toronto, Canada
4 Global Monitoring Laboratory, National Oceanic and Atmospheric Administration, Boulder, CO, USA; Cooperative Institute for Research in Environmental Sciences, University of Colorado – Boulder, Boulder, CO, USA
5 Climate Sciences Department, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
6 Center for Isotope Research, University of Groningen, Groningen, the Netherlands
7 Space and Earth Observation Centre, Finnish Meteorological Institute, Sodankylä, Finland
8 Laboratoire des Sciences du Climat et de l'Environnement (LSCE/IPSL), UMR CEA-CNRS-UVSQ, Gif-sur-Yvette, France; now at: Department of Chemistry, Biochemistry and Pharmaceutical Sciences, University of Bern, Bern, Switzerland
9 Climate and Atmosphere Research Centre (CARE-C), The Cyprus Institute, Nicosia, Cyprus
10 Earth Observing Laboratory, National Center for Atmospheric Research (NCAR), Boulder, CO, USA
11 Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA; Division of Engineering and Applied Science, California Institute of Technology, Pasadena, CA, USA





