Herein, we describe an approach to retrieve free tropospheric columns of peroxyacyl nitrates (PANs) from radiances observed by the Atmospheric Infrared Sounder (AIRS). AIRS has provided daily global coverage since its launch in 2002, making the AIRS data a valuable long term record. Although the instrument is very radiometrically stable, the radiance noise level is large enough to present a challenge when retrieving a weak absorber such as PAN. To address this, spectral windows were selected to minimize interference from other species as much as possible and a set of filters was developed to predict whether a PAN value retrieved from AIRS is within 0.2
1 Introduction
Acyl peroxy nitrates (APNs) are a family of air pollutants formed by the reaction of a peroxy radical with NO₂. Peroxyacetyl nitrate (PAN, CH₃C(O)OONO₂) is the most commonly considered member of this family, resulting from the reaction of a peroxyacetyl radical with NO₂. PAN exists in equilibrium with its reactants and is more stable at colder temperatures. Because of this, PAN often acts as a temporary reservoir of nitrogen oxides (NOₓ), enhancing long-range transport of NOₓ to downwind regions
PAN is neither a criteria air pollutant nor a designated hazardous air pollutant by the United States Environmental Protection Agency or the World Health Organization . As a result, routine in situ monitoring of PAN is rare. However, targeted campaigns such as the Arctic Research of the Composition of the Troposphere from Aircraft and Satellites
Techniques for remote sensing of PAN have been developed in the last two decades. PAN has very similar absorption features to other members of the APN chemical family
PAN has been retrieved from ground-based instruments as well as limb- and nadir- viewing space-based platforms. Several sites in the Network for Detection of Atmospheric Composition Change (NDACC) perform retrievals of PAN from ground-based spectra . From space, PAN in the upper troposphere/lower stratosphere has been retrieved from limb measurements from CRyogenic Infrared Spectrometers and Telescopes for the Atmosphere
Consistent records of atmospheric trace gas concentrations are essential to monitor how air quality is changing over time. A major challenge in this respect is addressing instrument differences among satellites to produce records spanning multiple decades. The Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS) product invested significant effort in applying a consistent retrieval to radiances from both the Atmospheric Infrared Sounder (AIRS) and the various CrIS instruments as well as minimizing cross-correlations between retrieved variables . discuss an information content approach to minimizing differences in CLIMCAPS retrievals. CLIMCAPS produces records spanning the more than two decades since AIRS launched in 2002 that include profiles of atmospheric temperature, , , , , , and , but does not include PAN.
The TRopospheric Ozone and its Precursors from Earth System Sounding (TROPESS) project also focuses on applying a consistent retrieval algorithm for various trace gases to radiances from a variety of instruments. This includes thermal radiances observed by AIRS and CrIS, as well as radiances in other parts of the electromagnetic spectrum from the Ozone Monitoring Instrument (OMI) and, in the future, the TROPOspheric Monitoring Instrument (TROPOMI). demonstrated the capability with TROPESS to retrieve from both AIRS and CrIS. They validated from both instruments against aircraft data and found that, although the retrievals from the two instruments are broadly similar, there are differences in the agreement with aircraft profiles. However, after accounting for the smoothing errors, the biases fall below 1 . evaluated trends in three TROPESS products using thermal radiances from AIRS and CrIS and combined thermal and ultraviolet radiances from AIRS and OMI. They compared these products to ozonesonde data, and found that trends in the bias of the retrieved were significantly less than the reported trends.
The ability to retrieve tropospheric columns of PAN from space has enabled scientific studies of various sources of air pollution. Several studies made use of the TES PAN retrievals to investigate factors driving PAN over Eurasia and the tropics as well as the prevalence of PAN in smoke-impacted air masses over North America . Other studies found that a combination of seasonal temperature, lightning, biomass burning, and microbial emissions influenced the PAN outflow from Eurasia, while found that the dominant factors in the tropics were biogenic emissions and lightning, with some influence from biomass burning during the study period. used PAN retrieved from CrIS to quantify the chemical production of PAN in the outflow from the Pole Creek Fire in central Utah, USA. combined PAN values retrieved from TES and CrIS and found that PAN columns over Mexico City had no trend over a time period when columns decreased. used PAN columns retrieved from CrIS to study whether there were statistically significant changes in PAN amounts over eight megacities during the COVID pandemic. They found a mix of increases, decreases, and no change in PAN columns among the megacities. More recently, used PAN retrieved from IASI to study transport of PANs across the Pacific and concluded that the effect on ozone in the western US was less than 1 . These studies provide examples of how space-based retrievals of PAN, particularly in synergy with other space-based trace gas observations, can provide valuable information about how the meteorological conditions, episodic events, and dominant chemical regime influence air quality in different regions.
Table 1. Comparison of relevant AIRS and CrIS instrument characteristics. Spectral resolution was computed from the L1B files. All other values are from the cited references. NEdT stands for “noise equivalent differential temperature,” and the NEdT value for CrIS was estimated from Fig. 10 of for CrIS full resolution spectra.
| | AIRS | CrIS |
| Spectral resolution at 790 cm⁻¹ (cm⁻¹) | 0.355 | 0.625 |
| Field of view diameter (km) | 15 | 14 |
| Spatial sampling (km) | 13.5 | 15 |
| NEdT (K) | 0.1 to 0.8 | 0.04 |
In this work, we demonstrate the first retrieval of PAN from AIRS. As AIRS was launched in 2002, this has the potential to provide the longest continual record of PAN from a nadir viewing instrument. Our approach is based on that of . We begin with an overview of the AIRS and CrIS instruments, which are both used in this work. Then, we review the MUlti-SpEctra, MUlti-SpEcies, MUlti-Sensors (MUSES) algorithm which provides a retrieval framework for this work. Next, we describe the specific MUSES configuration we use. Fourth, we address several challenges encountered in adapting the approach of to AIRS spectra. Finally, we close with recommendations to users of the new AIRS PAN product. Due to the computational cost of this retrieval, our analysis focuses on a few days with significant variation in PAN from major fires in the US and Australia. This product will be incorporated in the operational TROPESS data processing in the future (https://disc.gsfc.nasa.gov/information/mission-project?keywords=tropess&title=TROPESS, last access: 11 September 2025), which will enable analysis on a longer timeseries of data.
2 Data sources and algorithm background
2.1 AIRS radiances
The Atmospheric Infrared Sounder (AIRS) instrument is carried on board the Aqua satellite. Aqua was launched in May 2002 and flies in a polar, sun synchronous orbit. For most of its mission, it had a local ascending equator crossing time of 13:35 LT. Starting in 2022, it began to drift to a later crossing time; as of early 2025, it has an equator crossing time of 14:30 LT (
AIRS is a grating spectrometer, covering three spectral bands (approximately 650 to 1140 , 1220 to 1610 , and 2170 to 2670 ) with 17 detector arrays and a nominal spectral resolution of 1200
2.2 CrIS radiances and the CrIS PANs product
At the time of writing, there are three operational Cross-track Infrared Sounder (CrIS) instruments. The first is on board the Suomi-NPP satellite, launched in October 2011, followed by copies on the JPSS-1/NOAA-20 and JPSS-2/NOAA-21 satellites, launched in November 2017 and November 2022, respectively. All three are in sun synchronous orbits with ascending local equator crossing times around 13:30 LT. Unlike AIRS, CrIS is a Fourier transform spectrometer that observes nine fields of view in a 3 × 3 array simultaneously. It performs an across-track scan of 30 view positions. The fields of view are 15 in diameter .
used radiances from the CrIS instrument on board Suomi-NPP (S-NPP)
validated the CrIS PANs retrievals against PAN measurements taken during the ATom campaign. The measured profiles had GEOS-Chem profiles appended to the top. From the standard deviation of the differences between CrIS and aircraft free tropospheric PAN column averages, derived a single sounding uncertainty of 0.08 for the CrIS PANs retrieval. This was larger than the uncertainty calculated by the MUSES optimal estimation (OE) algorithm, but attribute the discrepancy to pseudo-random error contributions from the retrieval of interfering species or the temperature profile. Such interferent-driven error was not included in the uncertainty calculated by the MUSES algorithm, as for PAN retrievals, the algorithm calculates uncertainty from noise only.
Further, the comparison with ATom found a negative bias (CrIS lower than aircraft) that correlated with the total column amount of water vapor. The relationship between water vapor and the CrIS PAN bias was further corroborated by examination of the pre-PAN retrieval spectral residuals, which found a positive residual correlated with water vapor column amounts. From the ATom comparisons, derived a bias correction for the CrIS PAN product, , where is the column density of water vapor in and is the correction in .
2.3 MUSES Retrieval
The MUSES retrieval is an optimal estimation retrieval with heritage tracing back to the TES retrieval . It is instrument-agnostic, able to solve for the optimal state vector given radiances from a variety of instruments (e.g., AIRS, CrIS, the Ozone Monitoring Instrument (OMI), and the TROPOspheric Monitoring Instrument (TROPOMI)), or multiple instruments (e.g., AIRS + OMI, CrIS + TROPOMI).
The MUSES algorithm allows retrievals to be broken down into smaller steps, each of which defines the spectral windows for which to minimize the radiance residuals, the atmospheric parameters to solve for, which of those parameters to update for the next step, along with a number of more technical options. The steps are defined in a “strategy table” which can be quickly edited to test different retrieval approaches. This step-wise design provides flexibility to fix some elements of the state vector while updating others in certain steps, which is particularly useful when retrieving state vector elements with large differences in the magnitude of their Jacobian matrices (e.g., atmospheric temperature vs. PAN) or which interfere with each other (e.g., vs. PAN). These steps are run sequentially; the final state of one step becomes the initial state for the next, save for any state vector elements which the strategy table indicates should not be updated.
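Within each step, MUSES uses an iterative solver that applies the trust-region Levenberg–Marquardt scheme to minimize a cost function

$$J(\mathbf{x}) = \left(\mathbf{y} - \mathbf{F}(\mathbf{x}, \mathbf{b})\right)^{T} \mathbf{S}_{\epsilon}^{-1} \left(\mathbf{y} - \mathbf{F}(\mathbf{x}, \mathbf{b})\right) + \left(\mathbf{x} - \mathbf{x}_{a}\right)^{T} \mathbf{S}_{a}^{-1} \left(\mathbf{x} - \mathbf{x}_{a}\right) \tag{1}$$

where
$\mathbf{x}$ is the retrieved state vector,
$\mathbf{x}_{a}$ is the a priori state vector,
$\mathbf{y}$ is the observation vector (i.e., AIRS or CrIS radiances),
$\mathbf{F}$ is the forward model that simulates radiances given the state vector and fixed parameters ($\mathbf{b}$),
$\mathbf{S}_{\epsilon}$ is the error covariance matrix for the observed radiance, and
$\mathbf{S}_{a}$ is the prior error covariance matrix.
The Levenberg–Marquardt solver iteratively updates the state vector along a direction in state space expected to minimize Eq. (1). It continues until the convergence criteria are satisfied or the maximum number of iterations is reached.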
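To make the structure of the cost function concrete, the following is a minimal sketch, assuming a toy linear forward model and a basic damped (Levenberg–Marquardt style) update rather than the trust-region variant used in MUSES. All names (`cost`, `lm_retrieve`, `K`, `x_prior`, `x_start`) and all numerical values are hypothetical and are not part of the MUSES code.

```python
import numpy as np


def cost(x, x_a, y, forward, s_eps_inv, s_a_inv):
    """Measurement misfit plus prior penalty, each weighted by an inverse covariance."""
    resid = y - forward(x)
    dev = x - x_a
    return float(resid @ s_eps_inv @ resid + dev @ s_a_inv @ dev)


def lm_retrieve(x_init, x_a, y, forward, jacobian, s_eps_inv, s_a_inv,
                damping=1.0, max_iter=50, tol=1e-12):
    """Basic damped minimization of the cost (not the MUSES trust-region solver)."""
    x = np.array(x_init, dtype=float)
    current = cost(x, x_a, y, forward, s_eps_inv, s_a_inv)
    for _ in range(max_iter):
        k = jacobian(x)
        # Negative half-gradient and Gauss-Newton approximation to half the Hessian.
        rhs = k.T @ s_eps_inv @ (y - forward(x)) - s_a_inv @ (x - x_a)
        hess = k.T @ s_eps_inv @ k + s_a_inv
        step = np.linalg.solve(hess + damping * np.diag(np.diag(hess)), rhs)
        trial = cost(x + step, x_a, y, forward, s_eps_inv, s_a_inv)
        if trial < current:
            # Accept the step and relax the damping toward Gauss-Newton.
            improvement = current - trial
            x, current, damping = x + step, trial, damping / 10.0
            if improvement < tol:
                break
        else:
            # Reject the step and increase the damping.
            damping *= 10.0
    return x


# Toy linear forward model F(x) = K x (hypothetical, for illustration only).
K = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_obs = K @ np.array([2.0, -1.0])   # noise-free synthetic "radiances"
S_EPS_INV = np.eye(3) / 0.01        # inverse measurement error covariance
S_A_INV = np.eye(2)                 # inverse prior covariance
x_prior = np.zeros(2)               # a priori (constraint) vector
x_start = np.array([1.0, 1.0])      # initial guess, distinct from the prior

x_hat = lm_retrieve(x_start, x_prior, y_obs, lambda x: K @ x, lambda x: K,
                    S_EPS_INV, S_A_INV)
print(x_hat)
```

Because the toy forward model is linear, the result can be checked against the closed-form optimal estimate; the prior term pulls the solution slightly away from the values used to generate the synthetic observations.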
An important distinction within MUSES is the difference between the a priori (or constraint) state vector and the initial state vector. The former appears in Eq. (1) as a mathematical constraint on the optimal state vector; the latter is the starting point of the state vector before the first iteration of the Levenberg–Marquardt solver. This distinction matters because MUSES is a multi-step retrieval. The strategy table, mentioned above, defines which elements of the state vector will be retrieved in each step and whether or not the retrieved state for one step becomes the initial state for the next step. For example, the retrieval may begin with an profile taken from a meteorological reanalysis as both the initial guess and the a priori constraint. An early step in the retrieval can then retrieve a new profile which is more consistent with the observed radiances. This new profile can then be used as an initial state for later steps (whether or not those steps retrieve ). This can be important for weak absorbers, such as PAN, which need the profiles of strong thermal IR absorbers to be accurate for the scene in question so that the relatively small absorption feature of the weak absorber can be identified. We note that, for a given step, the initial state and a priori constraint can be the same but do not need to be. For later steps of the retrieval, the initial state may have been set by earlier retrieval steps (as in the example given with ) but the a priori constraint will remain the same for all steps. Alternatively, the a priori constraint may be chosen to be a relatively simple profile to avoid imposing undue assumptions, while the initial state may be chosen to reflect a better estimate of the atmospheric state in that location to attempt to minimize the number of steps needed by the solver.
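The bookkeeping described above can be sketched in a few lines. This is an illustrative toy, not the MUSES strategy table format: the step dictionaries, the element names (`TATM`, `H2O`, `PAN`), the scalar values, and the stand-in solver `toy_solve` are all hypothetical. The point is only that the state handed to each step evolves, while the prior stays fixed.

```python
def run_strategy(steps, prior, initial, solve):
    """Run retrieval steps in order, carrying the state forward between steps."""
    state = dict(initial)   # current best estimate; starts at the initial guess
    for step in steps:
        # The solver receives the current state as its starting point and the
        # unchanged prior as its constraint.
        retrieved = solve(step["retrieve"], dict(state), prior)
        if step["update"]:
            state.update(retrieved)
    return state


TRUTH = {"TATM": 260.0, "H2O": 2.0, "PAN": 0.5}   # hypothetical scene
seen_initial_states = []                           # starting points handed to the solver


def toy_solve(elements, start, prior):
    """Stand-in for the optimizer: pulls each element toward TRUTH, tempered by the prior."""
    seen_initial_states.append(start)
    return {name: 0.8 * TRUTH[name] + 0.2 * prior[name] for name in elements}


prior = {"TATM": 250.0, "H2O": 1.0, "PAN": 0.1}
initial = {"TATM": 250.0, "H2O": 1.0, "PAN": 0.3}
steps = [
    {"name": "strong features", "retrieve": ["TATM", "H2O"], "update": True},
    {"name": "residual check", "retrieve": [], "update": False},
    {"name": "PAN", "retrieve": ["PAN"], "update": True},
]

final_state = run_strategy(steps, prior, initial, toy_solve)
print(final_state)
```

In this toy, the PAN step starts from the water value retrieved in the first step, yet the prior dictionary still holds the original water value, mirroring the separation between initial state and constraint.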
MUSES can use different radiative transfer models for the forward model in Eq. (1). For this work, we use version 1.2 of the Optimal Spectral Sampling (OSS) model . OSS is designed to use an optimal set of absorption coefficients (per absorbing species and vertical layers) and weights that can be used to compute the radiance for each channel of a spectrometer very efficiently, given the amounts of each absorbing species. These weights are computed by training OSS against a reference line-by-line spectroscopic model. Determining those optimal absorption coefficients and weights requires it to be trained for a given instrument. In version 1.2, the absorption coefficients are calculated from the Line By Line Radiative Transfer Model (LBLRTM) version 12.4 . This allows OSS to efficiently simulate the radiances a specific instrument would observe by reducing the number of monochromatic wavelengths that must be modeled for a given instrument channel, but means that OSS must be trained for each instrument used in a retrieval separately. For details on the approach, readers are encouraged to review and .
2.4 TROPESS products
The TROPESS project focuses on applying the MUSES algorithm to retrieve a range of atmospheric trace gases from a variety of space-based instruments, including AIRS, OMI, CrIS, and TROPOMI to date. Operational processing for TROPESS is set up to accommodate two distinct goals. The first is to provide a global record of ozone and related trace gases for the first 20 years of the 21st century. The second is to support rapid iteration on and improvement of the underlying level 2 algorithms while processing more recent data. Due to the computational cost of these retrievals, meeting both goals requires two separate data streams.
The first is a “retrospective” or “reanalysis” stream that retrieves trace gas amounts from 2002 through 2021. This stream is processed with a version of the MUSES algorithm frozen at the time the retrospective processing began. The second is a “forward” stream that processes new radiances as they become available with the latest version of the MUSES algorithm, including updates to the algorithm made after the retrospective processing began. The forward stream serves the dual purpose of monitoring significant events affecting air quality and serving as a test bed for improvements to the MUSES algorithm. Due to the difference in the algorithm versions, users must take care not to misinterpret changes in trends between the two streams.
Both streams use a “global survey” sampling approach to process a subset of all available soundings yet provide global coverage, which allows a balance between computational cost and spatial coverage. The default survey strategy processes one sounding in each ° ° box over land and one out of every four such boxes over ocean. For the current products, is either 0.7 or 0.8°. In addition, TROPESS produces special collections with full data density for high interest events (e.g., the 2019–2020 Australian Bush Fires and 2020 US West Coast Fires) and a set of megacities around the world.
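A gridded subsampling of this kind can be sketched as follows. The text above does not say which sounding is chosen within a box or which ocean boxes are retained, so this sketch assumes the first sounding encountered in a box is kept and that ocean boxes are retained only when both box indices are even (one box in four). The function name `global_survey`, the dictionary keys, and the example coordinates are hypothetical.

```python
import math


def global_survey(soundings, box_deg=0.8):
    """Keep at most one sounding per lat/lon box; thin ocean boxes to one in four."""
    kept = []
    used_boxes = set()
    for s in soundings:
        row = math.floor((s["lat"] + 90.0) / box_deg)
        col = math.floor((s["lon"] + 180.0) / box_deg)
        # Assumed ocean rule: only boxes with even row and even column survive.
        if not s["is_land"] and (row % 2 or col % 2):
            continue
        if (row, col) in used_boxes:
            continue
        used_boxes.add((row, col))
        kept.append(s)
    return kept


demo_soundings = [
    {"id": "A", "lat": 11.2, "lon": 20.4, "is_land": False},  # ocean, retained box
    {"id": "B", "lat": 11.3, "lon": 20.5, "is_land": False},  # same box as A
    {"id": "C", "lat": 10.4, "lon": 20.4, "is_land": False},  # ocean, thinned box
    {"id": "D", "lat": 10.4, "lon": 21.2, "is_land": True},   # land, new box
    {"id": "E", "lat": 10.5, "lon": 21.3, "is_land": True},   # same box as D
]
kept_ids = [s["id"] for s in global_survey(demo_soundings)]
print(kept_ids)
```

An operational implementation would more likely choose the sounding by a quality or proximity criterion; the first-encountered rule here only keeps the sketch short.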
The CrIS PAN product described in and Sect. , with mostly minor updates, is now routinely produced as part of both the reanalysis and forward TROPESS streams, as well as special products. The reanalysis and forward streams provide twice daily (day and night) global coverage of PAN, using the global survey strategy described in the previous paragraph. Other species retrieved within the TROPESS project include methane, carbon monoxide, deuterated water (HDO), ammonia, and ozone.
3 AIRS PAN retrieval development
3.1 AIRS PAN retrieval design: microwindows and retrieval order
Retrievals using AIRS radiances have previously been implemented within the TROPESS MUSES algorithm (§ ), thus the AIRS PAN retrieval can use the existing readers and MUSES OE framework. The components that must be added are (1) the desired windows, (2) the strategy table that instructs the MUSES algorithm to retrieve PAN, and (3) details about PAN retrievals copied from the CrIS PAN retrievals, such as the prior vector values.
Figure 1. An illustration of the factors driving the selection of windows for the AIRS PAN retrieval. Each panel shows the simulated difference in brightness temperature for a 10 % increase in the mixing ratio of one species at all altitudes as the black line. The AIRS channels are marked at the top of each panel as gray dots. The chosen windows for the AIRS retrieval are the full height blue boxes. For reference, the CrIS windows used by are the short, orange boxes.
[Figure omitted. See PDF]
For the existing CrIS PAN retrieval, chose two windows on the low frequency side of the PAN spectral feature (Fig. ). However, parts of these windows fall in the AIRS “spectral gap,” where no radiance channels are available. Thus, we had to compromise between windows that will see sufficient signal for PAN absorption and windows that avoid signal from interfering species. Figure shows the selected windows overlaid on simulated absorption features for the relevant species in this spectral range. The two windows on the left of the PAN feature (below 785 ) only see a weak part of the signal from PAN, but are outside of the absorption. The center window at 795 is able to capture the core PAN absorption, but has interference from both water and . The two rightmost microwindows (above 800 ) are able to avoid interference from water, but have minor to moderate interference from . is not retrieved (Table ) but is simulated in the radiative transfer as an interferent, using climatological profiles scaled by yearly scale factors derived from ground based observations. The base climatological profiles vary with latitude and longitude in 30 and 60° bins, respectively, and were developed from MOZART model output .
Early tests with the three windows above 790 showed that omitting the 795 window gave erroneously high PAN column average values across much of the western United States during a period when the Pole Creek Fire was emitting PAN
The microwindows selected for the AIRS PAN retrieval.
| Window number | Freq. range (cm⁻¹) |
| 1 | 772.5 to 775 |
| 2 | 780 to 781.875 |
| 3 | 793.75 to 796.875 |
| 4 | 800 to 802.5 |
| 5 | 804.375 to 805 |
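As an illustration of how such microwindows translate into a channel selection, the sketch below picks the indices of channels whose centre frequency falls inside any window. It assumes the tabulated ranges are wavenumbers in cm⁻¹ and that the bounds are inclusive; the names `MICROWINDOWS_CM1` and `select_channels` and the example channel grid are hypothetical and do not reflect the actual AIRS channel list.

```python
# Window bounds copied from the table of AIRS PAN microwindows (assumed cm^-1).
MICROWINDOWS_CM1 = [
    (772.5, 775.0),
    (780.0, 781.875),
    (793.75, 796.875),
    (800.0, 802.5),
    (804.375, 805.0),
]


def in_microwindows(freq, windows=MICROWINDOWS_CM1):
    """True if a channel centre frequency lies inside any window (inclusive bounds)."""
    return any(low <= freq <= high for low, high in windows)


def select_channels(channel_freqs, windows=MICROWINDOWS_CM1):
    """Indices of the channels used in the retrieval step."""
    return [i for i, freq in enumerate(channel_freqs) if in_microwindows(freq, windows)]


# Hypothetical channel centres, not the real AIRS grid.
demo_freqs = [770.0, 773.0, 780.5, 790.0, 795.0, 801.0, 804.5, 806.0]
selected = select_channels(demo_freqs)
print(selected)
```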
The retrieval steps in the strategy table for this AIRS PAN retrieval. Retrieved elements annotated with a ∗ are only included over land.
| Step num. | Step name | Retrieved elements | Comment |
| 1 | Brightness temperature check | – | Initial check to determine whether to run step 2, 3, or neither |
| 2 | Cloud properties | Cloud extent, cloud pressure | Optional, depends on step 1 |
| 3 | Surface temperature | Surface temperature | Optional, depends on step 1 |
| 4 | Strong features | Atm. temp., surf. temp., , HDO, , , cld. extent, cld. pres., surf. emissivity∗ | |
| 5 | Model residual check | – | This step does not update any values, it provides pre-PAN residuals useful for future development and filtering |
| 6 | PAN | PAN | |
| 7 | and update | Surf. temp., , , cld. extent, cld. pres., surf. emissivity∗ | |
| 8 | Surface refinement | Surf. temp., cld. extent, cld. pres., surf. emissivity∗ | This step gives a chance to refine surface temperature/emissivity∗/cloud properties before retrieving |
| 9 | |||
| 10 | , surf. temp., cld. extent, surf. emissivity∗ |
Development of the strategy table was straightforward, requiring only the addition of a PAN retrieval step to the standard AIRS strategy table in use by TROPESS to generate AIRS products. Table enumerates the steps included in this table; the PAN step added is step number 6. The choice to place the PAN retrieval immediately following the “strong features” step (no. 4) follows . Step number 5 was also added to enable saving of spectral residuals in a wider range of frequencies centered on the PAN feature. Such a diagnostic is helpful to understand what factors might be affecting a given retrieval and proved valuable for filtering (Sect. ).
Figure 2. The different sets of PAN mole fraction profiles used as a priori constraints in the MUSES retrieval. Each panel represents a profile type, selected within MUSES based on the sounding location. Within each panel, the variation with month is shown by the differently colored profiles.
[Figure omitted. See PDF]
The a priori constraints used in the AIRS PANs retrieval are mostly the same as the CrIS PANs retrieval, with the exception of surface emissivity. As in , the PAN profile used as the a priori constraint for each sounding is selected from a set of 6 climatological profiles for each month (Fig. ) and the initial PAN profile used as the starting point for the nonlinear optimization is a flat 0.3 in the troposphere. Likewise, the a priori covariance for the PAN VMRs is the same as in . These constraints derive from those used in the retrieval of PAN from the Tropospheric Emissions Spectrometer . For surface emissivity, we used the Combined ASTER MODIS Emissivity over Land (CAMEL) database for our initial and a priori constraint on surface emissivity. used the University of Wisconsin Cooperative Institute for Meteorological and Satellite Studies High Spectral Resolution database . Note that all TROPESS products starting from v1.16 now use the CAMEL database; this included the CrIS PAN retrievals we use for comparison in Sect. .
3.2 Addressing cloud interference over ocean
During development, we found that low, warm clouds over ocean would be misinterpreted by our AIRS retrieval as PAN. For the cases tested, we were able to filter out such soundings by decomposing the AIRS radiances into empirical orthogonal functions (EOFs) and filtering soundings for which the second principal component (PC) was below a threshold. This section describes that approach.
Figure 3. Column average PAN between 825 and 215 as retrieved for 11 September 2020 from both CrIS (on the Suomi-NPP satellite, a) and AIRS (b and c). Compared to (b), (c) uses a PC-based filter instead of the previous retrieval step's water quality check to filter for cloud impacts. The black box in both panels shows the location of the spurious plume in the AIRS retrievals that is the focus of discussion in Sect. .
[Figure omitted. See PDF]
This issue of certain clouds appearing as PAN in the retrieval can be seen in Fig. , which shows free tropospheric column averages of PAN (which we will refer to as ) from 11 September 2020. This was a period with major wildfires throughout the west coast of the United States. In Fig. a, retrieved from CrIS shows a reasonable plume structure, with clear advection of PAN from the fires on the west coast. We also see some of this in the AIRS retrievals – specifically, the enhanced PAN in the southern half of California, most of Arizona, and the northwest corner of Mexico, as well as over the northern Pacific Ocean (Fig. b).
Figure 4. (a) RGB image from the GOES-17 (GOES-West) satellite as captured between 22:11 and 22:13 UTC on 11 September 2020. (b) Cloud fraction, (c) cloud top pressure, and (d) cloud top temperature from the MODIS-Aqua MYD06 product. In all panels, the red or black box encloses the same area as the black boxes in Fig. .
[Figure omitted. See PDF]
However, in the black box (20 to 30° N, 142 to 122° W), CrIS shows mostly background column whereas the AIRS retrievals show an enhancement with an unusual structure (not a shape representative of transport from the fires). When we check RGB imagery from the GOES-West Advanced Baseline Imager (https://noaa-goes17.s3.amazonaws.com/index.html#ABI-L2-MCMIPC/2020/255/22/, last access: 8 December 2022), we clearly see that this “plume” seen by the AIRS PAN retrieval matches the shape of the clouds in that area (Fig. a). Further, cloud properties from the MODIS-Aqua MYD06 product plotted in Fig. b–d show that this is a low, warm cloud. This clear spatial correlation between the cloud extent and the spurious PAN plume leads us to conclude that such low, warm clouds cause difficulties for our retrieval with the chosen spectral windows (Table ). Similarly, in the plume around 50° N, AIRS sees enhanced further west than CrIS (around 150° W) and more to the northwest of the state of Washington (near 50° N, 125° W). From the cloud properties shown in Fig. , these are also potential cases of erroneous impact from clouds.
The AIRS data shown in Fig. are those soundings which pass prototype quality flags chosen based on quality flags for other thermal retrievals, including sufficiently small radiance residual, surface temperature 265 , cloud top pressure (as retrieved in our algorithm) below the tropopause, and the quality of the retrieval in step 4 of Table . (Note that these quality flags were for prototyping purposes only, and are not those used in the final product.)
Figure 5. The results of the EOF decomposition on AIRS radiances over a domain covering 20 to 60° N and 150 to 110° W. The top panel shows the spectral signatures of PAN and from Fig. for reference. The remaining three panels show the first, second, and third EOFs, respectively, resulting from the decomposition. The legend in each panel reports the percent of variance explained by that EOF.
[Figure omitted. See PDF]
Since these criteria were insufficient to remove the spurious plume, we investigated an approach inspired by . As they used an empirical orthogonal function (EOF) decomposition to study dominant patterns of variability in the AIRS data, we tested whether an EOF decomposition could identify the low, warm clouds causing the spurious PAN signal in our AIRS PAN retrieval. We do note that a cloud-clearing approach, like that used in CLIMCAPS , could be one approach to address this issue. Such an approach combines radiances from multiple soundings to yield radiances unimpacted by clouds
Figure shows the first three EOFs resulting from a decomposition of the AIRS observed radiances (as stored by the MUSES algorithm in its output radiance files) within the domain covering 20 to 60° N and 150 to 110° W. Keeping in mind that the sign of an EOF is arbitrary, as it can be flipped by changing the sign of the principal component (PC) by which it is multiplied, the first two EOFs contain many features which match up closely in shape to the spectral features shown in the top panel. The third EOF appears to relate to , as the dominant feature appears at approximately the same frequency as the feature shown in Fig. . These EOFs were computed from the window used in step 5 of the strategy table (Table ), which spans 760 to 860 .
Figure 6. The PC values for EOF 1 (left) and 2 (right) for each AIRS sounding in a domain covering 20 to 60° N and 150 to 110° W. The red box in both panels outlines the same area as the black and red boxes in Figs. and .
[Figure omitted. See PDF]
For each AIRS sounding, the observed radiances can be represented as the linear combination of the EOFs with the PCs as the coefficients. Figure shows the values of the PCs for the first two EOFs needed to reconstruct the AIRS radiances for all the soundings in this domain. The second PC (Fig. , right panel) has a spatial pattern of negative values strikingly similar to the clouds seen in Fig. . In Fig. c, we show the AIRS with a filter based on the values of PC 2 applied. Filtering out soundings with PC 2 < 0 removes the spurious .
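A decomposition and filter of this kind can be sketched with a singular value decomposition. The sketch below is illustrative only: it assumes the mean spectrum is removed before the decomposition (the text does not say whether this was done), it uses synthetic random data in place of AIRS radiances, and the names `eof_decompose` and `pc2_keep_mask` are hypothetical. Because the sign of each EOF is arbitrary, a sign-based threshold on a PC is only meaningful once a sign convention has been fixed.

```python
import numpy as np


def eof_decompose(radiances, n_modes=3):
    """Return the mean spectrum, leading EOFs, PCs, and explained variance fractions.

    radiances has shape (n_soundings, n_channels).
    """
    mean = radiances.mean(axis=0)
    anomalies = radiances - mean
    _, sing_vals, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                # each row is one spectral pattern
    pcs = anomalies @ eofs.T           # per-sounding coefficient of each EOF
    explained = sing_vals[:n_modes] ** 2 / np.sum(sing_vals ** 2)
    return mean, eofs, pcs, explained


def pc2_keep_mask(pcs, is_ocean, threshold=0.0):
    """Keep land soundings always; keep ocean soundings only if PC 2 >= threshold."""
    return ~is_ocean | (pcs[:, 1] >= threshold)


rng = np.random.default_rng(0)
demo_radiances = rng.normal(size=(20, 8))   # synthetic stand-in for AIRS spectra
mean, eofs, pcs, explained = eof_decompose(demo_radiances, n_modes=8)

# With all modes retained, the spectra are recovered as mean + sum of PC * EOF.
reconstructed = mean + pcs @ eofs
```

Retaining every mode reproduces the input spectra exactly, which is a convenient check that the PCs are indeed the coefficients of the linear combination of EOFs.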
Figure 7. (a) retrieved from the CrIS instrument on 1 January 2020 over New Zealand. (b) retrieved from AIRS over the same region as (a), with parametric filtering, including a check of the quality of the water retrieval from a previous retrieval step applied. (c) As (b), but with the PC-based filter applied instead of the previous step's water quality check. (d) MODIS-Aqua cloud fraction, (e) cloud top pressure, and (f) cloud top temperature over the same region as (b). In all panels, the black or red box highlights an area with low, warm clouds.
[Figure omitted. See PDF]
As a next step, we applied this PC-based filtering to a different region. We chose a PAN plume from the Australian Bush Fires in late 2019/early 2020. For this Australian fire case, we also see a collection of soundings with large values in the AIRS data but not the CrIS data, marked by the black box in Fig. a and b. The MODIS cloud properties (Fig. d–f) confirm that this is again a low, warm cloud. When we apply the PC-based filter, it correctly marks these soundings as bad quality and removes them (Fig. c).
Figure 8. (a) retrieved from the CrIS instrument on 11 September 2020 over the Amazon. (b) retrieved from AIRS over the same region as (a) without the PC-based filter applied. (c) MODIS-Aqua cloud fraction, (d) cloud top pressure, and (e) cloud top temperature over the same region as (a). In all panels, the black or red box highlights an area where MODIS-Aqua observed mostly low, warm clouds.
[Figure omitted. See PDF]
Fortunately, it appears that soundings over land are not as susceptible to this issue with low, warm clouds. Figure shows retrieved from both CrIS and AIRS again along with MODIS-Aqua cloud properties, this time over the Amazon. While the AIRS (Fig. b) shows sporadic high values compared to CrIS (Fig. a), these erroneous high values appear to be random, rather than systematically located where the low, warm clouds are. In particular, the western swath shows mostly low values despite the presence of low, warm clouds. Therefore, we apply the PC-based filter only to ocean soundings.
Our hypothesis is that the AIRS retrieval is affected by the low, warm clouds while the CrIS retrieval is not because of the difference in spectral windows used between the retrievals (Fig. ), the difference in radiance noise between the instruments, or a combination of the two. Further, our hypothesis for why land soundings are much less impacted than ocean soundings is that it is more difficult to distinguish a low, warm cloud from an underlying ocean surface than from a land surface.
While the PC-based filter was successful in these cases, we note that it may need additional adjustment in the future. During testing, we found that filtering out soundings with PC 2 10 was sufficient for the US West Coast Fires case (Fig. ), but not the Australian Bush Fires case (Fig. ), whereas requiring PC 2 0 worked for both. Future work will examine whether the criterion of PC 2 0 is sufficient globally, or if further refinement is necessary.
We also note that it was necessary to use the 760 to 860 window, rather than the narrower windows used in the PAN retrieval step (Table ). When we tested the latter, this PC-based filter was not effective in the Australian fires case. Therefore, we conclude that information available in the wider window provides the necessary data for the EOFs to correctly fit clouds.
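To illustrate the kind of PC-based screening discussed in this section, the following minimal sketch derives EOFs from a set of spectra with a singular value decomposition, projects each sounding onto them, and flags soundings by their second principal component. The synthetic spectra, the function names, the number of EOFs, and the choice to keep soundings whose PC 2 is at or below the threshold are all assumptions made for illustration; they are not the operational configuration.

```python
import numpy as np

def fit_eofs(radiances, n_eofs=3):
    """Return the mean spectrum and the leading EOFs (as rows) of an
    (n_soundings, n_channels) radiance array."""
    mean = radiances.mean(axis=0)
    # SVD of the mean-removed spectra; the rows of vt are the EOFs.
    _, _, vt = np.linalg.svd(radiances - mean, full_matrices=False)
    return mean, vt[:n_eofs]

def pc_scores(radiances, mean, eofs):
    """Project mean-removed spectra onto the EOFs to get PC scores."""
    return (radiances - mean) @ eofs.T

def pc2_filter(scores, threshold=0.0):
    """Boolean mask of soundings to keep.
    ASSUMPTION: soundings are kept when PC 2 <= threshold."""
    return scores[:, 1] <= threshold

# Synthetic stand-in for spectra in a wide window (hypothetical data).
rng = np.random.default_rng(0)
spectra = rng.normal(size=(200, 40))

mean, eofs = fit_eofs(spectra)
scores = pc_scores(spectra, mean, eofs)
keep = pc2_filter(scores, threshold=0.0)
print(f"{keep.sum()} of {keep.size} synthetic soundings pass the PC 2 check")
```

Keeping the threshold as a function argument mirrors the observation above that the appropriate PC 2 criterion differed between test cases and may need further adjustment.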
3.3 Filtering and validation through comparison with CrIS
While the PC-based filter addresses the issue of interference from low, warm clouds (Sect. ), it is not sufficient by itself as a quality filter. Ideally, quality filters would be derived by comparing the satellite product to in situ data and checking that the filters ensure good agreement between the satellite and in situ data. To this end, used aircraft profiles from the ATom campaign to validate the CrIS PAN product. This was ideal for CrIS, as the ATom flights provided profiles of PAN over the majority of the troposphere. However, the majority of the ATom profiles are over ocean
Thus, instead of relying on aircraft data directly, we decided to use the existing CrIS PAN product as a transfer standard by designing a quality filter that predicts whether the AIRS value will be within a given threshold of the nearest CrIS value. This provides the large number of soundings needed for bulk statistics and implicitly makes the AIRS PAN product consistent with the CrIS PAN product, which can allow users to combine the two.
We chose to implement this quality filter using decision trees, with the scikit-learn package . Using simple decision trees allowed us to investigate what variables were used to classify a sounding as good or bad quality during development. Using decision trees rather than hand-tuned quality filter parameters allowed faster iteration and should, in principle, be more reproducible. Because we saw in Sect. that ocean soundings required different filtering for clouds than land soundings, we also tested whether using a single decision tree for all soundings or separate decision trees for land and ocean soundings gave better results. We found that separate decision trees for land and ocean soundings retained more soundings with significant , and that there was little difference in the correlation between AIRS and CrIS using separate land/ocean trees or a single tree. Therefore, we chose to use separate decision trees. Appendix shows a subset of results using a single decision tree, and describes the trade-offs between using a single decision tree and separate decision trees.
Table 4. Regions and dates used for the quality filter decision tree training and testing. A – in the “Training date” column indicates that no data from that region were used in training.
| Region name | Training date | Testing date | Latitude bounds | Longitude bounds |
| Australia/NZ | 1 Jan 2020 | 5 Jan 2020 | 60 to 20° S | 150 to 177.5° E |
| US West Coast | 13 Sep 2020 | 11 Sep 2020 | 20 to 60° N | 150 to 110° W |
| Amazon | – | 11 Sep 2020 | 25° S to 10° N | 80 to 40° W |
| Africa | – | 11 Sep 2020 | 30° S to 5° N | 5 to 45° E |
The decision trees were trained on AIRS and CrIS retrievals for one day from each of the 2019/2020 Australian Bush Fires and 2020 US West Coast Fires. Ocean soundings that failed the PC-based filter (Sect. ) were excluded from training. Two different days from these fires, plus retrievals over the Amazon and Africa, were used for testing (Table ). The data were divided into training and testing by days and regions, rather than by a random or similar stochastic split, to ensure that the training data included at least some soundings with significant . Since plumes with significant are outnumbered by background soundings, we were concerned that a fully random split would miss the plume soundings.
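The training setup described above (a split by region/day rather than at random, with separate trees for land and ocean) can be sketched as follows. This is a minimal illustration on synthetic data, assuming scikit-learn's `DecisionTreeClassifier`; the feature array, the `scene` identifier, the surface mask, and the helper names are hypothetical stand-ins rather than the actual training inputs.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 600
features = rng.normal(size=(n, 3))      # synthetic stand-ins for quality metrics
is_ocean = rng.random(n) < 0.5          # hypothetical land/ocean mask
scene = rng.integers(0, 4, size=n)      # hypothetical region/day identifier
labels = features[:, 0] + 0.3 * rng.normal(size=n) < 0.5  # synthetic good/bad flag

# Split by scene instead of randomly: scenes 0-1 train, scenes 2-3 test.
train = np.isin(scene, [0, 1])
test = ~train

def train_surface_trees(x, y, ocean_mask, random_state=0):
    """Fit one unpruned tree for land soundings and one for ocean soundings."""
    trees = {}
    for name, mask in (("land", ~ocean_mask), ("ocean", ocean_mask)):
        trees[name] = DecisionTreeClassifier(random_state=random_state).fit(x[mask], y[mask])
    return trees

def predict_flags(trees, x, ocean_mask):
    """Apply the land tree to land soundings and the ocean tree to ocean soundings."""
    flags = np.zeros(len(x), dtype=bool)
    for name, mask in (("land", ~ocean_mask), ("ocean", ocean_mask)):
        if mask.any():
            flags[mask] = trees[name].predict(x[mask]).astype(bool)
    return flags

trees = train_surface_trees(features[train], labels[train], is_ocean[train])
test_flags = predict_flags(trees, features[test], is_ocean[test])
print("fraction of held-out scenes flagged good:", test_flags.mean())
```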
Table 5. Input variables for the quality filter decision trees. Note that “ quality” is not useful as is retrieved after PAN (see Table ) but is included because it is a standard quality variable in the MUSES algorithm.
| Short name | Description |
| Rad. resid. mean | Post-PAN retrieval mean of noise-normalized radiance residuals |
| Rad. resid. std. dev. | Post-PAN retrieval standard deviation of noise-normalized radiance residuals |
| Res. Norm. Init. | Quadrature sum of pre-PAN retrieval residual mean and standard deviation |
| Res. Norm. Final | Quadrature sum of post-PAN retrieval residual mean and standard deviation |
| Rad. Max. SNR | Maximum ratio of radiance to noise |
| | Jacobian dotted with radiance residuals |
| | Radiances dotted with radiance residuals |
| Cld. pres. | Cloud pressure |
| Cld. OD mean | Mean cloud optical depth between 975 and 1200 |
| Cld. OD var. | Standard deviation of cloud optical depth between 975 and 1200 |
| Mean surf. emis. | Mean difference between retrieved and a priori surface emissivity |
| Desert emis. | Value of retrieved surface emissivity nearest 1025 |
| self corr. | Consistency between retrieved in two different steps |
| Atm. T quality | Quality flag for retrieved atmospheric temperature |
| quality | Quality flag for retrieved profile |
| quality | Quality flag for retrieved profile from step 4 (Table ) |
As inputs, the decision trees received 16 values commonly used by existing MUSES retrievals as quality metrics, listed in Table . They were trained to predict a binary flag indicating whether the AIRS was within 0.2 or 50 % of the CrIS from the CrIS sounding closest to it (by great circle distance). The CrIS soundings are restricted to those that pass basic quality flagging for modeled vs. observed radiance and a check for certain surface features that can cause erroneous retrievals. The CrIS value compared against includes an averaging kernel adjustment to accommodate the different vertical sensitivity of CrIS and AIRS (see Fig. for a summary of typical CrIS and AIRS column averaging kernels). Specifically, following Eq. (25) of ,
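$$\hat{X}_{\mathrm{CrIS}} = X_{a,\mathrm{AIRS}} + \mathbf{a}_{\mathrm{AIRS}}^{T}\left(\mathbf{x}_{\mathrm{CrIS}} - \mathbf{x}_{a,\mathrm{AIRS}}\right) \tag{2}$$
where
$X_{a,\mathrm{AIRS}}$ is the AIRS a priori column value,
$\mathbf{a}_{\mathrm{AIRS}}$ is the AIRS pressure-weighted column averaging kernel (i.e., one that includes the integration operator),
$\mathbf{x}_{\mathrm{CrIS}}$ is the CrIS posterior PAN profile, and
$\mathbf{x}_{a,\mathrm{AIRS}}$ is the AIRS prior PAN profile.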
Note that the CrIS is not an input to the decision trees; it is used only in training. This permits the decision trees to be applied to AIRS soundings without a coincident CrIS sounding.
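The matching, averaging kernel adjustment, and training label described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and example numbers are hypothetical, and reading “within 0.2 or 50 %” as passing when either tolerance is met is our interpretation for this sketch.

```python
import numpy as np

def nearest_index(lat, lon, lats, lons):
    """Index of the candidate sounding closest by great-circle (haversine) distance."""
    p1, p2 = np.radians(lat), np.radians(lats)
    dlon = np.radians(lons) - np.radians(lon)
    h = np.sin((p2 - p1) / 2) ** 2 + np.cos(p1) * np.cos(p2) * np.sin(dlon / 2) ** 2
    return int(np.argmin(2 * np.arcsin(np.sqrt(h))))

def ak_adjusted_column(col_prior_airs, col_ak_airs, profile_cris, profile_prior_airs):
    """Column AIRS would be expected to report for the CrIS profile:
    prior column + column averaging kernel . (CrIS profile - AIRS prior profile)."""
    return col_prior_airs + col_ak_airs @ (profile_cris - profile_prior_airs)

def good_quality_label(col_airs, col_cris_adj, abs_tol=0.2, rel_tol=0.5):
    """True where AIRS is within abs_tol OR within rel_tol (fractional) of CrIS.
    ASSUMPTION: either tolerance being met is sufficient."""
    diff = np.abs(col_airs - col_cris_adj)
    return (diff <= abs_tol) | (diff <= rel_tol * np.abs(col_cris_adj))

# Hypothetical three-level example.
idx = nearest_index(35.0, -120.0,
                    np.array([36.0, 35.1, 10.0]),
                    np.array([-120.0, -120.1, -120.0]))
adjusted = ak_adjusted_column(
    col_prior_airs=0.3,
    col_ak_airs=np.array([0.2, 0.3, 0.1]),
    profile_cris=np.array([0.5, 0.6, 0.4]),
    profile_prior_airs=np.array([0.3, 0.3, 0.3]),
)
labels = good_quality_label(np.array([0.6, 1.0]), adjusted)
print(idx, adjusted, labels)
```

Because the label depends on the CrIS value, it can only be computed for training data; the trained trees themselves see only the AIRS-side inputs.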
Typically, it is important to “prune” decision trees by limiting the number of decision nodes they can include in order to prevent overfitting to the training data. We tested pruning by limiting both the maximum depth (i.e., the number of nodes along any one path) and the maximum number of leaf nodes (i.e., the number of end points for the model). However, we found that either method of pruning the decision trees caused the filter to screen out soundings with enhanced . Our hypothesis is that, because soundings with enhanced are in the minority of the training data, a model limited in size lacked the flexibility to develop useful rules for these somewhat uncommon cases, and instead achieved better accuracy by simply classifying all such soundings as bad quality. Therefore, we proceed without limiting the model size.
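For readers unfamiliar with the two pruning controls mentioned above, the sketch below shows how they might be set in scikit-learn via the `max_depth` and `max_leaf_nodes` arguments of `DecisionTreeClassifier`, next to an unpruned tree. The data are synthetic and the specific limits (3 and 8) are arbitrary values chosen for illustration, not the settings tested in this work.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 4))                      # synthetic input variables
y = (X[:, 0] + 0.5 * rng.normal(size=1000)) > 1.0   # noisy label; True is the minority class

# Unpruned tree: grows until every training sample is classified correctly.
full = DecisionTreeClassifier(random_state=0).fit(X, y)
# Pruned by limiting the longest root-to-leaf path.
depth_limited = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
# Pruned by limiting the number of end points (leaves).
leaf_limited = DecisionTreeClassifier(max_leaf_nodes=8, random_state=0).fit(X, y)

for name, tree in (("full", full), ("max_depth=3", depth_limited),
                   ("max_leaf_nodes=8", leaf_limited)):
    print(f"{name}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}, "
          f"training accuracy={tree.score(X, y):.3f}")
```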
Figure 9. (a) retrieved from CrIS on 11 September 2020 over Africa. (b) Surface emissivity at 1025 . (c) retrieved from AIRS with only the decision tree-based filter applied. (d) Like (c), but with the emissivity- and PC-based filters added to the decision tree-based filter. The red or black box in each panel indicates an area with the silicate feature known to bias our PAN retrievals.
[Figure omitted. See PDF]
Additionally, we include an explicit check that the retrieved surface emissivity at 1025 is 0.94. This filter is similar to one used in to remove soundings impacted by a silicate feature that produces a surface emissivity with a similar spectral shape to PAN. The same silicate feature also shows up as a low emissivity near 1025 (see Appendix ). Although the decision trees are trained on this value as an input, they still retain some soundings clearly affected by the silicate feature. Figure shows CrIS in Fig. a, AIRS in Fig. c and d, and the emissivity value in Fig. b. The red or black box identifies a region with low 1025 emissivity values that has very high values in the AIRS retrieval in Fig. c. When we add an explicit filter on the 1025 emissivity, those few remaining soundings are removed.
Figure 10. (a) AIRS data from the Australian Bush Fires on 5 January 2020 with no filtering applied. (b) As (a), with the PC-based filter applied. (c) As (a), with the PC- and emissivity-based filters applied. (d) As (a), with the PC-based, emissivity-based, and decision tree-based filters applied. (e–h) show the same filtering progression, but for 11 September 2020 over the US West Coast Fires.
[Figure omitted. See PDF]
The final quality filter will be a combination of the PC-based filter from Sect. , the emissivity-based filter, and the decision tree-based filter. Figure shows how each of these filters affects the soundings passed as good quality for two days with clear PAN plumes. As discussed in Sect. , the PC-based filter is applied only to ocean soundings, where clouds cause a high bias in . For these two scenes, the emissivity filter has a modest impact, removing some soundings in southern California, northeastern Arizona, and southeastern Utah (Fig. g). In both scenes, the decision tree-based filter does remove a number of the soundings with large values (Fig. d,h). Therefore, in the public files, we will provide the information for users to adjust the quality flagging to suit their application; specifically the PC value used for flagging, the emissivity value used for flagging, and the binary flag produced by the decision trees.
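The way the three filters combine can be sketched as a set of boolean masks. This is a minimal illustration only: the function and argument names are hypothetical, and the comparison directions (keep PC 2 at or below a threshold, keep emissivity at or above 0.94) and the default PC threshold are assumptions for this sketch. The `use_tree` switch shows how a user could relax the flagging by dropping the decision tree-based component.

```python
import numpy as np

def combined_quality(pc2, emis_1025, tree_flag, is_ocean,
                     pc2_max=0.0, emis_min=0.94, use_tree=True):
    """Return a boolean good-quality mask from the three filter components.
    ASSUMPTIONS: pass if PC 2 <= pc2_max and if emissivity >= emis_min."""
    # The PC-based check is applied to ocean soundings only; land always passes it.
    pc_ok = ~is_ocean | (pc2 <= pc2_max)
    emis_ok = emis_1025 >= emis_min
    tree_ok = tree_flag if use_tree else np.ones_like(tree_flag, dtype=bool)
    return pc_ok & emis_ok & tree_ok

# Five hypothetical soundings.
pc2 = np.array([5.0, 5.0, -1.0, -1.0, -1.0])
emis = np.array([0.98, 0.98, 0.90, 0.98, 0.98])
tree = np.array([True, True, True, False, True])
ocean = np.array([True, False, False, True, True])

strict = combined_quality(pc2, emis, tree, ocean)
relaxed = combined_quality(pc2, emis, tree, ocean, use_tree=False)
print(strict, relaxed)
```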
For the rest of this section, we will focus on the performance of the combined filter. Appendix contains a brief exploration of the relationship between the input variables and predicted quality flag.
Figure 11. Maps of retrieved from CrIS (first column) and AIRS (second column) along with total column, also retrieved from AIRS (third column). Each row contains one of the testing region/day pairs from Table . The first and third columns are filtered by the standard TROPESS quality flag; the middle column uses the combined decision tree, PC, and emissivity filter described in Sect. .
[Figure omitted. See PDF]
First, we examine the spatial distribution of PAN plumes in our filtered AIRS product versus CrIS. Figure shows our filtered AIRS PAN data alongside the PAN retrieved from CrIS. The data shown here are from the four testing data region/day pairs in Table ; thus, these are data that the decision trees were not trained on. The first two rows show the Australian 2019/2020 Bush Fires and the 2020 US West Coast Fires, respectively. In both cases, we can see that the AIRS PAN product matches the location of enhanced PAN plumes seen in the CrIS data very well. In the US West Coast Fires case, the large values in Arizona, central/southern California, and northwestern Mexico are all in the same region where CrIS sees high values. Likewise, in the Australian fires case, AIRS captures the PAN plume approaching New Zealand's northern island, though compared to CrIS, more of the plume is removed by our filtering criteria.
The last two rows of Fig. show a day over the Amazon and central/southern Africa, respectively. These are regions not included in the training data for the decision trees (Table ), so these are a good test of whether the filter can generalize to new regions. Neither region has significant PAN plumes in the CrIS data. However, there are small enhancements to 0.5 in both cases. In the Amazon, there are also a few soundings with 1 near 17.5° S, 55° W. AIRS does see this 1 hotspot, though it also retrieves several soundings with 1 further north, where CrIS does not. The Amazon hotspot in western Brazil cannot be seen in AIRS due to the swath gap. The PAN hotspot seen by CrIS in the African test over Angola, Zambia, and the Democratic Republic of the Congo is not as apparent in the AIRS PAN; however, AIRS does appear to capture some enhancement in that area, particularly compared to further north, near the equator.
Helpfully, in most of these cases, when there is a strong PAN enhancement in CrIS, AIRS also sees an enhancement in . For example, in the Amazon test case, only the soundings with enhanced PAN at 17.5° S, 55° W also have a strong enhancement, while the false enhancements further north in the AIRS PAN do not. This implies that users looking for PAN plumes in the AIRS data can check for enhanced to distinguish whether a small PAN plume is likely real. This is not an entirely sufficient condition, as it is possible to have a PAN plume without enhanced , but the presence of enhanced can give more confidence in an observed PAN plume. (See Sect. for a summary of recommendations for use.)
Figure 12. Correlation between AIRS and CrIS where the latter includes both the AIRS averaging kernel correction from Eq. () and the bias correction from . The different test date/region pairs from Table are represented by the different color series. Each marker represents the daily average of the AIRS and matched CrIS soundings in a box with the size of the marker increasing with the number of soundings in that box. Each box must have a minimum of 10 soundings to be included. The box size is the only difference between panels: (a) 1° × 1°, (b) 2° × 2°, (c) 5° × 5°, and (d) 10° × 10°.
[Figure omitted. See PDF]
We also tested the correlation between AIRS and CrIS with different amounts of spatial averaging. Figure shows the results for four different spatial averaging box sizes. While the data from our test cases does have some fire-influenced observations, many of the observations vary primarily from large-scale seasonal or latitudinal variations. At 1° × 1°, the correlation is somewhat weak. The correlation is more significant at 2° × 2°, 5° × 5°, and 10° × 10°. However, averaging to 5° × 5° or 10° × 10° is needed for the root mean squared error (RMSE) between AIRS and CrIS values to drop below 0.1 and for the visual correlation (especially for high values) to be apparent. This is a fair amount of averaging, but is not surprising, given the sounding-to-sounding variation seen in Fig. . Given the number of observations, this will still provide useful PAN coverage. We discuss recommendations for use based on this result in Sect. .
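The box-averaging comparison described above can be sketched as follows: soundings are binned into latitude/longitude boxes, boxes with fewer than 10 soundings are dropped, and the RMSE between the box means is computed. The data here are synthetic (a smooth background plus Gaussian noise with assumed standard deviations), and the function names are hypothetical; the sketch only illustrates how random noise averages down with box size.

```python
import numpy as np

def box_average(lat, lon, airs, cris, box_deg, min_count=10):
    """Average matched AIRS/CrIS values in box_deg x box_deg boxes,
    keeping only boxes that contain at least min_count soundings."""
    keys = np.stack([np.floor(lat / box_deg), np.floor(lon / box_deg)], axis=1).astype(int)
    _, inverse, counts = np.unique(keys, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.ravel()  # guard against shape differences across NumPy versions
    airs_mean = np.bincount(inverse, weights=airs) / counts
    cris_mean = np.bincount(inverse, weights=cris) / counts
    keep = counts >= min_count
    return airs_mean[keep], cris_mean[keep], counts[keep]

def rmse(a, b):
    """Root mean squared difference between two arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic matched soundings (hypothetical values).
rng = np.random.default_rng(0)
n = 20000
lat = rng.uniform(20, 60, n)
lon = rng.uniform(-150, -110, n)
truth = 0.3 + 0.005 * (lat - 20)          # smooth synthetic background
cris = truth + rng.normal(0, 0.08, n)     # assumed lower-noise instrument
airs = truth + rng.normal(0, 0.5, n)      # assumed higher-noise instrument

rmse_by_box = {}
for box in (1, 2, 5, 10):
    a, c, counts = box_average(lat, lon, airs, cris, box)
    rmse_by_box[box] = rmse(a, c)
    print(f"{box} deg boxes: {counts.size} boxes kept, RMSE = {rmse_by_box[box]:.3f}")
```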
In Fig. d, we see that the AIRS value is biased low compared to CrIS . It is not clear if this bias in the AIRS data is best parameterized as a function of , as was the case for CrIS, or if another parameter is a better predictor. were able to derive the CrIS bias correction through comparison between CrIS and in situ background values. In this work, the need to average a significant number of AIRS soundings to reduce the random sounding-to-sounding noise makes it difficult to identify any relationship between AIRS values and column amounts.
Figure 13. (a) Noise equivalent spectral radiance (NESR) from the channels in the spectral windows used in the CrIS and AIRS PAN retrievals. The circles give the median NESR value per channel and the error bars show the 25th to 75th percentile range. The medians and percentiles are computed over all soundings from the training and test data listed in Table . (b) As (a), but with the NESR values given as a percentage of the corresponding observed radiance.
[Figure omitted. See PDF]
3.4 Uncertainty estimates and vertical sensitivity
The CrIS radiance noise is lower than the AIRS radiance noise, which is a significant advantage when retrieving species, such as PANs, with only weak absorption features. Figure shows per-channel median and 25th to 75th percentile noise equivalent spectral radiance (NESR) values. Although we use different frequencies in the AIRS and CrIS retrievals, the AIRS NESR values are systematically greater than the CrIS values. Taking all of our test cases for comparing AIRS and CrIS (Table ), we find that the median ratio of AIRS to CrIS NESR across all channels is 5.9. Assuming that single sounding uncertainty scales linearly with radiance noise, that suggests that the AIRS single sounding uncertainty in should be approximately 0.5 , that is, approximately six times the 0.08 value calculated for CrIS. This aligns with the correlation between AIRS and CrIS shown in Fig. , which shows that AIRS values below 0.5 are dominated by random uncertainty without significant averaging. We also checked the correlation between individual AIRS and CrIS values in Fig. , and similarly see that the values have a spread of 0.5 . While we expect the error of individual soundings to vary depending on the specific atmospheric and surface conditions for each sounding, we believe 0.5 to be a reasonable estimate of the typical uncertainty in the AIRS data.
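The scaling argument above can be written out explicitly. The symbols $\sigma_{\mathrm{AIRS}}$ and $\sigma_{\mathrm{CrIS}}$ are introduced here only for illustration, and the values are in the same units as those quoted in the text; the linear scaling with radiance noise is the assumption already stated above.

```latex
\sigma_{\mathrm{AIRS}}
  \approx \frac{\mathrm{NESR}_{\mathrm{AIRS}}}{\mathrm{NESR}_{\mathrm{CrIS}}}\,\sigma_{\mathrm{CrIS}}
  \approx 5.9 \times 0.08
  \approx 0.47
  \approx 0.5
```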
Figure 14. A heatmap showing the distribution of AIRS compared to the corresponding CrIS with averaging kernel adjustment and bias correction. Unlike Fig. , there is no averaging; this is a comparison of individual soundings.
[Figure omitted. See PDF]
Figure 15. The left two panels (a, c) show column averaging kernels for the free tropospheric column average quantities from AIRS (top) and CrIS (bottom). The values are the dot product of the free tropospheric pressure weighting function with the averaging kernel matrix; thus, the averaging kernels shown are weighted by each level's contribution to the column average. The right panel shows the sum across rows of the averaging kernel . This quantity estimates the sensitivity of the column to a given level, without the pressure weighting function. The kernels shown in all panels are the medians in 5 surface temperature bins from the good-quality soundings of the US West Coast Fires domain on 11 September 2020. For CrIS, good quality is defined using the standard MUSES quality flag. For AIRS, it uses the PC-based, emissivity, and decision-tree based filters as described at the end of Sect. .
[Figure omitted. See PDF]
Figure compares the pressure-weighted column averaging kernels and the sum across the rows of the averaging kernel for AIRS and CrIS for good quality land soundings within the US West Coast Fires domain on 11 September 2020. The averaging kernels shown are the medians of averaging kernels for soundings binned by surface temperature. For both instruments, maximum sensitivity shifts to lower pressure with decreasing surface temperature. However, compared to CrIS, AIRS maximum sensitivity decreases more quickly as surface temperature decreases. We suspect this is due to the greater noise present in the AIRS radiances, which would cause AIRS sensitivity to fall off faster as thermal contrast is reduced. However, we have not confirmed this hypothesis. Note that, for both instruments, the averaging kernels shown in the left panels incorporate the pressure weighting function, which is why the values are well below 1.
Figure 16. (a) A 2D histogram of degrees of freedom of signal vs. retrieved from CrIS. (b) A histogram of the CrIS degrees of freedom. (c) As (a), but for AIRS. (d) As (b), but for AIRS. All panels are from the 11 September 2020 scene over the US West Coast Fires shown in Fig. . No soundings were removed by filtering.
[Figure omitted. See PDF]
Figure shows the overall degrees of freedom (DOF) of signal for both the AIRS and CrIS products in the 11 September 2020 US West Coast Fire scene. From Fig. a and b, we can see that the DOFs for the CrIS PAN product are grouped around 1, indicating that there is essentially always enough information to retrieve a single piece of vertical information in the form of a column average. In contrast, Fig. c and d show that the AIRS DOFs are lower (centered around 0.5) with a wider distribution. Greater AIRS values do tend to be associated with greater DOFs. This implies that the AIRS product will retain influence from the prior, particularly in background conditions, but can detect sufficiently large PAN enhancements.
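The quantities discussed in this section can all be computed from the averaging kernel matrix. The sketch below uses a small, made-up 3-level averaging kernel and pressure weighting function; the function names are hypothetical. The degrees of freedom of signal are taken as the trace of the averaging kernel (the standard definition), and the unweighted sensitivity is computed here as a vector of ones multiplied into the averaging kernel, which is our reading of "the sum across rows".

```python
import numpy as np

def column_averaging_kernel(pressure_weights, ak):
    """Pressure-weighted column averaging kernel: weights dotted with the AK matrix."""
    return pressure_weights @ ak

def unweighted_sensitivity(ak):
    """Column sensitivity to each level without pressure weighting.
    ASSUMPTION: computed as ones @ AK."""
    return np.ones(ak.shape[0]) @ ak

def degrees_of_freedom(ak):
    """Degrees of freedom of signal, i.e. the trace of the averaging kernel."""
    return float(np.trace(ak))

# Hypothetical 3-level averaging kernel and pressure weighting function.
ak = np.array([[0.2, 0.1, 0.0],
               [0.1, 0.3, 0.1],
               [0.0, 0.1, 0.1]])
h = np.array([0.5, 0.3, 0.2])   # weights sum to 1

col_ak = column_averaging_kernel(h, ak)
row_sum = unweighted_sensitivity(ak)
dofs = degrees_of_freedom(ak)
print(col_ak, row_sum, dofs)
```

Because the pressure weights sum to 1, the weighted kernel values are necessarily smaller than the unweighted sums, consistent with the note above that the left-panel values are well below 1.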
4 Recommendations for use
The primary benefit to a retrieval of PAN from AIRS is the longer record available from AIRS compared to CrIS. We envision two primary use cases for this product. The first use case is tracking long term changes in background PAN levels. Given the sounding-to-sounding variation in the AIRS values, this will require significant averaging to discern trends in from AIRS. However, Fig. does show that the root mean squared error between AIRS and CrIS is 0.1 when averaged to a 5° × 5° or 10° × 10° box, which is comparable to the CrIS PAN errors. This does not imply that the overall error is 0.1 (as the AIRS and CrIS retrievals could have similar systematic errors), only that the AIRS and CrIS records will be consistent to within 0.1 with similar averaging. Table gives ranges of the number of points in each box size from Fig. . Based on this information, our first recommendation is that users interested in trends in background PAN from the AIRS product choose a spatiotemporal averaging window that has a median of at least 140 soundings passing our quality screening per window, which will result in a typical difference versus CrIS of about 0.1 . This is chosen as the median number of points (to two significant figures) in a 5° × 5° box (Table ), as Fig. shows this box size is sufficient to reduce the RMSE between AIRS and CrIS to 0.1 . In principle, it should not matter whether the 140 soundings are accumulated by averaging in time or space, as we assume the AIRS-CrIS differences are as uncorrelated in time as they are in space. We expect this assumption to hold true as long as episodic events that significantly perturb PAN concentrations (such as wildfires) are not included in the time period averaged. We will test this assumption in the future as more data becomes available.
Table 6. Distribution of the number of points in the different sized boxes used for the AIRS-CrIS comparisons.
| Box width | 1st pct. | 25th pct. | Median | 75th pct. | 99th pct. |
| 1° | 10 | 12 | 15 | 22 | 46 |
| 2° | 10 | 23 | 38 | 58 | 154 |
| 5° | 12 | 62 | 137 | 238 | 747 |
| 10° | 25 | 150 | 406 | 642 | 2140 |
The second use case is investigating PAN from extreme events, such as wildfires, before the start of the CrIS PAN product. We showed in Fig. that the AIRS PAN product does reliably see significant values of 0.5 to 1 . However, users should be aware that there are many cases where a high AIRS value within a small spatial area is false. Figure shows that there is a large fraction of AIRS soundings with 0.5 that match with CrIS soundings with 0.5 . Therefore, users looking for PAN caused by extreme events should
ensure that high values are spatially connected (as a contiguous plume is more likely to be a real signal than a spurious single-sounding error), and
check for other species expected to be generated by the event of interest, such as for wildfires.
These two criteria should help users filter out false positive high values. When using other species of interest, users need not restrict themselves to TROPESS products – any good-quality dataset will be useful in this regard. Users interested in extreme events with large values should note that the decision tree-based filter can remove soundings with clear PAN enhancements. Custom filtering using only the PC- and emissivity- based filters can be used in such cases to recover the soundings with enhanced PAN; however, users must be aware that the difference with respect to CrIS will likely be larger in such a case. Further, while we believe that the PC-based filter is able to remove most cloud-affected soundings, there may be cases where it is not fully effective. Thus, we encourage users to engage with the algorithm team if there is concern about whether a signal of interest in the AIRS PAN is correct. As stated in Sect. , users should use a 0.5 uncertainty per sounding when using individual soundings in their analysis.
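The first criterion above (spatial connectivity of high values) could be checked on gridded data along the following lines. This is only a sketch under stated assumptions: the data are assumed to be already gridded, and the threshold, the minimum cluster size, and the use of 4-neighbor connectivity are hypothetical choices, not recommendations from this work.

```python
from collections import deque

import numpy as np

def contiguous_plume_mask(field, threshold, min_cells=3):
    """Mark grid cells above threshold that belong to a 4-connected cluster
    of at least min_cells cells; isolated high cells are left unmarked."""
    high = np.asarray(field) > threshold
    seen = np.zeros(high.shape, dtype=bool)
    keep = np.zeros(high.shape, dtype=bool)
    nrow, ncol = high.shape
    for r0 in range(nrow):
        for c0 in range(ncol):
            if not high[r0, c0] or seen[r0, c0]:
                continue
            # Breadth-first search to collect one connected cluster.
            queue, cluster = deque([(r0, c0)]), []
            seen[r0, c0] = True
            while queue:
                r, c = queue.popleft()
                cluster.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < nrow and 0 <= cc < ncol
                            and high[rr, cc] and not seen[rr, cc]):
                        seen[rr, cc] = True
                        queue.append((rr, cc))
            if len(cluster) >= min_cells:
                for r, c in cluster:
                    keep[r, c] = True
    return keep

# Hypothetical gridded values: one 3-cell cluster and one isolated high cell.
field = np.array([
    [0.1, 0.9, 0.8, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.9],
])
plume = contiguous_plume_mask(field, threshold=0.5, min_cells=3)
print(plume.astype(int))
```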
5 Conclusions
We have demonstrated the ability to retrieve free tropospheric column amounts of PAN from AIRS spectra. This is more challenging than the existing CrIS retrieval due to the higher radiance noise in AIRS than CrIS and the presence of a gap in the AIRS spectra on the low-frequency side of the PAN spectral feature. The AIRS PAN retrieval is also sensitive to low, warm clouds over oceans, which cause spurious PAN signals in the AIRS PAN retrieval. These spurious signals have been successfully removed with a PC-based filter in testing, but further adjustment may be needed to make this filter fully effective at removing these signals.
The AIRS product does have larger errors than the CrIS product and requires care in its application. This is mitigated by the use of a decision tree-based quality filter trained to identify AIRS soundings with values significantly different from the nearest CrIS sounding and by averaging sufficient numbers of AIRS soundings. For studies of background PAN concentrations, we recommend averaging at least 140 AIRS soundings, which will result in a 0.1 error relative to the existing CrIS PAN product. This product opens up the potential for a global record of free tropospheric PAN amounts from 2002 to the present, potentially allowing the evaluation of trends in background PAN for over two decades.
This product is planned for inclusion in the TROPESS forward stream, provided AIRS continues to operate. This would allow us to evaluate its performance over a larger range of times than was possible during development. Future work could take advantage of that data set to further test the effectiveness of the PC-based filter on the effects of clouds over ocean. This can also enable us to explore alternative methods of retrieving PAN from AIRS taking advantage of, e.g., more advanced machine learning methods trained to directly retrieve , that may reduce the sounding-to-sounding noise. An interesting experiment would be to test whether a well-designed machine learning approach could be trained to directly predict the value CrIS would retrieve given only the AIRS radiances.
Appendix A Decision tree explainability
We use SHapley Additive exPlanations
Figure A1
Beeswarm plot showing the Shapley values for the 13 input variables to the quality filtering decision trees that have non-zero contributions to the output flag. The meaning of each input variable's short name is given in Table
[Figure omitted. See PDF]
Some of the relationships shown in Fig.
Res. Norm. Init., the pre-PAN residual, follows the expected pattern where smaller residuals are more likely to yield a good sounding. Since this is the pre-PAN retrieval residual, this suggests that a successful retrieval is highly dependent on the previous steps minimizing the observation/model mismatch from other atmospheric parameters. This is a reasonable relationship, as PAN is a weaker absorber than the trace gases optimized in a previous step (Table
Rad. resid. mean, the mean of the post-PAN residual, should indicate how well the posterior solution matches the observed radiances. This is also a sensible metric, as it indicates how well the optimization algorithm minimized the cost function, so lower values should correlate with a higher probability of the sounding being marked as good quality, which is what we see in the land model.
Several of the other relationships from Fig.
Rad. resid. std. dev. shows another unexpected relationship. This is the standard deviation of the post-PAN radiance residuals. Larger values generally indicate that there is a lot of variation in how well the posterior state matched the observed radiances. This can occur if, e.g., narrow features in the radiances were not fit well by the posterior state, causing specific frequencies to have large residuals. Interestingly, both models associate greater values with improved likelihood of a sounding being good quality. This may indicate that there are narrower absorption features than PAN which the optimization can try to fit incorrectly with PAN, and thus the decision tree is identifying that soundings where these non-PAN absorption features are not erroneously fit are more likely to be good quality. However, this is speculative. Alternatively, it may be similar to mean cloud optical depth and SNR, where this is simply identifying cases where the posterior is similar to the prior. Since the AIRS and CrIS retrievals use the same prior, this would result in consistent values for the same reasons as mentioned for cloud optical depth and SNR.
For the remaining features, we mostly see them centered on zero impact, with the long tails to either end having a mix of high and low input values. This indicates that there is not a clear correlation between these input values and the predicted quality.
Appendix B Physical interpretation of emissivity interference
Figure
Figure
Figure B1
(a) A map of surface emissivity at 1025
[Figure omitted. See PDF]
Appendix C Using a single decision tree for quality filtering
For the decision tree-based filter described in Sect.
Figure
However, when we compare Fig.
Figure C1
This figure is the same as Fig.
[Figure omitted. See PDF]
Figure C2
This figure is the same as Fig.
[Figure omitted. See PDF]
Code and data availability
A Jupyter notebook to reproduce the figures in this paper is available at 10.5281/zenodo.15305278
Author contributions
VHP developed the initial concept and secured funding. SSK performed the initial experiments to test if the AIRS PAN retrieval was practical and additional tests to determine the best choice of microwindows. JLL identified the possible sets of microwindows to use and carried out the other work described in this paper with input from VHP and SSK. JLL wrote the manuscript; VHP and SSK reviewed the manuscript.
Competing interests
The contact author has declared that none of the authors has any competing interests.
Disclaimer
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.
Acknowledgements
This research was carried out at the Jet Propulsion Laboratory (JPL), California Institute of Technology, under a contract with NASA (80NM0018D0004). Computational support was provided by the TROPESS Scientific Computing Facility at JPL. Parts of the results in this work make use of the colormaps in the CMasher package
Financial support
This research has been supported by the National Aeronautics and Space Administration (contract no. 80NM0018D0004).
Review statement
This paper was edited by Folkert Boersma and reviewed by two anonymous referees.
© 2026. This work is published under https://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.