
Introduction

Carbon in the atmosphere is present mostly in the form of carbon dioxide ( ${CO}_{2}$ ). Its amount is relatively small compared to the amount of carbon present in other reservoirs like the ocean . Being well mixed, atmospheric ${CO}_{2}$ is nevertheless easier to monitor by measurements than other carbon reservoirs. To improve the monitoring of atmospheric ${CO}_{2}$ , one can combine atmospheric ${CO}_{2}$ measurements with a numerical model. This paper describes such a system, which has been developed for the Copernicus Atmosphere Monitoring Service (CAMS).

Rather than using the relatively sparse network of the surface air-sample measurements, here we explore the measurements from satellite sounders in order to have a more global picture of the atmospheric ${CO}_{2}$ . To extract information on the ${CO}_{2}$ content in the atmosphere, passive atmospheric remote sounders measure in the thermal infrared (TIR) or in the near infrared/short-wave infrared (NIR/SWIR).

The Atmospheric Infrared Sounder (AIRS), measuring in the TIR, detects thermal radiation emitted by the Earth's surface and the atmosphere . The assimilation of the AIRS observed radiances was developed by at the European Centre for Medium-Range Weather Forecasts (ECMWF) using a four-dimensional variational (4-D-Var) data assimilation scheme. Their results showed the potential of data assimilation to constrain atmospheric ${CO}_{2}$ . They also showed the limitations of the assimilation of AIRS radiances, in particular due to the vertical sensitivity of the sounder. Due to the low thermal contrast between the Earth's surface and the air masses above, AIRS measurements have limited or no sensitivity to the lower troposphere and higher sensitivity to the middle atmosphere. Because the signals of the ${CO}_{2}$ surface sources and sinks are larger near the surface and in the lower troposphere than in the middle atmosphere, AIRS measurements were not able to capture these signals.

In contrast, column-averaged dry-air mole fractions of ${CO}_{2}$ (or ${XCO}_{2}$ ) with a high near-surface sensitivity are retrieved from NIR/SWIR measurements based on scattered and back-scattered solar radiation; however, the NIR/SWIR measurements also have their limitations. They need sunlight and are therefore limited to daytime observations. Sufficiently cloud-free conditions and a low aerosol optical depth are also needed for accurate ${XCO}_{2}$ retrievals.

The aim of this study is to document the assimilation of ${XCO}_{2}$ products from NIR/SWIR measurements in order to constrain atmospheric ${CO}_{2}$ and to document how the assimilation impacts the simulated atmospheric ${CO}_{2}$ concentration. For that purpose, we assimilated the ${XCO}_{2}$ products derived from the NIR/SWIR spectra of the Greenhouse gases Observing Satellite (GOSAT). The assimilation system is based on the ECMWF system of , which has lately evolved for CAMS in order to assimilate retrieved products instead of observed radiances .

The assimilation system provides an analysis of the atmospheric ${CO}_{2}$ concentration that is then integrated in time using a forecast model. The ${CO}_{2}$ forecast model used in this study is documented by . In this model, the production and loss of ${CO}_{2}$ at the surface are based on surface fluxes that are partially prescribed and partially modelled. These ${CO}_{2}$ surface fluxes are not directly constrained by observations and they may deviate from reality. The accumulation of surface flux errors then leads to biases in the atmospheric ${CO}_{2}$ . On the other hand, the strength of the ${CO}_{2}$ forecast model is its ability to provide a realistic ${CO}_{2}$ synoptic variability. The first objective of this study is to determine the quality of the ${XCO}_{2}$ fields resulting from the assimilation of GOSAT ${XCO}_{2}$ data with a ${CO}_{2}$ forecast model where the ${CO}_{2}$ surface fluxes are not constrained.

The atmospheric ${CO}_{2}$ synoptic variability on a regional scale is related to the passage of frontal systems . These events are difficult to capture with the GOSAT measurements as the availability of the data is limited due to cloud contamination. Therefore, the second objective of this study is to document whether the assimilation helps improve the simulation of atmospheric ${CO}_{2}$ for synoptic events despite the lack of measurements near frontal systems.

Within CAMS, ECMWF is providing a ${CO}_{2}$ analysis based on the assimilation of GOSAT ${XCO}_{2}$ data with a delay of 5 days behind real time. A 10-day forecast is then issued from the analysis in order to provide the atmospheric ${CO}_{2}$ field in real time and for the next few days. The last objective of this study is to assess the quality of this forecast. The forecast quality as a function of the lead time and the season is evaluated against the analysis.

This paper is structured as follows. Section introduces the data sets used in this study. Section describes our atmospheric ${CO}_{2}$ simulations with and without assimilation of the GOSAT ${XCO}_{2}$ data, and how we compared them with independent measurements. Sections to present the global evaluation of our simulations, a case study and the evaluation of the ${CO}_{2}$ forecast based on the analysis. Finally, Sect. presents our conclusions.

Data sets

In this study, we used two sets of data. The first comprises the measurements from GOSAT's Fourier transform spectrometer and the ${XCO}_{2}$ product retrieved from these measurements by the University of Bremen (UoB); it is described in Sect. . The second is the collection of measurements provided by the Total Carbon Column Observing Network (TCCON); it is described in Sect. .

GOSAT ${XCO}_{2}$

GOSAT is a joint effort between the Japan Aerospace Exploration Agency (JAXA), the National Institute for Environmental Studies (NIES) and the Japanese Ministry of the Environment (MOE) as part of the Global Change Observation Mission (GCOM) programme of Japan. GOSAT was launched on 23 January 2009 and carries the thermal and near-infrared sensor for carbon observations, which consists of a Fourier transform spectrometer (TANSO-FTS) and a cloud and aerosol imager (TANSO-CAI).

In this study, we used ${XCO}_{2}$ retrieved from TANSO-FTS measurements of the upwelling radiance at the top of the atmosphere by the Bremen Optimal Estimation DOAS (differential optical absorption spectroscopy) (BESD) algorithm of UoB. The BESD algorithm was initially developed to retrieve ${XCO}_{2}$ from nadir measurements of the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) remote sensing spectrometer on the ENVIronment SATellite (ENVISAT). The BESD algorithm has been modified to also retrieve ${XCO}_{2}$ from GOSAT measurements. A detailed description of the GOSAT BESD algorithm can be found in . In brief, the algorithm uses three fitting windows, the $O_{2}$ -A band (12 920–13 195 ${cm}^{- 1}$ ), a weak ${CO}_{2}$ absorption band (6170–6278 ${cm}^{- 1}$ ) and a strong ${CO}_{2}$ band (4804–4896 ${cm}^{- 1}$ ) from both the medium- and high-gain (respectively M-gain and H-gain) GOSAT nadir modes. An optimal estimation-based inversion technique is used to derive the most likely atmospheric state from every individual GOSAT measurement using a priori knowledge. The BESD algorithm explicitly accounts for atmospheric scattering by clouds and aerosols, reducing potential systematic biases. The scattering information on clouds and aerosols is mainly obtained from the $O_{2}$ -A and strong ${CO}_{2}$ absorption bands.

We used an inhomogeneous GOSAT BESD ${XCO}_{2}$ data set in this study as the GOSAT BESD algorithm was still under development. This intermediate version of the GOSAT BESD ${XCO}_{2}$ data is referred to as MACC GOSAT BESD ${XCO}_{2}$ (MACC standing for Monitoring Atmospheric Composition and Climate, the precursor of CAMS). Nevertheless, from the beginning of 2014 onwards, we have been assimilating the current version of the GOSAT BESD data (v01.00.02) in near-real time.

The TANSO-FTS detector has a circular field of view with a diameter of 10.5 $km$ when projected on the Earth's surface (at exact nadir). In 2013, it measured in a mode with three measurements across track, and the footprints were separated by $\sim 263$ $km$ across track and $\sim 283$ $km$ along track. GOSAT can also operate in target mode, resulting in a finer sampling distance. For these specific situations, we further thinned the observations on a $1^{\circ} \times 1^{\circ}$ grid by removing all the observations but one chosen at random. This procedure avoids having several measurements in the same model grid cell during the assimilation. This thinning, together with the characteristics of the instrument (measurement only during sunlit periods) and the level-2 data processing (retrievals for clear-sky conditions and only over land), reduces the number of GOSAT ${XCO}_{2}$ data to about 100 per day. With an assimilation window of 12 h, about 50 GOSAT ${XCO}_{2}$ data points are therefore assimilated during each time window.
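The thinning step described above can be illustrated with a minimal Python sketch. The function name `thin_observations`, the cell indexing, the fixed random seed and the sample coordinates are assumptions made for illustration; they are not taken from the operational pre-processing.

```python
import numpy as np

def thin_observations(lat, lon, cell_deg=1.0, seed=0):
    """Return sorted indices of the observations kept after thinning.

    One observation, chosen at random, is kept in every occupied
    cell_deg x cell_deg latitude-longitude cell.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    rng = np.random.default_rng(seed)

    # Integer cell indices; longitudes are wrapped to [0, 360) first.
    i_lat = np.floor((lat + 90.0) / cell_deg).astype(int)
    i_lon = np.floor((lon % 360.0) / cell_deg).astype(int)
    n_lon = int(np.ceil(360.0 / cell_deg))
    cell_id = i_lat * n_lon + i_lon

    kept = []
    for cell in np.unique(cell_id):
        members = np.flatnonzero(cell_id == cell)
        kept.append(rng.choice(members))  # random pick within the cell
    return np.sort(np.array(kept, dtype=int))

# Hypothetical soundings: three in one cell (target mode), two isolated.
lat = [48.10, 48.40, 48.90, 10.20, -33.50]
lon = [11.20, 11.70, 11.50, 20.30, 151.00]
kept = thin_observations(lat, lon)
print(kept)  # three indices: one of 0/1/2, then 3 and 4
```

Isolated soundings always survive, and only cells holding several soundings are reduced, which is the behaviour the text describes for target-mode observations.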

Example of the distribution of the assimilated GOSAT BESD ${XCO}_{2}$ data: July 2013 (top panel, about 3400 retrievals) and October 2013 (bottom, about 1270 retrievals). The monthly data are here aggregated on a $2^{\circ} \times 2^{\circ}$ grid and averaged. The blue/red represents the low/high averaged ${XCO}_{2}$ values in $ppm$ .

[Figure omitted. See PDF]

The geographic distribution of these data is dependent on the season and the atmospheric conditions as illustrated by Fig. . For example, in July 2013, GOSAT BESD data are available up to 75 $^{\circ}$ N, and in October 2013 they are available only up to 60 $^{\circ}$ N. The reason for this is the solar geometry and the filtering of measurements under high solar zenith angle (SZA) conditions, where ${XCO}_{2}$ is more challenging to retrieve as the impact of atmospheric scattering becomes larger compared to low-SZA conditions. Other data gaps are due to the strict cloud filtering and other types of filtering, like those based on the quality of the spectral fits, on scattering parameters, on the meteorological state, and on the measurement geometry.

The MACC GOSAT BESD ${XCO}_{2}$ data sets have been bias-corrected using the TCCON data. As this data set is delivered in near-real time and the TCCON data are delivered with a delay of a few months, it was not possible to directly compare the two data sets. Instead, the TCCON data from the previous year were used and they were corrected assuming a 2 parts per million ( $ppm$ ) global atmospheric growth of ${CO}_{2}$ . A global offset was then computed and applied to the MACC GOSAT BESD ${XCO}_{2}$ based on the comparison between this data set and the corrected TCCON data set of the previous year. Moreover, with this procedure the TCCON data used in this study (same year as for the MACC GOSAT BESD ${XCO}_{2}$ data set) can be considered as independent data.
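The offset computation can be sketched as follows. This is only an illustration of the arithmetic: how GOSAT and TCCON values were actually paired is not specified here, so the element-wise pairing, the function names and the sample values are assumptions.

```python
import numpy as np

GROWTH_PPM_PER_YEAR = 2.0  # global growth assumed in the text

def global_offset(gosat_xco2, tccon_prev_year_xco2):
    """Mean of GOSAT minus growth-corrected previous-year TCCON (ppm).

    Assumption: the two arrays are already paired element by element.
    """
    gosat = np.asarray(gosat_xco2, dtype=float)
    tccon = np.asarray(tccon_prev_year_xco2, dtype=float)
    tccon_corrected = tccon + GROWTH_PPM_PER_YEAR
    return float(np.mean(gosat - tccon_corrected))

def apply_bias_correction(gosat_xco2, offset):
    """Remove the global offset from the GOSAT values."""
    return np.asarray(gosat_xco2, dtype=float) - offset

# Hypothetical paired values in ppm (not real data).
gosat = [396.0, 397.5, 395.5]
tccon_prev = [393.0, 394.0, 393.5]

offset = global_offset(gosat, tccon_prev)
corrected = apply_bias_correction(gosat, offset)
print(round(offset, 3))
```

After the correction, the mean difference between the GOSAT values and the growth-corrected TCCON values is zero by construction, while the station-to-station and temporal structure of the differences is left untouched.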

For the assimilation, the observation error covariances have to be specified. In this study, we assumed that the observation errors are not correlated in space and time. For the standard deviation of the observation error, we used the uncertainty of the BESD ${XCO}_{2}$ product provided together with the data. The BESD ${XCO}_{2}$ uncertainty product accounts for the various sources of uncertainty of the retrieval process. It varies in time and space around an average value of 2 $ppm$ . We furthermore established that the specified observation error based on the ${XCO}_{2}$ uncertainty globally matches the expected observation error using diagnostics posterior to the analysis (not shown).

TCCON ${XCO}_{2}$

The TCCON is a network of ground-based Fourier transform spectrometers recording direct solar spectra in the near infrared spectral region (http://tccon.ornl.gov/). The column-averaged dry-air mole fractions of ${CO}_{2}$ are retrieved from these spectra together with other chemical components of the atmosphere . In 2014, the version GGG2014 of the TCCON data was released. The errors on the retrieved ${XCO}_{2}$ are documented to be below 0.25 % ( $\sim 1$ $ppm$ ) for solar zenith angles up to 82 $^{\circ}$ (Wunch et al., 2015).

When we downloaded the GGG2014 data in November 2015, 20 TCCON stations were providing data within the time period we are interested in (year 2013). Not all the stations were used in this study. First we removed JPL 2011 (USA), Pasadena/Caltech (USA) and Tsukuba (Japan), as they are not background stations and are associated with significant representativity errors. We also removed Edwards (USA). This station started to retrieve data from the middle of the year 2013, and we assumed that this was not long enough to provide information on the seasonal variation of the error in our simulations. Additionally, we removed Eureka (Canada) from the list of stations as the site was providing data during only 3 days in 2013. This selection of the TCCON stations left 16 stations for the study (Table ).

Orléans (France) had a specific treatment compared to the other stations. The averaging kernels were not specified in the GGG2014 release. Therefore, we decided to use the same information as for Lamont (USA) as advised in the previous release of the TCCON data (version GGG2012).

List of the TCCON stations used, ordered by latitude from north to south.

Site	Lat	Long	Starting date
Sodankylä (sodankyla01)	67.37	26.63	6 Feb 2009
Białystok (bialystok01)	53.23	23.02	1 Mar 2009
Bremen (bremen01)	53.10	8.85	6 Jan 2005
Karlsruhe (karlsruhe01)	49.10	8.44	19 Apr 2010
Orléans (orleans01)	47.97	2.11	29 Aug 2009
Garmisch (garmisch01)	47.48	11.06	16 Jul 2007
Park Falls (parkfalls01)	45.94	$-$ 90.27	26 May 2004
Four Corners (fourcorners01)	36.80	$-$ 108.48	1 Mar 2011
Lamont (lamont01)	36.60	$-$ 97.49	6 Jul 2008
Saga (saga01)	33.24	130.29	28 Jul 2011
Izaña (izana01)	28.30	$-$ 16.48	18 May 2007
Ascension Island (ascension01)	$-$ 7.92	$-$ 14.33	22 May 2012
Darwin (darwin01)	$-$ 12.43	130.89	28 Aug 2005
Réunion Island (reunion01)	$-$ 20.90	55.49	6 Oct 2011
Wollongong (wollongong01)	$-$ 34.41	150.88	26 Jun 2008
Lauder 125HR (lauder02)	$-$ 45.05	169.68	2 Feb 2010

Experimental setup

We ran two model simulations for the year 2013. The first is similar to the operational CAMS ${CO}_{2}$ forecast and is referred to as the “free run”. This simulation is used as the reference to assess the impact of the assimilation of the GOSAT BESD ${XCO}_{2}$ data. The second simulation, in which the GOSAT ${XCO}_{2}$ data are assimilated, is referred to as the “analysis”. The configuration of both simulations is described in Sect. . The simulations were evaluated against each other and also against the TCCON data. Section introduces the methodology used to compare the simulations with the TCCON data.

Model simulations

The global simulations of atmospheric ${CO}_{2}$ are performed within the numerical weather prediction (NWP) framework of the Integrated Forecasting System (IFS). The ${CO}_{2}$ mass mixing ratio is directly transported within IFS as a tracer and is affected by surface fluxes. The transport is computed online and is updated every 12 $h$ , benefiting from the assimilation of all the operational observations within the IFS 4-D-Var assimilation system. The terrestrial biogenic carbon fluxes are also computed online by the carbon module of the land surface model (Carbon-TESSEL or CTESSEL), while other prescribed fluxes are read from ${CO}_{2}$ surface flux inventories (see , for more details).

The ability to assimilate retrieval products from GOSAT was included in IFS and is detailed in for the assimilation of methane data. The system used in this study is similar to the one of and is based on fixed background errors derived from the National Meteorological Center (NMC) method . The standard deviation of the background error is constant for each model level and slowly increases from the upper troposphere to the lower troposphere with values from about 1 to about 5 $ppm$ , and then rapidly increases to reach a value of about 40 $ppm$ at the surface. The correlation of the background errors varies over the whole domain and vertically with a representative length scale of about 250 $km$ . The system does not account for the spatial or temporal correlation between the errors of the observations.

We chose in this study to have a horizontal resolution of TL255 on a reduced Gaussian grid ( $\sim 80 km \times 80 km$ ), and 60 vertical levels from the surface up to 0.1 $hPa$ . This resolution is sufficient for resolving the large- and synoptic-scale horizontal structures ( $\sim 1000 km$ ) of the atmospheric ${CO}_{2}$ fields.

Comparison with TCCON

To evaluate the quality of the model simulations (free run and analysis), we have extensively used the TCCON data in this study. The comparison is performed in the TCCON space using the TCCON a priori and averaging kernel information (see Appendix for more details). In order to have a decomposition of the errors of the model column-averaged ${CO}_{2}$ against the TCCON measurement, we computed for each TCCON station $k$ for $k \in [1, N]$ , the mean difference (or bias) $δ_{k}$ and the standard deviation of the difference (or scatter) $σ_{k}$ over the $M_{k}$ times $t_{i}$ for $i \in [1, M_{k}]$ when we have a TCCON observation for the station $k$ . If ${\hat{c}}_{k}^{o} (t_{i})$ for $i \in [1, M_{k}]$ is the observed TCCON ${XCO}_{2}$ time series for the station $k$ , and if ${\hat{c}}_{k} (t_{i})$ for $i \in [1, M_{k}]$ is the model equivalent time series, then the bias $δ_{k}$ and scatter $σ_{k}$ are defined by $\begin{matrix} δ_{k} = \frac{1}{M_{k}} \sum_{i = 1}^{M_{k}} [{\hat{c}}_{k} (t_{i}) - {\hat{c}}_{k}^{o} (t_{i})], \\ σ_{k} = \sqrt{\frac{1}{M_{k} - 1} \sum_{i = 1}^{M_{k}} {[{\hat{c}}_{k} (t_{i}) - {\hat{c}}_{k}^{o} (t_{i}) - δ_{k}]}^{2}} . \end{matrix}$ Additionally, we computed the correlation coefficient $r_{k}$ between ${\hat{c}}_{k} (t_{i})$ and ${\hat{c}}_{k}^{o} (t_{i})$ for $i \in [1, M_{k}]$ .
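The per-station statistics defined above translate directly into a few lines of NumPy. The sketch below assumes that the model series has already been brought into the TCCON space and matched in time with the observations; the function name and the sample values are illustrative only.

```python
import numpy as np

def station_statistics(model_xco2, tccon_xco2):
    """Bias, scatter and correlation of model against TCCON at one station.

    bias    : mean of (model - observation)
    scatter : sample standard deviation of (model - observation),
              normalised by M_k - 1 as in the definition above
    r       : Pearson correlation between model and observation
    """
    model = np.asarray(model_xco2, dtype=float)
    obs = np.asarray(tccon_xco2, dtype=float)
    diff = model - obs
    bias = float(diff.mean())
    scatter = float(diff.std(ddof=1))
    r = float(np.corrcoef(model, obs)[0, 1])
    return bias, scatter, r

# Hypothetical matched time series in ppm (not real data).
obs = [395.0, 396.0, 397.0, 396.5]
model = [394.0, 395.5, 396.0, 396.5]

bias, scatter, r = station_statistics(model, obs)
print(bias, scatter, r)
```

Because the bias is removed before the squares are summed, the scatter measures only the variable part of the model error and is unaffected by a constant offset at the station.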

Hovmöller diagram (latitude vs. time) of the smoothed bias (in ppm, negative/positive in blue/red) of the simulated ${XCO}_{2}$ against the data of the TCCON network, from 1 January to 31 December 2013. (a) Free-run simulation. (b) Analysis.

[Figure omitted. See PDF]

Following , we also computed the model offset $δ$ , the mean absolute error (MAE) $Δ$ , the station-to-station bias deviation $σ$ and the model precision $π$ for the $N$ TCCON stations $\begin{matrix} δ = \frac{1}{N} \sum_{k = 1}^{N} δ_{k}, & Δ = \frac{1}{N} \sum_{k = 1}^{N} | δ_{k} |, \\ σ = \sqrt{\frac{1}{N - 1} \sum_{k = 1}^{N} {[δ_{k} - δ]}^{2}}, & π = \frac{1}{N} \sum_{k = 1}^{N} σ_{k} . \end{matrix}$
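As a worked example of these network-wide statistics, the sketch below applies the four definitions to the free-run station biases and scatters listed in the station statistics table of this section. The function name and the `ddof` argument are illustrative choices, not part of the original processing.

```python
import numpy as np

def network_statistics(station_bias, station_scatter, ddof=1):
    """Offset, MAE, station-to-station deviation and precision (all in ppm)."""
    b = np.asarray(station_bias, dtype=float)
    s = np.asarray(station_scatter, dtype=float)
    offset = float(b.mean())             # delta
    mae = float(np.abs(b).mean())        # Delta
    deviation = float(b.std(ddof=ddof))  # sigma, N - 1 normalisation by default
    precision = float(s.mean())          # pi
    return offset, mae, deviation, precision

# Free-run values from the station statistics table, north to south.
free_run_bias = [-1.59, -2.68, -1.62, -1.26, -0.38, -0.92, -1.69, 0.69,
                 -0.20, -1.19, 0.27, 2.31, 1.57, 0.56, 0.30, 0.01]
free_run_scatter = [1.35, 1.96, 1.52, 1.72, 1.36, 1.59, 2.06, 1.76,
                    2.09, 1.61, 0.80, 1.29, 1.12, 0.73, 1.05, 0.83]

offset, mae, deviation, precision = network_statistics(free_run_bias,
                                                       free_run_scatter)
print(f"offset={offset:.2f} MAE={mae:.2f} precision={precision:.2f}")
```

The offset, MAE and precision obtained this way agree with the tabulated free-run values ( $-$ 0.36, 1.08 and 1.43 $ppm$ ). The deviation is not checked here because it depends on the normalisation chosen (`ddof`) and on the rounding of the tabulated station biases.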

The statistics for the comparisons of the simulations against the TCCON data have some gaps in time due to gaps in the availability of the TCCON data. They are also valid only where the TCCON sites are located, i.e. 16 points distributed over the globe. To have a more global overview of the model bias and scatter against the TCCON data, we smoothed these statistics in time and space (see Appendix for more details). In summary, for the bias we averaged all the model–measurement differences for each TCCON site using a 1-week time bin. We then fit the time evolution of the weekly bias with a function that combines a linear and a harmonic component for each station. The second step is an extrapolation in space. For each week, the weekly biases of every station are extrapolated using a quadratic function of latitude. This results in a Hovmöller diagram of the bias as a function of time and latitude. A similar process is applied for the scatter (see Figs. and ).
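The two smoothing steps can be sketched as below. The exact formulation is given in the Appendix and is not reproduced here, so this is a simplified illustration under stated assumptions: a single annual harmonic, 52 weekly bins, an ordinary least-squares fit, and synthetic station latitudes and biases. All function names are hypothetical.

```python
import numpy as np

N_WEEKS = 52  # assumption: the year is split into 52 weekly bins

def weekly_means(t_days, diff, n_weeks=N_WEEKS):
    """Average model-minus-TCCON differences in 1-week bins (NaN if empty)."""
    t_days = np.asarray(t_days, dtype=float)
    diff = np.asarray(diff, dtype=float)
    week = np.minimum((t_days // 7).astype(int), n_weeks - 1)
    out = np.full(n_weeks, np.nan)
    for w in np.unique(week):
        out[w] = diff[week == w].mean()
    return out

def fit_linear_harmonic(weekly_bias, period_weeks=float(N_WEEKS)):
    """Fit a0 + a1*w + a2*cos(2*pi*w/T) + a3*sin(2*pi*w/T); evaluate every week."""
    w = np.arange(weekly_bias.size, dtype=float)
    phase = 2.0 * np.pi * w / period_weeks
    design = np.column_stack([np.ones_like(w), w, np.cos(phase), np.sin(phase)])
    ok = np.isfinite(weekly_bias)  # weeks without data do not enter the fit
    coeffs, *_ = np.linalg.lstsq(design[ok], weekly_bias[ok], rcond=None)
    return design @ coeffs         # the fitted curve also fills the gaps

def extrapolate_in_latitude(station_lat, smoothed, lat_grid):
    """Quadratic fit in latitude for each week; returns (n_lat, n_weeks)."""
    out = np.empty((lat_grid.size, smoothed.shape[1]))
    for j in range(smoothed.shape[1]):
        coeffs = np.polyfit(station_lat, smoothed[:, j], deg=2)
        out[:, j] = np.polyval(coeffs, lat_grid)
    return out

# Synthetic demonstration with four hypothetical stations.
station_lat = np.array([67.4, 45.9, -12.4, -45.1])
days = np.arange(7 * N_WEEKS)
week_of_day = days // 7
smoothed = []
for k, lat in enumerate(station_lat):
    truth = (1e-4 * lat**2 - 0.01 * lat + 0.01 * week_of_day
             + 0.5 * np.sin(2.0 * np.pi * week_of_day / N_WEEKS))
    keep = np.ones(days.size, dtype=bool)
    if k == 2:  # mimic a station with a data gap of several weeks
        keep = (week_of_day < 26) | (week_of_day > 38)
    smoothed.append(fit_linear_harmonic(weekly_means(days[keep], truth[keep])))
smoothed = np.array(smoothed)

lat_grid = np.linspace(-90.0, 90.0, 37)
hovmoller = extrapolate_in_latitude(station_lat, smoothed, lat_grid)
print(hovmoller.shape)  # (37, 52): latitude by week
```

The synthetic gap shows why values in the diagram can exist for periods without measurements at a station: they come from the fitted curve rather than from data, which matters when interpreting features of the Hovmöller diagram.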

Statistics of the ${XCO}_{2}$ difference between the simulations (free run and analysis) and the average hourly TCCON data (model–TCCON): bias ( $δ_{k}$ , in $ppm$ ), scatter ( $σ_{k}$ , in $ppm$ ) and correlation coefficient ( $r_{k}$ ). Also shown are the mean, the mean absolute error (MAE) and the deviation of the stations bias (respectively $δ$ , $Δ$ and $σ$ , in $ppm$ ), the mean scatter ( $π$ , in $ppm$ ) and the mean $r$ (last three rows). The second column ( $N$ ) is the number of data used for computing the statistics.

		Free run			Analysis
Site	$N$	Bias	Scatter	$r$	Bias	Scatter	$r$
Sodankylä	20441	$-$ 1.59	1.35	0.91	$-$ 0.55	1.35	0.92
Białystok	16063	$-$ 2.68	1.96	0.81	$-$ 1.66	1.80	0.77
Bremen	4883	$-$ 1.62	1.52	0.79	$-$ 0.41	1.27	0.82
Karlsruhe	4201	$-$ 1.26	1.72	0.80	$-$ 0.25	1.54	0.82
Orléans	8444	$-$ 0.38	1.36	0.85	0.09	1.21	0.91
Garmisch	10371	$-$ 0.92	1.59	0.82	$-$ 0.29	1.62	0.80
Park Falls	27991	$-$ 1.69	2.06	0.81	$-$ 0.60	1.45	0.90
Four Corners	19872	0.69	1.76	0.58	0.57	1.43	0.74
Lamont	43731	$-$ 0.20	2.09	0.59	$-$ 0.04	1.35	0.80
Saga	10349	$-$ 1.19	1.61	0.75	$-$ 0.64	1.33	0.83
Izaña	4463	0.27	0.80	0.90	0.40	0.62	0.94
Ascension Island	7111	2.31	1.29	0.24	0.72	1.27	0.21
Darwin	29194	1.57	1.12	0.78	$-$ 0.02	1.04	0.79
Réunion Island	18880	0.56	0.73	0.76	$-$ 0.77	0.60	0.78
Wollongong	27562	0.30	1.05	0.71	$-$ 1.08	1.06	0.65
Lauder	53500	0.01	0.83	0.86	$-$ 0.97	0.59	0.85
Mean	16	$-$ 0.36	1.43	0.75	$-$ 0.34	1.22	0.78
MAE	16	1.08	–	–	0.57	–	–
Deviation	16	1.27	–	–	0.61	–	–

Same as Fig. but for the standard deviation, with yellow/red for low/high values.

[Figure omitted. See PDF]

Time series of ${XCO}_{2}$ (in ppm) at (a) Sodankylä, Finland; (b) Karlsruhe, Germany; (c) Park Falls, USA; and (d) Lauder, New Zealand, between 1 January and 31 December 2013. For each station, the top panel presents the daily averaged data from TCCON (black dots), the daily averaged data from GOSAT co-located in time and space with the station (yellow squares), the simulated ${XCO}_{2}$ (solid lines) and the daily averaged simulated ${XCO}_{2}$ in the observation space (coloured dots). The bottom panel presents the weekly averaged bias of the simulated ${XCO}_{2}$ against the TCCON data (coloured dots) and the smoothed bias (solid lines). Blue represents the free run, while red is for the analysis.

[Figure omitted. See PDF]

Global evaluation of the analysis

In this section we first present the characteristics of the ${XCO}_{2}$ derived from the free-run simulation when compared to the TCCON data. Second, we present the impact of the assimilation of the MACC GOSAT BESD ${XCO}_{2}$ comparing the ${XCO}_{2}$ from the analysis against the ${XCO}_{2}$ from the free run. Then, we discuss whether the analysis represents an improvement compared to the free run in terms of statistics against the TCCON data. Finally, we discuss the merits of the analysis compared to the MACC GOSAT BESD data using the TCCON data as a reference.

Free-run simulation vs. TCCON

When compared with the TCCON data, the free-run simulation has a mean offset $δ$ of $-$ 0.36 $ppm$ and a mean absolute error $Δ$ of 1.08 $ppm$ (Table ). However, the individual station bias $δ_{k}$ spans a range from 2.3 $ppm$ at Ascension Island (Saint Helena, Ascension and Tristan da Cunha) to $-$ 2.7 $ppm$ at Białystok (Poland). The station-to-station bias deviation $σ$ of the free-run simulation consequently has a value of 1.27 $ppm$ .

The variations of the bias as well as the seasonal cycle of the bias are highlighted in the Hovmöller diagram displayed in Fig. a. First, it shows that the initial condition of the free run has a positive bias of about 2 $ppm$ over the tropical region (region between 23 $^{\circ}$ S and 23 $^{\circ}$ N) when compared to the TCCON data. This bias is reduced during the spring and reappears the next summer. It reaches its highest values in autumn with more than 2 $ppm$ . These results are slightly different from those of , where the model bias was found to be more constant in the tropical region when comparing the background ${CO}_{2}$ in the marine boundary layer with the National Oceanic and Atmospheric Administration (NOAA) GLOBALVIEW-CO2. Here, the evaluation of the bias in the tropics is driven by the comparison with ${XCO}_{2}$ measurements from the TCCON station of Ascension Island. For this station, the values of the bias from July to September result from the interpolation process as no measurements were reported during this period (Fig. S1 of the Supplement).

In contrast to the situation in the tropics, the initial condition of the free run has a negative bias at northern mid-latitudes (region between 23 and 66 $^{\circ}$ N), which reaches almost 4 $ppm$ in absolute value at the latitude of Sodankylä (Finland, 67 $^{\circ}$ N) when compared to the TCCON ${XCO}_{2}$ . This value is the result of the smoothing process as we do not have data for that period (Fig. a). The negative bias at these mid-latitudes is nevertheless confirmed by the comparison with other stations, like Karlsruhe (Germany) and Park Falls (USA), where we have some data at the beginning of the year (Fig. b and c). The negative bias at northern mid-latitudes remains high during the whole year, with an absolute value generally greater than 1 $ppm$ at the end of spring, and in June and December. This can be explained by the fact that the model does not release enough ${CO}_{2}$ before and after the growing season, i.e. March to May and October to December, and by the fact that, in the model, the ${CO}_{2}$ sink associated with the growing season starts too early in the season .

The precision $π$ of the free run measured by the average scatter between the simulation and the TCCON data is 1.4 $ppm$ (Table ). Similarly to the bias, the scatter varies in time and space as highlighted by the Hovmöller diagram of the scatter (Fig. a). The scatter has its highest values of more than 1 $ppm$ at the northern mid-latitudes during May–June–July. This increase in the scatter is driven by the behaviour of the free run at Sodankylä. There, the simulation has a larger variability than the measurements. For example, at the end of June, the simulation presents a decrease of about 7 $ppm$ in 36 h, whereas the measurements show a decrease of about 4 $ppm$ (Fig. a). Elsewhere, there is also an increase in the scatter between May and July which is during the Northern Hemisphere growing season. This increase could be explained by the difficulty for CTESSEL to model the terrestrial biogenic carbon fluxes during the growing season, which leads to higher variability in the simulated atmospheric ${CO}_{2}$ .

Analysis vs. free run

To assess the impact of the assimilation of the MACC GOSAT BESD ${XCO}_{2}$ , we compared the evolution of ${XCO}_{2}$ from the analysis with ${XCO}_{2}$ from the free-run simulation. Figure presents the Hovmöller diagram (latitude vs. time) of this difference. It shows that the first region where the analysis impacts ${XCO}_{2}$ is the tropics. There, compared to the free run, the analysis continuously decreases ${XCO}_{2}$ by up to 1 $ppm$ in June and by more than 2 $ppm$ from September to December. The assimilation of the GOSAT data consequently causes an improvement as the free run has a positive bias in this region in autumn compared to the TCCON data.

The analysis also decreases ${XCO}_{2}$ over the southern extra tropics (region between 23 and 66 $^{\circ}$ S) when compared to the free run (Fig. ). The decrease extends to the southern high latitudes ( $\geq$ 66 $^{\circ}$ S) even when no GOSAT data were assimilated in this region. This decrease results mainly from the transport of ${CO}_{2}$ from the equatorial region and southern mid-latitudes towards southern high latitudes. Unfortunately, there are no independent ${XCO}_{2}$ data available at southern high latitudes to assess the merits of the analysis there.

Although some GOSAT data are assimilated in the northern mid-latitudes during the first months of the simulation, the analysis only starts to differ significantly from the free run from March onwards. In this region, north of 30 $^{\circ}$ N, the analysis has higher values of ${XCO}_{2}$ than the free run, with a difference of more than 2 $ppm$ during the northern summer. Again, the assimilation of the GOSAT data improves the simulated ${XCO}_{2}$ as the free run shows a strong negative bias there. Similar to the behaviour discussed for the southern high latitudes, the change in the ${CO}_{2}$ concentration at northern mid-latitudes is transported northward to higher latitudes. There is, nevertheless, a difference between the two hemispheres. For the Northern Hemisphere we have more data at high latitudes, especially during the summer, when the coverage of the GOSAT measurements extends up to 80 $^{\circ}$ N.

Analysis vs. TCCON data

When compared with the TCCON data, the GOSAT BESD ${XCO}_{2}$ analysis has an offset $δ$ of $-$ 0.34 $ppm$ and a mean absolute error $Δ$ of 0.57 $ppm$ (Table ). The offset is similar to that of the free run ( $-$ 0.36 $ppm$ ), but the mean absolute error is improved (1.08 $ppm$ for the free run). Moreover, the individual station bias is more constant in time for the analysis than for the free run. For example, the trend of the free-run bias is 2.08 $ppm {yr}^{- 1}$ for Lauder (New Zealand) (Table S1 of the Supplement), and it improves to 0.47 $ppm {yr}^{- 1}$ for the analysis (Table S2 and Fig. d).

By increasing ${XCO}_{2}$ in the northern mid-latitudes as discussed before, the analysis considerably reduces the bias. A residual seasonal cycle in the bias is still present, with values usually in the range of 0 to 3 $ppm$ (Fig. b). This could be explained by the fact that we correct the atmospheric state of ${CO}_{2}$ and not the ${CO}_{2}$ fluxes. During the seasons when the ${CO}_{2}$ fluxes are the main driver of the atmospheric ${CO}_{2}$ , the optimization of the atmospheric state only may not be enough.

The analysis has a more constant bias in time than the free run. It is also more accurate in space, with a station-to-station bias deviation $σ$ that is largely reduced compared to the free run, with a value of 0.61 $ppm$ against 1.27 $ppm$ (Table ). The assimilation of the MACC GOSAT BESD ${XCO}_{2}$ thus helps to significantly improve the accuracy of the model. The assimilation also helps improve the precision $π$ , with the mean scatter improved by 15 % and reduced to a value of 1.22 $ppm$ . Compared to the free run, the scatter of the analysis is reduced at most TCCON stations; it remains essentially unchanged at Sodankylä (Finland), Garmisch (Germany) and Wollongong (Australia). The Hovmöller diagram of the scatter shows that the main reduction is in the northern high latitudes in May (Fig. ). In particular, the analysis shows less spurious variability than the free run at Sodankylä (Fig. a).

Analysis vs. MACC GOSAT BESD data

The analysis is much more accurate and more precise than the free run when compared to the TCCON data. The analysis also fills the gaps in time and space of the MACC GOSAT BESD data. In this section, we evaluate the analysis against the MACC GOSAT BESD data once more using the TCCON data as a reference.

The MACC GOSAT BESD data were compared to the TCCON data using a geolocation criterion of 5 $^{\circ}$ in space and a time window of $\pm 2$ $h$ . Before computing the difference between each GOSAT–TCCON pair, following , we added a correction to the GOSAT-retrieved value in order to account for the use of different a priori ${CO}_{2}$ profiles in the two products. Moreover, we only kept the stations where more than 30 GOSAT–TCCON pairs were found in order to have more robust statistical results. This procedure removes Izaña (Spain), Ascension Island, Réunion Island (France) and Lauder from the list of the used TCCON stations in the comparison and reduces the number of stations to 12 (Table ).
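The pairing and station selection can be sketched as follows. Several details are assumptions made for illustration: the 5 $^{\circ}$ criterion is applied as a latitude-longitude box around the station, each GOSAT sounding is paired with the mean of the hourly TCCON values inside its $\pm 2$ $h$ window, and the a priori correction mentioned above is omitted. The function names and the sample soundings are hypothetical; the site coordinates are those of Lamont in the station list.

```python
import numpy as np

MIN_PAIRS = 30  # a station is kept only with more than 30 pairs

def pair_with_station(g_lat, g_lon, g_time_h, site_lat, site_lon,
                      t_time_h, t_xco2, max_deg=5.0, max_hours=2.0):
    """Return a list of (GOSAT index, mean TCCON XCO2 in the time window)."""
    g_lat = np.asarray(g_lat, dtype=float)
    g_lon = np.asarray(g_lon, dtype=float)
    g_time_h = np.asarray(g_time_h, dtype=float)
    t_time_h = np.asarray(t_time_h, dtype=float)
    t_xco2 = np.asarray(t_xco2, dtype=float)

    # Longitude difference wrapped to [0, 180] degrees.
    dlon = np.abs((g_lon - site_lon + 180.0) % 360.0 - 180.0)
    near = (np.abs(g_lat - site_lat) <= max_deg) & (dlon <= max_deg)

    pairs = []
    for i in np.flatnonzero(near):
        in_window = np.abs(t_time_h - g_time_h[i]) <= max_hours
        if in_window.any():
            pairs.append((int(i), float(t_xco2[in_window].mean())))
    return pairs

def keep_station(pairs, min_pairs=MIN_PAIRS):
    """Apply the 'more than 30 pairs' robustness criterion."""
    return len(pairs) > min_pairs

# Hypothetical soundings (hours since an arbitrary reference time).
site_lat, site_lon = 36.60, -97.49
g_lat = [37.0, 45.0, 35.0]
g_lon = [-98.0, -97.0, -95.0]
g_time_h = [10.0, 10.0, 30.0]
t_time_h = [9.0, 10.0, 11.0, 12.0]
t_xco2 = [395.0, 395.2, 395.4, 395.6]

pairs = pair_with_station(g_lat, g_lon, g_time_h, site_lat, site_lon,
                          t_time_h, t_xco2)
print(pairs, keep_station(pairs))
```

In this toy case only the first sounding forms a pair: the second is too far from the station and the third has no TCCON value within its time window, so the station would not pass the selection.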

Statistics of the ${XCO}_{2}$ differences between the MACC GOSAT BESD data set and the average hourly TCCON data (left block, GOSAT–TCCON) or the analysis and the average hourly TCCON data (right block, model–TCCON): bias ( $δ_{k}$ , in $ppm$ ), scatter ( $σ_{k}$ , in $ppm$ ) and correlation coefficient ( $r_{k}$ ). The analysis has been sampled similarly to the GOSAT data set in time and space. Also shown are the mean, the mean absolute error (MAE) and the deviation of the stations bias, the mean scatter (all in $ppm$ ) and the mean $r$ (last three rows). The second column ( $N$ ) is the number of data points used for computing the statistics.

		MACC GOSAT data set			Analysis
Site	$N$	Bias	Scatter	$r$	Bias	Scatter	$r$
Sodankylä	90	$-$ 0.26	4.50	0.39	0.24	1.41	0.92
Białystok	58	$-$ 0.28	3.45	0.32	1.06	1.99	0.17
Bremen	41	1.19	2.34	0.53	0.54	0.86	0.81
Karlsruhe	91	1.45	2.74	0.52	0.89	0.74	0.88
Orléans	52	0.20	2.44	0.34	1.29	0.57	0.84
Garmisch	76	1.64	3.10	0.55	1.17	1.06	0.77
Park Falls	63	1.50	3.22	0.71	$-$ 0.08	1.03	0.95
Four Corners	102	$-$ 0.00	3.79	0.64	0.65	0.81	0.89
Lamont	340	$-$ 1.01	4.05	0.57	0.05	1.01	0.91
Saga	61	0.40	2.95	0.76	0.14	0.88	0.90
Darwin	234	$-$ 1.27	3.37	0.42	$-$ 0.11	0.81	0.84
Wollongong	221	$-$ 3.03	3.86	0.31	$-$ 1.54	1.07	0.74
Mean	12	0.04	3.32	0.50	0.36	1.02	0.80
MAE	12	1.02	–	–	0.65	–	–
Deviation	12	1.31	–	–	0.74	–	–

Hovmöller diagram (latitude vs. time) of the difference in ppm (negative/positive in blue/red) between ${XCO}_{2}$ from the analysis and from the free-run simulation, from 1 January to 31 December 2013. The horizontal dotted lines represent the latitude of the northernmost and the southernmost TCCON station, respectively. The grey shaded areas are where GOSAT does not provide observations.

[Figure omitted. See PDF]

For each GOSAT–TCCON pair, we extracted the ${CO}_{2}$ profile from the analysis at the same location and time as the GOSAT measurement before computing the difference between the model and the TCCON data. In this way, we have a fair comparison between the analysis and the MACC GOSAT BESD data with respect to the TCCON data.

The resulting subset of the analysis minus TCCON differences has a different offset than the full data set but a similar mean absolute error, station-to-station bias deviation and precision (Tables and ). The difference in the offset is mainly due to a difference in the sampling between the subset and the full data set over the Northern Hemisphere. Due to few or no pairs occurring in spring for the subset, the sampling misses the negative bias of the analysis there. Missing the negative bias of the analysis results in an increased offset. In that respect, the mean absolute error is less sensitive to the used data set (subset or full data set).

The analysis has a lower mean absolute error $Δ$ than the one from the MACC GOSAT BESD data (0.65 $ppm$ vs. 1 $ppm$ , Table ), a station-to-station bias deviation $σ$ almost half of the one from GOSAT data (0.7 $ppm$ vs. 1.3 $ppm$ ) and has an improved precision $π$ (1 $ppm$ vs. 3.3 $ppm$ ). The mean correlation coefficient is also higher in the analysis than in the satellite data with a value of 0.8 compared to 0.5. The statistics of the MACC GOSAT BESD data found here are different than those of , who used a more recent version of the GOSAT BESD product. With the successive improvements in the BESD algorithm, the latest version has a station-to-station bias deviation of $\sim 0.4$ $ppm$ and a precision of $\sim 2$ $ppm$ .

The better precision (lower value of $π$ ) and the lower mean absolute error $Δ$ and station-to-station bias deviation $σ$ of the analysis compared to the MACC GOSAT BESD data set show that the analysis is capable of smoothing the scatter of the satellite data. Moreover, the analysis is able to fill the gaps of the satellite data in time and space.
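The summary rows of the table can be recomputed from the per-station biases. The sketch below does this for the tabulated values; the helper name `summarize` is hypothetical, and the use of the population standard deviation (`ddof=0`) for the station-to-station deviation is an assumption, chosen because it agrees with the tabulated values after rounding.

```python
import numpy as np

# Per-station biases (ppm) copied from the table, Sodankylä to Wollongong.
bias_gosat = np.array([-0.26, -0.28, 1.19, 1.45, 0.20, 1.64,
                       1.50, -0.00, -1.01, 0.40, -1.27, -3.03])
bias_analysis = np.array([0.24, 1.06, 0.54, 0.89, 1.29, 1.17,
                          -0.08, 0.65, 0.05, 0.14, -0.11, -1.54])

def summarize(bias):
    """Mean offset, mean absolute error and station-to-station deviation."""
    return {
        "mean": float(np.mean(bias)),
        "mae": float(np.mean(np.abs(bias))),
        "deviation": float(np.std(bias, ddof=0)),  # assumption: population std
    }

stats_gosat = summarize(bias_gosat)
stats_analysis = summarize(bias_analysis)
print(stats_gosat)
print(stats_analysis)
```

The mean absolute error is insensitive to the sign of the individual station biases, which is why it changes less than the mean offset when the sampling changes.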

Case study of a cold front over Park Falls

The ${CO}_{2}$ concentration can be strongly affected by frontal systems. As an illustration, such a situation occurred at the end of May 2013, close to the TCCON station of Park Falls, Wisconsin, USA, when a cold front came from the northwest. On 31 May, the ${XCO}_{2}$ dropped from 398.62 $ppm$ at 08:15 $LT$ (local time) to 395.97 $ppm$ at 12:53 $LT$ (Fig. , top panel). This sudden decrease of 2.65 $ppm$ in less than 5 $h$ occurred after the arrival of the cold front, which was associated with a decrease in the surface pressure and a decrease in the temperature at 500 $hPa$ (Fig. , lower panel).

The free run is able to capture the sudden decrease in ${XCO}_{2}$ , highlighting the skill of the model for such a situation (Fig. , upper panel). The flow during this period is mainly a descent of cold air from Canada towards the midwestern and eastern US. This cold air mass is depleted in ${CO}_{2}$ relative to the background (Figs. e and f). When it moves towards Park Falls, it results in decreasing ${XCO}_{2}$ as observed and simulated, but the decrease in the free run is too strong by 2 to 3 $ppm$ compared to the measurements.

Situation over Park Falls (USA) between 30 May and 2 June. Top panel: evolution of ${XCO}_{2}$ (in $ppm$ ) from hourly averaged TCCON data (black dots), the free run (blue line and dots) and the analysis (red line and dots). The dots are the values of the model in the observation space. Lower panel: evolution of the mean sea level pressure (in $hPa$ , black line) and the temperature at 500 $hPa$ (in $K$ , magenta line). The vertical dotted lines represent 31 May, at 00:00 $UTC$ and at 12:00 $UTC$ , and 1 June, at 00:00 $UTC$ .

[Figure omitted. See PDF]

Situation around Park Falls (black triangle), Wisconsin, USA, end of May 2013. (a) Average increment in terms of ${XCO}_{2}$ (in ppm, negative/positive in blue/red) on 30 May 2013 (contours) and location of the GOSAT measurements during this day (black rectangles). (b, c, d) ${XCO}_{2}$ (in ppm) on 31 May at 00:00 $UTC$ , at 12:00 $UTC$ and on 1 June at 00:00 $UTC$ , respectively, from the analysis. (e, f) ${XCO}_{2}$ (in ppm) on 31 May at 12:00 $UTC$ and on 1 June at 00:00 $UTC$ from the free run (below/above background value in blue/red). For (b) to (f) the dark contours are the values of the geopotential at 500 $hPa$ .

[Figure omitted. See PDF]

We investigated whether the assimilation of the GOSAT data helps improve the simulated evolution of the ${CO}_{2}$ concentration for such situations even if the number of BESD GOSAT data is limited in the vicinity of a frontal system due to the strict cloud filtering. Frontal systems are associated with clouds formed when moist air between the cold and warm fronts is lifted.

On 30 May, we have a few GOSAT measurements over the north and northeast region of North America (Fig. a). These measurements have the effect of increasing the ${XCO}_{2}$ in this region (Fig. b–d). The cold air mass is then richer in ${CO}_{2}$ in the analysis compared to the free run, and when it moves towards Park Falls, the decrease is weaker and closer to the observed decrease. The assimilation of the GOSAT data helps improve the simulation by correcting the large-scale structure upstream and by improving the large-scale atmospheric ${XCO}_{2}$ horizontal gradient.

The ${XCO}_{2}$ decrease continues the next day, on 1 June, in both simulations as the cold front continues its descent. Unfortunately, likely due to the presence of clouds, no TCCON measurements are available during this period to corroborate the simulated ${XCO}_{2}$ decrease.

Anomaly correlation coefficient (ACC) of the forecast compared to its own analysis as a function of the forecast lead time and for each month: (a) global ACC, (b) ACC for the Northern Hemisphere (20–90 $^{\circ}$ N), (c) ACC for the tropics (20 $^{\circ}$ S–20 $^{\circ}$ N), and (d) ACC for the Southern Hemisphere (90–20 $^{\circ}$ S). Each month is represented by a different colour (see inset legends).

[Figure omitted. See PDF]

Forecast based on the analysis

Within CAMS, we receive the GOSAT BESD data for a given day with a delay of 5 days behind real time. The analysis for this day is run as soon as the data are received. A 10-day forecast is then run based on the resulting analysis.

In this section, we aim to evaluate the forecast as a function of its lead time by comparing the forecast to the analysis valid for the same time. This comparison informs us about how long the information provided by the analysis remains in the forecast. Assuming perfect transport and perfect surface fluxes, the analysis and the forecast (valid for the same time) should be similar given that the analysis accurately corrects the atmospheric concentration of ${CO}_{2}$ . In practice, the differences observed between the analysis and the forecast could come from either the transport, the surface fluxes or the analysis.

To compare a forecast with the analysis valid for the same time, we computed the anomaly correlation coefficient (ACC) for ${XCO}_{2}$ (see Appendix for more details). The ACC can be regarded as a skill score relative to the climatology: the higher the ACC, the better the forecast. In the framework of NWP, an ACC reaching 50 $%$ corresponds to forecasts for which the error is the same as for a forecast based on a climatological average. An ACC of about 80 $%$ indicates valuable skill in forecasting large-scale synoptic patterns.

We computed the ACC for each month individually as we know that the surface fluxes, drivers of the difference between the forecast and the analysis, have a strong seasonal cycle. We also computed it for different domains (globe, tropics and mid- to high latitudes) and for several forecast lead times, from 12 $h$ up to 10 days. We found that the ACC is globally more than 90 $%$ for day 3 and almost always more than 85 $%$ for day 5 for each single month (Fig. a). This means that the forecast for today based on the analysis of 5 days ago shows the same large-scale synoptic ${XCO}_{2}$ patterns as the analysis. The information of the analysis therefore lasts long enough in the forecast to provide a good quality 5-day forecast for today (compared to the analysis). The information lasts longer in the tropics than in the Northern Hemisphere and slightly longer in the Northern Hemisphere than in the Southern Hemisphere (Fig. b to d). This difference between the two hemispheres may reflect the fact that the ${CO}_{2}$ variability is much weaker in the Southern Hemisphere.

For forecasts longer than 5 days, globally, there are two particular months for which the ACC decreases faster than the others, i.e. July and December. For example, for these two months the ACC at day 5 is similar to the ACC at day 10 for October. This means that for July and December, the medium-range ${XCO}_{2}$ forecast (between 5 and 10 days) should be used more carefully. For July, the drop in skill occurs mainly over the Northern Hemisphere. The main reason is that the ${CO}_{2}$ fluxes are an even more important driver of the ${CO}_{2}$ concentration than the initial ${CO}_{2}$ concentration for this month. To better understand the impact of the surface fluxes, let us assume that in July we have too little release or, similarly, too much uptake of ${CO}_{2}$ in the atmosphere in the model over the Northern Hemisphere (as confirmed by Fig. a). This induces a negative bias of the ${CO}_{2}$ surface fluxes in the model. In the meantime, the analysis increases the ${CO}_{2}$ concentration helped by the GOSAT BESD data (Fig. ). However, the next 12 h short-term forecast (used as the background for the next analysis) will not increase the ${CO}_{2}$ concentration enough due to the negative bias of the ${CO}_{2}$ fluxes. This opposition between the analysis and the short-term forecast explains the reduction in skill during the periods when the surface fluxes are the most important driver of the ${CO}_{2}$ concentration in the atmosphere.

The global drop in skill for December is not directly related to a particular region, unlike the drop for July. December is nonetheless the second worst month for the tropics (after January) and the third worst for the Northern Hemisphere (together with September). Over the tropics during the winter, the reduction in skill is due to the opposite of the effect seen in July over the Northern Hemisphere: the ${CO}_{2}$ fluxes are important and there is a positive bias in the fluxes (too much release or too little uptake of ${CO}_{2}$ in the atmosphere) in the model. For these situations when the ${CO}_{2}$ fluxes are the main driver of the atmospheric ${CO}_{2}$ , the only solution to improve the skill would be to optimize the ${CO}_{2}$ fluxes together with the ${CO}_{2}$ initial conditions.

Conclusions

The Copernicus Atmosphere Monitoring Service (CAMS) greenhouse gases data assimilation within the numerical weather prediction (NWP) framework of the Integrated Forecasting System (IFS) is designed to correct the atmospheric concentration of ${CO}_{2}$ instead of the surface fluxes in order to constrain the atmospheric ${CO}_{2}$ . This requires the use of a short assimilation window so as to neglect the model errors of the short-term forecast (lasting the length of the assimilation window). In the case of atmospheric ${CO}_{2}$ , model errors are related to potentially inaccurate surface fluxes or transport.

This article demonstrates the benefit of the assimilation of ${XCO}_{2}$ data derived from the Greenhouse gases Observing Satellite (GOSAT) by intermediate versions of the Bremen Optimal Estimation DOAS (BESD) algorithm of the University of Bremen (UoB). The assimilation of the GOSAT BESD ${XCO}_{2}$ provides a ${CO}_{2}$ analysis that was compared to a free-run forecast where the ${CO}_{2}$ concentration is not constrained by any ${CO}_{2}$ observation. The comparison was 1 year long (year 2013) and both simulations (analysis and free run) were evaluated against measurements from the Total Carbon Column Observing Network (TCCON). We showed that the free run has a negative bias at northern mid-latitudes and a large positive bias in the tropical region with strong seasonal variations in both regions. These results are consistent with the biases documented by and mainly associated with biogenic fluxes.

The analysis significantly reduces these biases without completely removing them, with a remaining mean offset of $-$ 0.34 $ppm$ and a mean absolute error of 0.57 $ppm$ compared to the TCCON data. However, the accuracy estimated with the station-to-station bias deviation is 0.61 $ppm$ . This represents a large improvement compared to the free run, for which the accuracy is 1.27 $ppm$ . The precision of the analysis estimated with the mean scatter is 1.22 $ppm$ , slightly better than for the free run with a value of 1.43 $ppm$ .

The analysis produced in this paper was compared to the assimilated MACC GOSAT BESD data using TCCON data as a reference. This comparison showed that the analysis has a lower station-to-station bias deviation than the assimilated data (0.7 $ppm$ compared to 1.3 $ppm$ ). The precision is much better for the analysis, with a scatter of 1 $ppm$ , while the assimilated data have a scatter of 3.3 $ppm$ . The precision of the analysis is also better than the documented precision of other GOSAT ${XCO}_{2}$ products. The precision of the NIES product extracted from is 1.8 $ppm$ . The precision of the University of Leicester product and of the SRON Netherlands Institute for Space Research product is respectively 2.5 and 2.37 $ppm$ . The ${CO}_{2}$ analysis is consequently an alternative to the standard ${XCO}_{2}$ GOSAT products as it provides a lower or similar station-to-station bias deviation and a better-precision ${XCO}_{2}$ product compared to TCCON. Moreover, it has a uniform spatio-temporal resolution.

The pre-operational CAMS ${CO}_{2}$ analysis is similar to the analysis presented in this paper, having nevertheless a higher horizontal resolution (TL511 on a reduced Gaussian grid, $\sim 40 km \times 40 km$ ), and a higher vertical resolution with 137 vertical levels. It currently assimilates the most recent version of the GOSAT BESD data presented by in near-real time. These data have an improved bias deviation ( $\sim$ 0.4 $ppm$ ) and an improved precision ( $\sim$ 2 $ppm$ ) compared to those used in this study. The near-real-time CAMS ${CO}_{2}$ analysis should therefore have a better station-to-station bias deviation and precision than the analysis presented in this paper.

We corrected the atmospheric concentration by only constraining the atmospheric concentration and not the surface fluxes. When and where the surface flux is a significant driver of the atmospheric concentration and if the assimilated data are not good enough or not numerous enough (in time and space), then constraining only atmospheric ${CO}_{2}$ does not compensate for the error in the surface flux. The next step is to further improve the carbon module CTESSEL in order to reduce the bias of the model. Another long-term solution would be to constrain the surface flux at the same time as the concentration.

One strength of the ${CO}_{2}$ model used in this study is its ability to represent ${CO}_{2}$ variations associated with synoptic weather systems . By correcting the large-scale ${XCO}_{2}$ patterns and removing part of the model bias, we showed with a case study that the analysis is able to better represent the ${CO}_{2}$ variations associated with these situations. The variations in the atmospheric reservoir of ${CO}_{2}$ are the result of changes in the surface fluxes to and from the atmosphere. If the characteristics of the analysis are found to be satisfactory in terms of bias and precision, the analysis could be included into a flux inversion system to infer surface fluxes.

The horizontal resolution of this study is half the horizontal resolution of the pre-operational analysis and the vertical resolution of the pre-operational analysis is also higher. One should expect an even better representation of the ${CO}_{2}$ variability in the pre-operational analysis. In the future, the horizontal resolution could be increased even further toward the ECMWF operational resolution of about $16 km \times 16 km$ .

The quality of the analysis is considered to be sufficient to assess the quality of the forecast as a function of its lead time. We showed that the forecast for day 3 and day 5, which will be the valid range for today's forecast, has an anomaly correlation coefficient of 90 and 85 $%$ , respectively. This means that we are providing a ${CO}_{2}$ forecast with accurate synoptic features for today. With a good representation of the variability and a bias mostly under 1 $ppm$ , the CAMS atmospheric ${CO}_{2}$ promises to become a useful product, for example, for planning a measurement campaign. It could also be used as the a priori in the satellite or TCCON retrieval algorithms or be used to evaluate the retrieval products from the Orbiting Carbon Observatory-2 (OCO-2, oco.jpl.nasa.gov).

Comparing the model against TCCON

For the comparison with the TCCON data, one has to account for the a priori information used in the retrieval that links ${\hat{c}}^{o}$ , the TCCON-retrieved ${XCO}_{2}$ to $x^{t}$ , the true (unknown) ${CO}_{2}$ profile , ${\hat{c}}^{o} = c^{b} + a^{T} (x^{t} - x^{b}) + ε,$ where $x^{b}$ is an a priori profile of ${CO}_{2}$ , $a$ is a vector resulting from the product of the averaging kernel matrix with a dry-pressure weighting function vector (for the vertical integration), $c^{b}$ is the column-averaged mixing ratio computed from $x^{b}$ , and $ε$ is the error in the retrieved column-averaged mixing ratio. This error includes the random and systematic errors in the measured signal and in the retrieval algorithm.

To compare the model with the TCCON-retrieved value, we used the same a priori information, so that the model profile $x$ is converted to a column-averaged mixing ratio $\hat{c}$ by $\hat{c} = c^{b} + a^{T} (x - x^{b}) .$

The comparison between the simulation and TCCON occurs in the observation space with the difference between the model column-averaged mixing ratio $\hat{c}$ of Eq. () and the TCCON column-averaged mixing ratio ${\hat{c}}^{o}$ of Eq. (), $\hat{c} - {\hat{c}}^{o} = a^{T} (x - x^{t}) - ε .$

Let us define $η = a^{T} (x - x^{t})$ as the model error in terms of the column-averaged mixing ratio. It accounts for numerous errors, for example the errors directly linked to the model processes like the transport, the errors in the surface fluxes, the representativity error and the error due to the assimilation of the GOSAT ${XCO}_{2}$ data for the analysis. The difference between the smoothed model column-averaged mixing ratio $\hat{c}$ and the TCCON column-averaged mixing ratio ${\hat{c}}^{o}$ is, therefore, the model error $η$ minus the error in the retrieved column-averaged mixing ratio $ε$ .

To compute the model column-averaged mixing ratio $\hat{c}$ of Eq. () equivalent to each TCCON measurement, we extracted the two model profiles that are closest to the measurement time and at the nearest grid point to the measurement. The two profiles are then interpolated in time in order to obtain the model profile at the same time as the measurement. Finally, we computed the column-averaged mixing ratio according to Eq. ().
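The procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `model_xco2` is hypothetical, the interpolation in time is taken to be linear, and all profiles are assumed to be already available on a common vertical grid (the vertical interpolation is not shown).

```python
import numpy as np

def model_xco2(x0, x1, t0, t1, t_obs, a, x_prior, c_prior):
    """Model column-averaged mixing ratio in the TCCON observation space.

    x0, x1  : model CO2 profiles (ppm) at the nearest grid point, valid at t0 and t1
    a       : averaging kernel already multiplied by the dry-pressure weights
    x_prior : TCCON a priori profile on the same levels
    c_prior : column-averaged mixing ratio of the a priori profile
    Assumptions: linear interpolation in time, common vertical grid.
    """
    w = (t_obs - t0) / (t1 - t0)  # linear weight in time
    x = (1.0 - w) * np.asarray(x0, dtype=float) + w * np.asarray(x1, dtype=float)
    # c_hat = c_b + a^T (x - x_b)
    return c_prior + float(np.dot(a, x - np.asarray(x_prior, dtype=float)))
```

Using the same $a$ , $x^{b}$ and $c^{b}$ as the retrieval is what makes the a priori terms cancel in the model-minus-TCCON difference.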

Smoothing the statistics against TCCON

In order to have a more global view of the bias and the scatter of a simulation against the data from the TCCON network, we have developed and used a two-step algorithm. The first step consists in computing the statistics (bias and the standard deviation) for each week of 2013 and for each TCCON station when the data are available. The weekly statistics are then interpolated in time using a function described in the following section (Sect. ). This allows one to fill in the gaps in time when no data are available. We therefore have a value for the bias at each station and for each week. For the second step, we compute a quadratic function of latitude that best fits the interpolated biases for each week (Sect. ).

Time smoothing

For each TCCON station $k$ and for each week $w^{l}$ for $l \in [1, 52]$ , we compute the mean difference $δ_{k}^{l}$ and the standard deviation of the difference $σ_{k}^{l}$ between every TCCON observation during this week and the model equivalent value. The statistics are computed only when more than 10 TCCON measurements are available during the week. The averaged difference (or bias) is then interpolated in time $t$ with the function ${\tilde{b}}_{k} (t)$ that combines a linear growth and a harmonic component, ${\tilde{b}}_{k} (t) = a_{k} t + b_{k} + α_{k} \sin \left( \frac{t}{τ_{1}} + φ_{k} \right) + β_{k} \sin \left( \frac{t}{τ_{2}} + φ_{k} \right) .$ The parameters $a_{k}$ , $b_{k}$ , $α_{k}$ , $β_{k}$ and $φ_{k}$ of the function ${\tilde{b}}_{k} (t)$ are obtained by an optimization procedure that minimizes the distance between ${\tilde{b}}_{k} (t)$ and the series of $δ_{k}^{l}$ for $l \in [1, 52]$ . $τ_{1}$ is chosen to be 6 months and $τ_{2}$ 3 months. The form of the function of Eq. () thus gives a linearly growing bias and allows seasonal variations. A similar function is used for the standard deviation.
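One possible way to fit the function above is sketched below. The optimization procedure actually used is not specified here, so this example swaps in a simple alternative: for a fixed phase $φ_{k}$ the function is linear in the other four parameters, so the phase is scanned on a grid and the remaining parameters are obtained by linear least squares. The names `fit_time_smoothing` and `smoothed_bias` are hypothetical, and $t$ , $τ_{1}$ and $τ_{2}$ are assumed to be expressed in the same unit (for example weeks, with 26 and 13 weeks for 6 and 3 months).

```python
import numpy as np

def fit_time_smoothing(t, delta, tau1, tau2, n_phase=720):
    """Fit b(t) = a*t + b + alpha*sin(t/tau1 + phi) + beta*sin(t/tau2 + phi).

    Only the weeks with available statistics should be passed in (t, delta).
    Assumption: a grid scan over phi replaces the unspecified optimizer.
    """
    t = np.asarray(t, dtype=float)
    delta = np.asarray(delta, dtype=float)
    best = None
    for phi in np.linspace(0.0, 2.0 * np.pi, n_phase, endpoint=False):
        # For a fixed phase, the model is linear in (a, b, alpha, beta).
        design = np.column_stack([t, np.ones_like(t),
                                  np.sin(t / tau1 + phi),
                                  np.sin(t / tau2 + phi)])
        coef, *_ = np.linalg.lstsq(design, delta, rcond=None)
        cost = float(np.sum((design @ coef - delta) ** 2))
        if best is None or cost < best[0]:
            best = (cost, phi, coef)
    _, phi, (a, b, alpha, beta) = best
    return {"a": a, "b": b, "alpha": alpha, "beta": beta, "phi": phi}

def smoothed_bias(t, p, tau1, tau2):
    """Evaluate the fitted function at times t (fills the gaps in time)."""
    t = np.asarray(t, dtype=float)
    return (p["a"] * t + p["b"]
            + p["alpha"] * np.sin(t / tau1 + p["phi"])
            + p["beta"] * np.sin(t / tau2 + p["phi"]))
```

Evaluating `smoothed_bias` on all 52 weeks then provides a bias value for the weeks without data. Fixing one coefficient to zero, as done for some stations, would amount to dropping the corresponding column of the design matrix.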

Spatial smoothing

The time smoothing allows us to fill in the gaps in the time series of the bias for each station, when for a given week we do not have any measurement to compare with. Following , we then compute for each week $w^{l}$ the best fit of the interpolated biases with a quadratic function of latitude ${\hat{b}}^{l}$ , ${\hat{b}}^{l} (ϕ) = a^{l} ϕ^{2} + b^{l} ϕ + c^{l},$ where $ϕ$ is the sine of the latitude. $a^{l}$ , $b^{l}$ and $c^{l}$ are obtained by an optimization procedure that minimizes the distance between ${\hat{b}}^{l}$ and the weekly interpolated biases $δ_{k}^{l}$ for $k \in [1, N]$ . A similar function is used for the standard deviation.
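The weekly latitudinal fit can be sketched as an ordinary least-squares polynomial fit in $ϕ$ . The function names below are hypothetical, and the least-squares criterion is an assumption about what "minimizes the distance" means.

```python
import numpy as np

def fit_latitude_quadratic(lat_deg, bias):
    """Least-squares fit of bias = a*phi**2 + b*phi + c with phi = sin(latitude)."""
    phi = np.sin(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    # np.polyfit returns the coefficients from the highest degree down: a, b, c.
    a, b, c = np.polyfit(phi, np.asarray(bias, dtype=float), deg=2)
    return a, b, c

def evaluate_latitude_quadratic(lat_deg, coeffs):
    """Evaluate the fitted quadratic at arbitrary latitudes (in degrees)."""
    phi = np.sin(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    return np.polyval(coeffs, phi)
```

Applying `fit_latitude_quadratic` once per week to the time-interpolated station biases, and evaluating the result on a regular latitude axis, gives the latitude vs. time field shown in a Hovmöller diagram.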

Discussion

For some stations, the availability of the weekly differences is not uniform in time and the time smoothing of Eq. () provides spurious values. We solved this issue by fixing the coefficient $α_{k}$ to zero (see Table S1).

With a root mean square error (RMSE) mostly under 0.7 $ppm$ and a correlation mostly over 0.8, the smoothed bias matches well with the weekly bias (Table S1). The Hovmöller diagram (Fig. ) can, thus, be considered as an accurate representation of the overall bias.

The fit between the time series of the weekly scatter and the regression is not as good as for the bias. The correlation coefficient is mostly between 0.5 and 0.7 (Table S1).

Anomaly correlation coefficient

The anomaly correlation coefficient (ACC) between the forecast $f$ and the analysis $a$ is computed using the climatology $c$ by $ACC = \frac{\overline{(f - c) (a - c)}}{\sqrt{\overline{(f - c)^{2}} \overline{(a - c)^{2}}}},$ where the overline is the spatial and temporal average. For example, for the forecast range 24 h, we take the ${XCO}_{2}$ fields from all the 24 h forecasts for a given month, all the analyses valid for the same time, and a fixed climatology for this month.
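A minimal sketch of this computation is given below. The function name `anomaly_correlation` and the optional `weights` argument are assumptions: the text does not state whether the spatial average is area weighted, so the example defaults to an unweighted average and lets the caller supply weights (for instance the cosine of the latitude).

```python
import numpy as np

def anomaly_correlation(forecast, analysis, climatology, weights=None):
    """ACC between forecast and analysis fields relative to a climatology.

    forecast, analysis : arrays of shape (n_times, n_lat, n_lon)
    climatology        : array broadcastable to that shape
    weights            : optional weights broadcastable to the fields
                         (assumption: weighting is not specified in the text)
    """
    fa = np.asarray(forecast, dtype=float) - climatology   # forecast anomaly
    aa = np.asarray(analysis, dtype=float) - climatology   # analysis anomaly
    if weights is None:
        w = np.ones_like(fa)
    else:
        w = np.broadcast_to(np.asarray(weights, dtype=float), fa.shape)

    def mean(field):
        # Weighted average over time and space (the overline in the formula).
        return np.sum(w * field) / np.sum(w)

    return mean(fa * aa) / np.sqrt(mean(fa ** 2) * mean(aa ** 2))
```

For a given lead time and month, `forecast` would hold all forecasts of that range, `analysis` the analyses valid at the same times, and `climatology` the fixed field for that month.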

The climatology is based on a free-run simulation using the optimized ${CO}_{2}$ surface fluxes from , which simulated the years from 2003 to 2012. For each month, we compute the average over the 10 years of the simulation, rescaling the mean so that it is the same as for the analysis. This procedure avoids the issue of the increase in ${CO}_{2}$ over time. The two-dimensional climatology field for ${XCO}_{2}$ for the month $m$ is $c (m) = \frac{1}{10} \sum_{y = 2003}^{2012} \frac{1}{n (y, m)} \sum_{d = 1}^{n (y, m)} \left[ Σ (y, m, d) - \overline{Σ} (y, m, d) \right] + {\overline{Σ}}_{an} (m),$ where $y$ is the year, $n (y, m)$ is the number of days for the year $y$ and the month $m$ , $d$ is an index for the day, $Σ (y, m, d)$ is the ${XCO}_{2}$ field from the simulation for the year $y$ , the month $m$ and the day $d$ , $\overline{Σ}$ is the spatial average of $Σ$ , and ${\overline{Σ}}_{an} (m)$ is the spatial and temporal average of the ${XCO}_{2}$ fields from the analysis for the month $m$ (and the year 2013).
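The climatology formula above can be illustrated with a short sketch. The function name `monthly_climatology` and the input layout are hypothetical, and the spatial mean is taken without area weighting for brevity, which is an assumption.

```python
import numpy as np

def monthly_climatology(daily_fields_by_year, analysis_monthly_mean):
    """Climatology field for one calendar month.

    daily_fields_by_year  : list with one array per year, each of shape
                            (n_days, n_lat, n_lon), holding daily XCO2 fields
    analysis_monthly_mean : scalar, spatial and temporal mean of the analysis
    Assumption: unweighted spatial mean (no area weighting).
    """
    yearly = []
    for fields in daily_fields_by_year:
        fields = np.asarray(fields, dtype=float)
        # Remove the spatial mean of each daily field: this removes the CO2 trend.
        spatial_mean = fields.mean(axis=(1, 2), keepdims=True)
        # Average the daily anomalies over the days of the month.
        yearly.append((fields - spatial_mean).mean(axis=0))
    # Average over the years and add back the mean level of the analysis.
    return np.mean(yearly, axis=0) + analysis_monthly_mean
```

By construction, the spatial mean of the returned field equals `analysis_monthly_mean`, so only the spatial pattern comes from the 2003 to 2012 simulation.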

The Supplement related to this article is available online at doi:10.5194/acp-16-1653-2016-supplement.

S. Massart designed and carried out the experiments with the help of A. Agustí-Panareda and advice from F. Chevallier, J. Heymann and M. Buchwitz. J. Heymann, M. Reuter, M. Hilker, M. Buchwitz and J. P. Burrows were responsible for the design and operation of the BESD GOSAT ${XCO}_{2}$ retrieval algorithm. S. Massart prepared the manuscript with contributions from A. Agustí-Panareda, J. Heymann, M. Buchwitz, F. Chevallier, M. Reuter, M. Hilker, J.P. Burrows, D. G. Feist, and F. Hase. N. M. Deutscher and R. Sussmann contributed to the ACP version of the paper. F. Desmet is the co-investigator of the La Réunion TCCON station. N. M. Deutscher and C. Petri are responsible for the Białystok, Bremen and Orléans TCCON data. M. Dubey is the PI of Four Corners TCCON station. D. G. Feist is the PI of the Ascension TCCON station. D. W. T. Griffith and V. Velazco are the PIs of Darwin and Wollongong stations. F. Hase is the PI of the Karlsruhe TCCON station. R. Kivi is the PI of the Sodankylä TCCON station. M. Schneider is the PI of Izaña TCCON station. R. Sussmann is the PI of the Garmisch TCCON station.

Acknowledgements

This study was funded by the European Commission under the European Union's Horizon 2020 programme. The development of the GOSAT BESD algorithm received funding from the European Space Agency (ESA) Greenhouse Gases Climate Change Initiative (GHG-CCI). TCCON data were obtained from the TCCON Data Archive, hosted by the Carbon Dioxide Information Analysis Center (CDIAC) – http://tccon.ornl.gov/. Garmisch work was funded in part via the ESA GHG-CCI project. Four Corners TCCON was funded by LANL's LDRD programme. Darwin and Wollongong TCCON measurements are funded by NASA grants NAG5-12247 and NNG05-GD07G and the Australian Research Council grants DP140101552, DP110103118, DP0879468, LE0668470 and LP0562346. We are grateful to the DOE ARM programme for technical support in Darwin, and Clare Murphy, Nicholas Jones and others for support in Wollongong. TCCON measurements in Białystok and Orléans are supported by ICOS-INWIRE, InGOS and the Senate of Bremen. N. Deutscher is supported by an ARC-DECRA fellowship, DE140100178. The authors are grateful to Marijana Crepulja for the acquisition of the BESD GOSAT data at ECMWF and the preparation of the data for the assimilation. The authors would like to acknowledge Paul Wennberg, PI of the Lamont and Park Falls TCCON stations. Finally, we would like to express our great appreciation to William Lahoz, editor of this paper, for his useful comments during the revision process. Edited by: W. Lahoz

© 2016. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

This study presents results from the European Centre for Medium-Range Weather Forecasts (ECMWF) carbon dioxide ( ${CO}_{2}$ ) analysis system where the atmospheric ${CO}_{2}$ is controlled through the assimilation of column-averaged dry-air mole fractions of ${CO}_{2}$ ( ${XCO}_{2}$ ) from the Greenhouse gases Observing Satellite (GOSAT). The analysis is compared to a free-run simulation (without assimilation of ${XCO}_{2}$ ), and they are both evaluated against ${XCO}_{2}$ data from the Total Carbon Column Observing Network (TCCON). We show that the assimilation of the GOSAT ${XCO}_{2}$ product from the Bremen Optimal Estimation Differential Optical Absorption Spectroscopy (BESD) algorithm during the year 2013 provides ${XCO}_{2}$ fields with an improved mean absolute error of 0.6 parts per million ( $ppm$ ) and an improved station-to-station bias deviation of 0.7 $ppm$ compared to the free run (1.1 and 1.4 $ppm$ , respectively) and an improved estimated precision of 1 $ppm$ compared to the GOSAT BESD data (3.3 $ppm$ ). We also show that the analysis has skill for synoptic situations in the vicinity of frontal systems, where the GOSAT retrievals are sparse due to cloud contamination. We finally computed the 10-day forecast from each analysis at 00:00 $UTC$ , and we demonstrate that the ${CO}_{2}$ forecast shows synoptic skill for the largest-scale weather patterns (of the order of 1000 $km$ ) even up to day 5 compared to its own analysis.

Details

Title

Ability of the 4-D-Var analysis of the GOSAT BESD XCO2 retrievals to characterize atmospheric CO2 at large and synoptic scales

Author

Massart, Sébastien¹; Agustí-Panareda, Anna¹; Heymann, Jens²; Buchwitz, Michael²; Chevallier, Frédéric³; Reuter, Maximilian²; Hilker, Michael²; Burrows, John P²; Deutscher, Nicholas M⁴; Feist, Dietrich G⁵; Hase, Frank⁶; Sussmann, Ralf⁷; Desmet, Filip⁸; Dubey, Manvendra K⁹; Griffith, David W T¹⁰; Kivi, Rigel¹¹; Petri, Christof²; Schneider, Matthias⁶; Velazco, Voltaire A¹⁰

¹ European Centre for Medium-Range Weather Forecasts, Reading, UK
² Institute of Environmental Physics, University of Bremen, Bremen, Germany
³ Laboratoire des Sciences du Climat et de l'Environnement, CEA-CNRS-UVSQ, IPSL, Gif sur Yvette, France
⁴ Institute of Environmental Physics, University of Bremen, Bremen, Germany; Centre for Atmospheric Chemistry, School of Chemistry, University of Wollongong, Wollongong, Australia
⁵ Max Planck Institute for Biogeochemistry, Jena, Germany
⁶ Karlsruhe Institute of Technology, IMK-ASF, Karlsruhe, Germany
⁷ Karlsruhe Institute of Technology, IMK-IFU, Garmisch-Partenkirchen, Germany
⁸ Department of Chemistry, University of Antwerp, Antwerp, Belgium
⁹ Earth and Environmental Sciences, Los Alamos National Laboratory, Los Alamos, USA
¹⁰ Centre for Atmospheric Chemistry, School of Chemistry, University of Wollongong, Wollongong, Australia
¹¹ Finnish Meteorological Institute, Arctic Research, Sodankylä, Finland

Pages

1653-1671

Publication year

2016

Publication date

2016

Publisher

Copernicus GmbH

ISSN

16807316

e-ISSN

16807324

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5194/acp-16-1653-2016

ProQuest document ID

2414053128
