Geosci. Model Dev., 10, 1199–1208, 2017 www.geosci-model-dev.net/10/1199/2017/ doi:10.5194/gmd-10-1199-2017 © Author(s) 2017. CC Attribution 3.0 License.
An alternative way to evaluate chemistry-transport model variability
Laurent Menut1, Sylvain Mailler1, Bertrand Bessagnet2, Guillaume Siour3, Augustin Colette2, Florian Couvidat2, and Frédérik Meleux2
1Laboratoire de Météorologie Dynamique, Ecole Polytechnique, IPSL Research University, Ecole Normale Supérieure, Université Paris-Saclay, Sorbonne Universités, UPMC Univ Paris 06, CNRS, Route de Saclay, 91128 Palaiseau, France
2INERIS, National Institute for Industrial Environment and Risks, Parc Technologique ALATA, 60550 Verneuil-en-Halatte, France
3Laboratoire Inter-Universitaire des Systèmes Atmosphériques, UMR CNRS 7583, Université Paris Est Créteil et Université Paris Diderot, Institut Pierre Simon Laplace, Créteil, France
Correspondence to: Laurent Menut ([email protected])
Received: 17 June 2016 Discussion started: 24 June 2016
Revised: 18 February 2017 Accepted: 23 February 2017 Published: 17 March 2017
Abstract. A simple and complementary model evaluation technique for regional chemistry transport is discussed. The methodology is based on the concept that we can learn about model performance by comparing the simulation results with observational data available for time periods other than the period originally targeted. First, the statistical indicators selected in this study (spatial and temporal correlations) are computed for a given time period, using colocated observation and simulation data in time and space. Second, the same indicators are used to calculate scores for several other years while conserving the spatial locations and Julian days of the year. The difference between the results provides useful insights on the model capability to reproduce the observed day-to-day and spatial variability. In order to synthesize the large amount of results, a new indicator is proposed, designed to compare several error statistics between all the years of validation and to quantify whether the period and area being studied were well captured by the model for the correct reasons.
1 Introduction
Chemistry-transport models (CTMs) aim at simulating the atmospheric composition where humans and the environment can be affected by air pollution. Air pollution results from the presence of chemical compounds emitted into the atmosphere due to anthropogenic activities and natural sources (biogenic emissions from vegetation, soil erosion, sea salt, volcanic activity and wildfires). CTMs are used to represent the dynamical and chemical processes that drive spatial and temporal features of the atmospheric composition.
To estimate the quality of CTMs, model output results are usually compared with available observations. These comparisons have been performed for as long as the models have existed; they are crucial for quantifying the ability of models to reproduce particular events or a general behavior. The quantification of the model quality is performed in every research work. It depends on the case being studied, the modeled variables and the spatial and temporal resolutions. The comparison between observations and model outputs is a complex task and has to take into account numerous factors such as the spatial representativeness of the monitoring stations (Valari and Menut, 2008; Solazzo and Galmarini, 2015). For many years, the best approach to evaluate a model's results has been discussed and, in the field of atmospheric composition, numerous methods were proposed. It is not possible to give an exhaustive list of all validation studies and we present some examples here.
Baldridge and Cox (1986) and Cox and Tikvart (1990) proposed the use of error statistics like correlation, bias and root mean squared error (RMSE) in the specific framework of air quality, i.e., the atmospheric composition when criteria pollutant concentrations exceed predefined limit values. Chang and Hanna (2004) also proposed an evaluation framework dedicated to air quality model performance and explained that there is not a single best evaluation methodology and how important it is to use as many evaluation criteria as possible to really understand model results well. Later, and in order to ensure the use of systematic procedures in the evaluation process, dedicated tools were developed for the model evaluation. For example, Appel et al. (2011) and Galmarini et al. (2012) proposed complex statistical modules to extract all possible information related to the capability of a model to reproduce an observed event. In parallel, some studies were dedicated to revisit the way to evaluate models such as Thunis et al. (2012), dedicated to air quality in a policy framework. In this study, the authors proposed the target diagram to have the bias and the RMSE on the same plot. Complementary to the definition of performance indicators to be used, Simon et al. (2012) used these indicators to compile photochemical model performance for a large set of data over several years of simulation. This kind of evaluation may also be done in dedicated projects, such as the recent AQMEII (Air Quality Model Evaluation International Initiative), comparing chemistry-transport models running both in Europe and North America (Vautard et al., 2012; Campbell et al., 2015); the EURODELTA project (Bessagnet et al., 2016); and the EMEP (European Monitoring and Evaluation Programme) context in the framework of the United Nations Convention on Long-range Transboundary Air Pollution (Prank et al., 2016). Using comparisons between observations and model outputs, some studies proposed methodologies to decompose the statistical scores in order to estimate the main source of errors (Solazzo and Galmarini, 2016). Finally, other studies also use observations to adjust the result by implementing methods to unbias simulation without changing the model, as in Porter et al. (2015) for ozone over the United States. The common point of all these studies is that they are always using, as best as possible, the observations corresponding in time and location to the model grid cell.
Published by Copernicus Publications on behalf of the European Geosciences Union.
In the present study, a simple method is proposed to add information about the model performance with a focus on its spatial and temporal variability. To reach this objective, we propose to use observations corresponding to the modeled period and geographical domain, but also observations for the same domain during other periods. In this way, we want to extract information about the model variability and to answer the following question: is the performance of the model satisfactory because the model is accurate, or just because the model is able to reproduce a situation which recurs from year to year? The issue to be solved and the tools developed are presented in Sect. 2. The new methodology and the indicator developed for this study are presented in Sect. 3. The results are discussed in Sect. 4, to point out the drivers of model errors, and in Sect. 5 for the new indicator.
2 Methodology
In the present study, a simple method is developed to improve the evaluation of model variability and to identify the processes responsible for discrepancies of model outputs vs. observations. The methodology is general and could be applied to all types of models. In this study, the methodology is presented for the specific case of regional atmospheric composition modeling: a topic mixing meteorology and chemistry, with a high spatial and temporal variability, thus having a good potential to test the relevance of our methodology.
2.1 Regional chemistry-transport modeling
In chemistry-transport modeling, several processes are involved, some of them directly influencing the others. When studying both meteorological and chemical variables, the dependencies between all variables are helpful to better interpret the model results.
The boundary conditions prescribe the concentrations of chemical species which may enter the simulation domain. Usually for large domains, they are issued from global models as monthly climatologies. They correspond to averaged values suitable to characterize the background concentrations of long-lived species such as ozone, carbon monoxide and mineral dust. Anthropogenic emissions are prescribed from databases and the influence of meteorology is limited in the model. Vegetation, fire and mineral dust emissions depend both on land-use data and meteorology. These emissions are not measurable; it is almost impossible to directly assess their quality.
The meteorological variables influence transport and mixing processes, with a direct effect on gas and aerosol plume locations and their vertical distribution. Cloudiness and temperature impact the photolysis efficiency; the boundary layer height impacts the surface mixing of pollutants; rainfall impacts the wet deposition. Moreover, meteorology also has an impact on emissions: wind variability is the prevalent driver for dust emissions, and it also has a major impact on wildfire emissions. Both temperature and solar irradiance influence the magnitude of biogenic emissions from vegetation. The spatial variability of land-use data also has a strong impact on all these natural emissions.
The chemistry-transport model is a numerical integration tool of all forcings and processes. The chemical mechanism handles the life cycle of chemical species (production and loss), while the deposition processes are the only net sink of species. In the model, the spatial (horizontal and vertical) and temporal resolutions are also prescribed, directly impacting the simulation representativeness and thus the quality of the modeled air pollutant concentrations when they are compared to available observations.
2.2 The studied case
The study focuses on the summer 2013 period (1 May to 31 August) over the European Mediterranean region. This period is called the reference period in this paper. This case has already been modeled (using the same models, WRF and CHIMERE) and the results were discussed in Menut et al. (2015). The same simulation is used in this study; all parameters are identical.
The observational data come from different sources depending on the variables (see Table 1). In this region, where the monitoring networks are dense enough, comparisons are performed with observations from surface stations that provide hourly O3 and NO2 surface concentrations for gases, and PM2.5 and PM10 (particulate matter with mean mass median diameter lower than 2.5 and 10 µm, respectively) for particles. Complementary to surface concentration data, evaluated using the EBAS database (Tørseth et al., 2012), the meteorology is also evaluated for 2 m temperature (T2m), 10 m wind speed (U10m) and precipitation rates (in mm day⁻¹) from the BADC (British Atmospheric Data Centre). In order to quantify the transport of aerosols in dense plumes aloft, observations from the AERONET (AErosol RObotic NETwork) program are used for the aerosol optical depth (AOD) and the Ångström exponent. In this study, all variables are used as the daily mean (except for precipitation, which corresponds to daily cumulated values) in order to (i) have homogeneous scores between the variables and (ii) be able to separate the systematic and the day-to-day variabilities. The use of an hourly time frequency was ruled out to avoid giving too strong a weight to the diurnal cycle in the temporal variability.
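The reduction of hourly or tri-hourly records to daily values described above can be sketched as follows. This is only an illustration: the function names, the use of NaN to mark a missing measurement and the assumption of complete, regularly sampled days are ours and are not taken from the study.

```python
import numpy as np


def daily_mean(values, steps_per_day=24):
    """Average a regularly sampled 1-D series into daily means.

    NaN marks a missing measurement (an assumption of this sketch);
    a day without any valid value yields NaN.
    """
    values = np.asarray(values, dtype=float)
    n_days = values.size // steps_per_day
    blocks = values[: n_days * steps_per_day].reshape(n_days, steps_per_day)
    valid = ~np.isnan(blocks)
    counts = valid.sum(axis=1)
    sums = np.where(valid, blocks, 0.0).sum(axis=1)
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)


def daily_total(values, steps_per_day=24):
    """Accumulate a regularly sampled 1-D series (e.g. hourly precipitation
    from a model output, hypothetical here) into daily cumulated values."""
    values = np.asarray(values, dtype=float)
    n_days = values.size // steps_per_day
    return values[: n_days * steps_per_day].reshape(n_days, steps_per_day).sum(axis=1)
```

For tri-hourly data such as T2m or U10m in Table 1, `steps_per_day=8` would be used instead of the default.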
3 Proposed methodology
As discussed in the introduction, many statistical indicators (SIs) exist to quantify the model ability to simulate observed pollution events. The correlations (temporal and spatial), the RMSE, its normalized expression nRMSE and the bias (the difference between observations and modeled values) are widely used in regional air pollution modeling. The correlations are able to split the relative contributions of systematic meteorology or source-related variability and day-to-day variability. The RMSE and the bias are a direct quantification of the model error.
The main goal of this study is to separate the contributions due to systematic and sporadic events. The systematic events correspond to yearly phenomena, while the sporadic events correspond to the events observed during one year but not the others. In addition, complementary to the model variability quantification, the model error is also important to estimate. The key points of this study are to (i) study the model variability which is statistically represented by the correlations and (ii) add complementary information on the model errors, which could be represented here by the RMSE (or the nRMSE).
[Figure 1: diagram relating the model (Year=REF) to the observations (Year=1, ..., Year=N−1, Year=REF).]
Figure 1. Principle of the multi-year variability indicator (Imv) calculation, using one modeled year and several years of observations. SI stands for statistical indicator and is related to spatial and temporal correlation.
First, as presented in Fig. 1, the SIs are calculated between observation data and model outputs for the simulation year (i.e., the reference year). Second, the SIs are calculated between the observation data for other years and the model output for the reference year. Logically, the scores calculated with observations and model outputs that both correspond to the reference year should give the best results. By examining the difference with the scores calculated for other years (changing the observations only), we expect to conclude whether the model is able to catch the observed variability for the correct reasons. Using this approach, the goal is to provide information complementary to that usually obtained when using only SIs calculated for a single year (the studied year).
We apply this methodology to the simulation of the year 2013, using observation data for years ranging from 2008 to 2013. In order to give some synthetic answers, the different SI scores are aggregated into a single indicator called Imv, presented in detail in the next section. Of course, it seems awkward to evaluate a model day by day with observational data from another year. For a given station, on a given day of the reference year, air concentrations will be affected by a different local meteorology, emissions and long-range transport of chemical species. However, we can consider that taking the same date in another year is equivalent to randomly choosing a date in the same season. This simple method can emphasize how a model is affected by large-scale patterns and long-term temporal cycles.
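The pairing of each day of the reference season with a day of another year can be sketched as below. The text does not state whether the match is made on the calendar date or on the day number within the year; this sketch assumes the calendar date, and the helper names are hypothetical. The distinction matters slightly because 2008 and 2012 are leap years, so that 1 May is day 122 in those years and day 121 in 2013.

```python
from datetime import date, timedelta


def season_dates(year, start=(5, 1), end=(8, 31)):
    """List every calendar day from `start` to `end` (inclusive) in `year`."""
    first = date(year, *start)
    last = date(year, *end)
    return [first + timedelta(days=k) for k in range((last - first).days + 1)]


def same_date_pairs(reference_year, other_year):
    """Pair each day of the reference season with the same (month, day)
    in another year. Safe for May to August, which never contains 29 February."""
    return [(d, d.replace(year=other_year)) for d in season_dates(reference_year)]


pairs = same_date_pairs(2013, 2008)
print(len(pairs))  # 123 days from 1 May to 31 August
```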
3.1 Calculation of correlations and nRMSE
In this study, we focus on three statistical indicators: the spatial correlation, the temporal correlation and the normalized RMSE. For these three indicators, it is important that, for all years of validation, the same list of stations with valid measurements is used.
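The requirement of a common station list can be illustrated with a short sketch. The criterion defining a "valid" station is not given in the text; the threshold on the fraction of available daily values used below (`min_valid_fraction`) is a purely hypothetical choice, as are the function name and the NaN convention for missing data.

```python
import numpy as np


def common_station_mask(obs_by_year, min_valid_fraction=0.75):
    """Return a boolean mask of the stations usable for every year.

    obs_by_year maps a year to a (T days, I stations) array, with NaN for
    missing daily values. A station is kept only if, in each year, at least
    `min_valid_fraction` of its days are available (hypothetical criterion).
    """
    masks = []
    for obs in obs_by_year.values():
        valid_fraction = np.mean(~np.isnan(np.asarray(obs, dtype=float)), axis=0)
        masks.append(valid_fraction >= min_valid_fraction)
    return np.logical_and.reduce(masks)
```

The resulting mask would then be applied to the observations of all years and to the model output before any score is computed, so that differences between years are not caused by a changing set of stations.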
The correlation used in this study is Pearson's correlation. Each correlation provides specific information on the quality of the simulation. The temporal correlation, noted $R_t$, is estimated station by station using daily averaged data in order to have homogeneous comparisons between all variables. This correlation is directly related to the variability from day to day for each station. $O_{t,i}$ and $M_{t,i}$ represent the observed and modeled values, respectively, at time $t$ for the station $i$, for a total of $T$ days and $I$ stations. The mean time-averaged value $\overline{X}_i$ is

$$\overline{X}_i = \frac{1}{T} \sum_{t=1}^{T} X_{t,i}. \tag{1}$$

The temporal correlation $R_{t,i}$ for each station $i$ is calculated as

$$R_{t,i} = \frac{\sum_{t=1}^{T} (M_{t,i} - \overline{M}_i)(O_{t,i} - \overline{O}_i)}{\sqrt{\sum_{t=1}^{T} (M_{t,i} - \overline{M}_i)^2 \, \sum_{t=1}^{T} (O_{t,i} - \overline{O}_i)^2}}. \tag{2}$$

The mean temporal correlation, $R_t$, used in this study is thus

$$R_t = \frac{1}{I} \sum_{i=1}^{I} R_{t,i}, \tag{3}$$

with $I$ the total number of stations. The spatial correlation, noted $R_s$, uses the same formula type except it is calculated from the temporal averaged values of observations and model for each location where observations are available. A good correlation shows that the model correctly locates the largest horizontal gradients as known sources and long-range transport plumes. The spatiotemporal averaged value is estimated as

$$\overline{X} = \frac{1}{I} \sum_{i=1}^{I} \overline{X}_i, \tag{4}$$

and the spatial correlation is thus expressed as

$$R_s = \frac{\sum_{i=1}^{I} (\overline{M}_i - \overline{M})(\overline{O}_i - \overline{O})}{\sqrt{\sum_{i=1}^{I} (\overline{M}_i - \overline{M})^2 \, \sum_{i=1}^{I} (\overline{O}_i - \overline{O})^2}}. \tag{5}$$

The normalized RMSE is expressed as

$$\mathrm{nRMSE} = \sqrt{\frac{1}{T} \frac{1}{I} \sum_{t=1}^{T} \sum_{i=1}^{I} \left( \frac{O_{t,i} - M_{t,i}}{O_{t,i}} \right)^2} \tag{6}$$

for all stations $i$ and all times $t$.

Table 1. List of measurement data used for the statistical comparison with the model results. All data used are issued from surface stations representative of their own environment. Originally provided hourly or 3-hourly, they are used as daily averages in this work. The abbreviation "ad." is used to indicate dimensionless units.

| Variable | Network | Spatial coverage | Vertical coverage | Temporal frequency | Unit |
|---|---|---|---|---|---|
| O3, NO2 | EBAS/EMEP | Europe | Surface | Hourly | ppb |
| PM2.5, PM10 | EBAS/EMEP | Europe | Surface | Hourly | µg m⁻³ |
| AOD, Ångström | AERONET | Global | Column | Hourly | ad. |
| T2m | BADC | Global | Surface | Tri-hourly | °C |
| U10m | BADC | Global | Surface | Tri-hourly | m s⁻¹ |
| Precipitation | BADC | Global | Surface | Daily | mm day⁻¹ |

3.2 Definition of the Imv indicator
For the specific purpose of the model variability (and not the model error), we define an indicator, Imv, dedicated to express in one value the results obtained with the temporal and spatial correlations. The goal of this indicator is to quantify how the correlation between measurement data (for different years) and model outputs (for the reference year) evolves from one year to another. This indicator does not replace the usual statistical indicators but aims at providing complementary information about the variability between years.
We first define the differences, $D$, between all years as

$$D = \frac{1}{N-1} \left( \sum_{i=1}^{N-1} |s_i - s_N| \right), \tag{7}$$

with $s_N$ the score of the indicator for the reference year being modeled and $s_i$ the score of the indicator computed using observations corresponding to other meteorological years (from 1 to $N-1$ if there are $N-1$ other available years for the observations).
Figure 2. Scheme of the Imv values as a function of the studied year correlation values and the multi-year differences D.
We now aim to develop a simple indicator, called Imv, which is a combination of the statistical indicator for the reference year and the differences between years. This Imv corresponds, in fact, to the SI itself weighted by the differences between the SI scores of all years. We expect that Imv follows the following rules:
- Imv has the same evolution as the studied SI. If the correlation increases, Imv also increases.
- Imv is bounded between 0 and 1, like the correlation. This enables us to compare the results for different variables (with different metrics).
- In the case of a high correlation value found for the studied year, the obtained $s_N$ value is close to 1. This value may be lower for the following reasons:
  - If the differences between the other years are low (D tends to 0), it means that the model is correct for the studied year, but possibly because it reproduces a recurrent phenomenon. In this case, we want Imv to decrease and tend to 0.
  - If the differences between the other years are high (D tends to 1), it means the model gives good results for the studied year, but it is not because it simulates a systematic event. In this case, we want Imv to remain close to the indicator value. With $s_N \approx 1$ and $I_{mv} \approx 1$, we can conclude that the model is very good for the studied year and this is not due to a recurrent process.
- In the case of a low correlation value, and whatever the magnitude of differences between years, the model is not correct. Imv must be low, as it is the indicator value.
These constraints allow us to define an indicator having this kind of formulation:
Imv = sN 1 exp(Ds)4[parenrightBig]. (8)
This means that Imv always has, as a maximum, the value of the indicator itself. The power of 4 is here defined to give a specific shape to Imv, respecting the rules presented above. Finally, this expression gives the indicator variability presented in Fig. 2. Considering the state of the art of chemistry-transport modeling, the model is considered accurate, having an acceptable variability, for Imv > 0.4: this means that the correlation is at least 0.5 and the differences are also greater than 0.5.
Finally, this indicator is not calculated for nRMSE and bias. Two reasons explain this choice: the first reason is that, contrary to correlations, RMSE and bias are not bounded between 0 and 1. This leads to indicator values possibly varying a lot between several years and thus being difficult to compare between years. The second reason is that the goal of the indicators is to extract a message from the model variability of the studied year compared to the other years. In this case, the correlations constitute a statistical indicator which is more appropriate for this evaluation.

Table 2. Scores for T2m. The correlations and nRMSE are calculated between the observations (2008–2013) and the model results (2013).

| Year | Rs | Rt | nRMSE |
|---|---|---|---|
| 2008 | 0.58 | 0.34 | 0.31 |
| 2009 | 0.57 | 0.36 | 0.32 |
| 2010 | 0.61 | 0.30 | 0.34 |
| 2011 | 0.62 | 0.25 | 0.32 |
| 2012 | 0.61 | 0.37 | 0.32 |
| 2013 | 0.60 | 0.91 | 0.22 |
| D | 0.02 | 0.59 | 0.10 |
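As a worked check of the multi-year difference D of Eq. (7), the short sketch below recomputes the D row of Table 2 from the yearly T2m scores. The function name is ours; the input values are copied from Table 2.

```python
def multi_year_difference(scores_other_years, score_reference):
    """Mean absolute difference between the reference-year score s_N and the
    scores s_i obtained with observations from the N-1 other years (Eq. 7)."""
    if not scores_other_years:
        raise ValueError("at least one other year is required")
    total = sum(abs(s_i - score_reference) for s_i in scores_other_years)
    return total / len(scores_other_years)


# T2m scores from Table 2: years 2008 to 2012, then the reference year 2013.
rs_other, rs_ref = [0.58, 0.57, 0.61, 0.62, 0.61], 0.60
rt_other, rt_ref = [0.34, 0.36, 0.30, 0.25, 0.37], 0.91

print(round(multi_year_difference(rs_other, rs_ref), 2))  # 0.02
print(round(multi_year_difference(rt_other, rt_ref), 2))  # 0.59
```

The two printed values match the D row of Table 2: the spatial correlation barely changes when the observation year changes, whereas the temporal correlation drops sharply for every year other than 2013.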
4 Time series of statistical indicators
The calculations of differences are performed for the correlations and the nRMSE. These values are calculated for all variables described in Table 1 for the years 2008 to 2013. Note that, for each year, only the May to August period is considered. Results are presented as time series in Fig. 3 and discussed in the following sections. Some values discussed in these sections are also reported in the synthetic Table 4.
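The three statistical indicators of Sect. 3.1 (Eqs. 1 to 6) and their evaluation against each observation year can be sketched with NumPy as follows. This is a minimal illustration and not the code used for the study: the function names are ours, and the sketch assumes complete arrays of shape (T days, I stations) without missing values, non-zero observations for the nRMSE, and series that are not constant in time or space.

```python
import numpy as np


def temporal_correlation(model, obs):
    """Eqs. (1)-(3): Pearson correlation in time at each station,
    then averaged over the stations."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    m_anom = model - model.mean(axis=0)  # deviation from each station's time mean
    o_anom = obs - obs.mean(axis=0)
    numerator = (m_anom * o_anom).sum(axis=0)
    denominator = np.sqrt((m_anom ** 2).sum(axis=0) * (o_anom ** 2).sum(axis=0))
    return float(np.mean(numerator / denominator))


def spatial_correlation(model, obs):
    """Eqs. (4)-(5): Pearson correlation across stations of the
    time-averaged values."""
    m_mean = np.asarray(model, dtype=float).mean(axis=0)
    o_mean = np.asarray(obs, dtype=float).mean(axis=0)
    m_anom = m_mean - m_mean.mean()
    o_anom = o_mean - o_mean.mean()
    return float((m_anom * o_anom).sum()
                 / np.sqrt((m_anom ** 2).sum() * (o_anom ** 2).sum()))


def normalized_rmse(model, obs):
    """Eq. (6): root mean square of the relative error (O - M) / O."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean(((obs - model) / obs) ** 2)))


def scores_by_year(model_reference, obs_by_year):
    """Score the single reference-year simulation against each observation year."""
    return {
        year: {
            "Rs": spatial_correlation(model_reference, obs),
            "Rt": temporal_correlation(model_reference, obs),
            "nRMSE": normalized_rmse(model_reference, obs),
        }
        for year, obs in obs_by_year.items()
    }
```

In this sketch the model array always comes from the reference year, while `obs_by_year` would hold one observation array per year from 2008 to 2013, all restricted to the May to August period and to the same stations.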
4.1 Meteorological variables
The meteorological variables are T2m, u10m and the precipitation rate. The values of the statistical scores are provided, year by year, in Fig. 3. As an example, the same values are reported for T2m in Table 2.
T2m is a meteorological variable, constraining processes both for meteorology and chemistry. Its diurnal cycle is strong, as well as its latitudinal variability (for large model domains), often ensuring a good spatial correlation. In general, this variable is the least uncertain of all modeled meteorological parameters. The spatial correlation is good for all years, ranging from 0.57 (2009) to 0.62 (2011). For the studied year (2013), the score is 0.60, slightly lower than for 2011. Even if the correlation for the selected year is good, it is not signicantly better than for the other year, with D = 0.02. This means that the model reproduces fairly well
a spatial pattern that is observed every year. Indeed, the simulation domain is large and the temperature has a latitudinal variability larger than between each measurement station.The temporal correlation ranges from 0.25 to 0.91 (2013).
www.geosci-model-dev.net/10/1199/2017/ Geosci. Model Dev., 10, 11991208, 2017
1204 L. Menut et al.: An alternative way to evaluate chemistry-transport model variability
T2m u10m Precipitation AOD
Rs RtN.RMSE
Rs RtN.RMSE
2008 2009 2010 2011 2012 2013 Years
1.2
1.2
1.2
1.2
1.0
1.0
1.0
1.0
0.8
0.8
0.8
0.8
0.6
0.6
0.6
0.6
0.4
0.4
0.4
0.4
0.2
0.2
0.2
0.2
0.0
0.0
0.0
0.0
0.2
0.2
0.2
0.2
0.4
0.4
0.4
0.4
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
ANG O3 NO2 PM2.5
1.2
1.2
1.2
1.2
Rs RtN.RMSE
1.0
1.0
1.0
1.0
Rs RtN.RMSE
0.8
0.8
0.8
0.8
0.6
0.6
0.6
0.6
Rs RtN.RMSE
Rs RtN.RMSE
0.4
0.4
0.4
0.4
0.2
0.2
0.2
0.2
0.0
0.0
0.0
0.0
0.2
0.2
0.2
0.2
0.4
0.4
0.4
0.4
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
PM10 Ammonium Sulfate Nitrate
Rs RtN.RMSE
1.2
1.2
1.2
1.2
Rs RtN.RMSE
1.0
1.0
1.0
1.0
Rs RtN.RMSE
Rs RtN.RMSE
0.8
0.8
0.8
0.8
0.6
0.6
0.6
0.6
0.4
0.4
0.4
0.4
0.2
0.2
0.2
0.2
0.0
0.0
0.0
0.0
0.2
0.2
0.2
0.2
0.4
0.4
0.4
0.4
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
2008 2009 2010 2011 2012 2013 Years
For the meteorological variables, these scores showed that the meteorological forcing is well captured, and always better for the year being considered compared to other years.
4.2 Optical properties
The optical properties are directly linked to the atmospheric composition of aerosol and may be quantied using the AOD and the ngstrm exponent (ANG).
For the AOD, the spatial correlation is very good for 2013, with Rs = 0.97, but it is as good or better for other years. This
means that we model a rather recurring phenomenon: every year, the same stations are, on average, exposed to aerosol plumes. The temporal correlation is lower with Rt = 0.45
but much better than for other years. This indicates that the model partly reproduces the observed temporal variability but the events are changing from one year to another and the model captures these changes well. In the studied region, the AOD is sensitive to desert dust outbreaks in summer.This means that large-scale systems are driving the aerosol plumes; they are spatially recurrent and temporally better captured for the year being considered than for other years.
Geosci. Model Dev., 10, 11991208, 2017 www.geosci-model-dev.net/10/1199/2017/
Figure 3. Multi-year scores for T2 m, u10 m, the precipitation rate, aerosol optical depth (AOD), the ngstrm exponent (ANG), surface concentrations of O3, NO2, PM2.5, PM10, ammonium, sulfate and nitrate. The correlations and the nRMSE are calculated between the observations (20082013) and the model results (2013). The spatial correlation, Rs, is in black; the temporal correlation, Rt, in blue; the nRMSE in red.
The variability of nRMSE is lower than for the correlations, with values ranging from 0.22 (2013) to 0.34 (2010). The lowest value is found for 2013, highlighting the fact that the model error is the lowest for the reference year. The model is thus performing well in capturing the day-to-day variability for T2m for the correct reasons.
From Fig. 3, the calculation of u10m also gives satisfactory results with Rt = 0.60. The spatial correlation, Rs = 0.09, is
poor and very variable from one year to another. As for T2m, we also have an effect of the model resolution and the representativeness of the variable.
Scores for the precipitation are correct, with a very good spatial correlation that is always exceeding 0.6. As for the temperature, the latitudinal effect plays a major role in the variability. Both the spatial and temporal correlations increase signicantly for the reference year. The nRMSE is not on the plot, with the values being larger than 1.2. The model is biased in absolute values and overestimates the amount of daily precipitation. However, the day-to-day variability is correct and such variability is the most important feature for atmospheric composition modeling (the lower atmosphere is scavenged when a precipitation occurs, whatever its value).
L. Menut et al.: An alternative way to evaluate chemistry-transport model variability 1205
Table 3. Scores for NO2. The correlations and nRMSE are calculated between the observations (20082013) and the model results (2013).
Year Rs Rt nRMSE
2008 0.44 0.00 1.56 2009 0.42 0.04 1.76
2010 0.66 0.04 1.82
2011 0.79 0.03 2.07
2012 0.76 0.04 2.84 2013 0.88 0.22 1.76
D 0.27 0.23 0.33
due to anthropogenic sources that vary in space, such as biogenic and vegetation res. The temporal correlation is low for 2013, Rt = 0.22, but is closer to 0 for other years and there
fore signicantly better for the reference year compared to the others. These two correlation values show that the model certainly captures the right location of emission sources (low variability of Rs). The nRMSE is large and shows that the concentrations are overestimated by the model. However, this overestimation appears for all years and can be due to the representativeness of the surface measurements compared to the size of model cells.
The spatial correlation is good for O3, NO2 and PM10, with Rs = 0.69, 0.88 and 0.81, respectively. For PM2.5, this
correlation is low, with Rs = 0.16. The PM10 shows that the
largest particles are well modeled over the whole domain, and this was also the conclusion for the AOD and ANG. The low score for PM2.5 indicates that for the aerosol distribution the ne mode is not as well modeled as the coarse mode. This is conrmed by the scores of the aerosol inorganic species, ammonium, sulfate and nitrate, which contribute to a large part of the ne fraction of particles. Except for sulfate (with Rs = 0.51), the spatial correlations are 0.15 for nitrate and
0.20 for ammonium. Thus, the ne part of the aerosol is not well modeled mainly due to a deciency in the modeling of nitrates.
The temporal correlations have a completely different behavior than the spatial correlations. The values are generally low, from Rt = 0.09 for nitrate to Rt = 0.32 for O3. Surpris
ingly, the PM10 concentrations display a good spatial correlation but a poor temporal correlation. This is due to the long lifetime in the atmosphere of nonreactive species such as mineral dust: plumes are correctly modeled over large areas but the day-to-day variability needs improvement. Another point is the good spatial correlation for NO2 but its low temporal correlation with Rt = 0.22. In this case, this means
we have a correctly spatialized anthropogenic emission inventory (mainly for NO2 sources), but difculties to model the day-to-day chemistry still exist.
For the surface concentrations, we can conclude that O3, NO2 and PM10 concentrations are spatially well modeled, and this is not due to a recurrent behavior. For particles, the problem is more related to the ne mode, where PM2.5 concentrations are not well located. This modeling problem is highlighted by the low correlations and Imv values for the inorganic species. For the temporal correlations, the scores are always lower than for the spatial correlation but also always higher for the reference year than for the other years.
5 Estimation of the Imv indicator for all variables
To summarize the results obtained for each statistical indicator and the values of differences between all years, we apply the Imv formulation. This enables us to have one value for
www.geosci-model-dev.net/10/1199/2017/ Geosci. Model Dev., 10, 11991208, 2017
For the ANG, the spatial correlation is very good, with Rs = 0.91, but also persistent in time. The temporal corre
lation is much better for 2013 than for other years. This is probably due to a size distribution that is not necessarily well simulated from one day to another (shown by AOD and explained in Menut et al., 2016) but the relative contributions of ne and coarse aerosol atmospheric load are fairly reproduced. This feature highlights the high sensitivity of the AOD calculation to the modeled aerosol size distribution, although the overall mass emitted and transported is realistic.
Globally, the AOD and ANG reect the models ability to retrieve the long-range transport of long-lived aerosols, which depends on several processes (emissions, transport and deposition). These scores show that the model is able to retrieve these yearly recurrent plumes but the model size distribution of particles clearly requires improvement.
4.3 Surface concentrations
For the surface concentrations of gaseous and aerosol species, the variability is much more related to local effects. As an example, the detailed values of the statistical indicators and the differences between years are extensively presented for NO2.
NO2 is both primary and secondary in origin. Mostly emitted in urbanized areas, the diurnal cycle of this species is well constrained. Depending on meteorological conditions, its lifetime may vary significantly from hours to days. Modeling this species with CTMs is challenging because several uncertainties are acting at the same time, including the spatial representativeness of the model cell. The scores show whether the sources are properly located and whether the photochemistry and transport processes have been well simulated. In general, at coarse model resolution, the model results for this species are worse than for ozone. The spatial correlation gives a score of Rs = 0.88 for 2013. This corresponds to the best correlation compared to the other years. The anthropogenic emissions are strongly related to industrial activities and road traffic, and since these activity sectors are fixed in space, the good spatial correlation is more
1206 L. Menut et al.: An alternative way to evaluate chemistry-transport model variability
[Figure 4, two panels: "Spatial correlation Rs" and "Temporal correlation Rt". In each panel the x axis is the score (0 to 1) and the y axis is the difference (0 to 1).]
all years D. For each studied variable, their values are reported on the figure, where the colors represent the values of Imv. The interpretation of these results follows the quality criteria presented in the academic schematic of Fig. 2. This presentation shows a large spread for the spatial correlation results. While the relative differences D range from 0 to 0.6, the correlations range from 0.09 (for the 10 m wind speed) to 0.97 (for AOD). The common point is that there is no variable with differences above 0.5. This means that, spatially, the studied problem shows systematic patterns from year to year. The low values of correlations show that some variables are systematically poorly estimated. This means that some meteorological structures (for u10m) or emission sources (contributing to the PM2.5 surface concentrations) are systematically mislocated.
The representation of temporal correlations shows a specific linear pattern. The largest correlation values are positively correlated with differences. This temporal correlation represents the day-to-day variability at each location. This means that the studied problem is based on high day-to-day variability without similar consecutive days (in this case, one would have high correlations but low differences). This illustrates the fact that the studied problem is primarily an issue of sporadic events and the model is able to correctly find this variability from one day to another.
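The contrast between the two panels can be illustrated with a minimal sketch in Python. It rests on assumptions that are not restated in this section: Rs is taken as the Pearson correlation, across stations, between time-averaged observed and modeled values; Rt as the station average of the day-to-day Pearson correlation; and D as the reference-year score minus the mean score of the other years. The function names, the array layout (days by stations) and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def spatial_correlation(obs, mod):
    """Pearson correlation across stations of the time-mean values.

    obs, mod: arrays of shape (n_days, n_stations).
    Assumed definition of Rs, used here for illustration only.
    """
    return float(np.corrcoef(obs.mean(axis=0), mod.mean(axis=0))[0, 1])


def temporal_correlation(obs, mod):
    """Mean over stations of the day-to-day Pearson correlation (assumed Rt)."""
    per_station = [
        np.corrcoef(obs[:, k], mod[:, k])[0, 1] for k in range(obs.shape[1])
    ]
    return float(np.mean(per_station))


def mean_difference(score_ref, scores_other):
    """Assumed D: reference-year score minus the mean score of the other years."""
    return float(score_ref - np.mean(scores_other))


# Synthetic data: one simulation compared with the matching year of
# observations and with two unrelated years (same stations, same days).
rng = np.random.default_rng(0)
n_days, n_stations = 60, 8
station_mean = rng.uniform(10.0, 50.0, n_stations)        # persistent spatial pattern
events_ref = rng.normal(0.0, 5.0, (n_days, n_stations))   # sporadic day-to-day signal

model = station_mean + events_ref + rng.normal(0.0, 1.0, (n_days, n_stations))
obs_ref = station_mean + events_ref + rng.normal(0.0, 1.0, (n_days, n_stations))
obs_other = [
    station_mean + rng.normal(0.0, 5.0, (n_days, n_stations)) for _ in range(2)
]

rs_ref = spatial_correlation(obs_ref, model)
rt_ref = temporal_correlation(obs_ref, model)
d_rs = mean_difference(rs_ref, [spatial_correlation(o, model) for o in obs_other])
d_rt = mean_difference(rt_ref, [temporal_correlation(o, model) for o in obs_other])
print(f"Rs={rs_ref:.2f} D(Rs)={d_rs:.2f}  Rt={rt_ref:.2f} D(Rt)={d_rt:.2f}")
```

In this synthetic setting the spatial pattern is the same every year, so Rs is high for all years and its difference stays small, whereas the day-to-day events are specific to the reference year, so Rt is high only for that year and its difference is large.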
6 Conclusions
At first glance, using a different year than the simulated one for the day-to-day evaluation seems awkward. However, we can learn more about the performance of chemistry-transport models than by using a single year for the usual statistical indicators. Of course, this approach will never replace a strict evaluation of a pollution case analysis using time series, vertical profiles and usual error statistics. However, it offers a
Figure 4. Results of the Imv scores for the spatial and temporal correlations. For each model variable, its value is represented using the correlation on the x axis and the difference between the studied year and the others on the y axis. The colors represent the Imv values.
Table 4. The Imv values for all variables: the meteorology with T2 m, u10 m and precipitation rate; the vertically integrated column of aerosols with the aerosol optical depth (AOD) and the Ångström exponent (ANG); the surface concentrations of all aerosols in terms of size distribution with PM2.5 and PM10; and the inorganic species with Dp < 10 µm. Values of Imv above 0.4 are in bold. Units of the variables are detailed in Table 1.
Variable    Rs                     Rt
            Value   D      Imv     Value   D      Imv
T2 m        0.60    0.02   0.04    0.91    0.59   0.82
u10 m       0.09    0.23   0.05    0.59    0.56   0.53
Precip.     0.89    0.20   0.49    0.08    0.07   0.02
AOD         0.97    0.02   0.09    0.45    0.34   0.33
ANG         0.91    0.04   0.14    0.59    0.44   0.49
O3          0.69    0.13   0.29    0.32    0.27   0.21
NO2         0.88    0.27   0.58    0.22    0.23   0.13
PM2.5       0.16    0.15   0.07    0.27    0.32   0.20
PM10        0.81    0.10   0.27    0.17    0.14   0.07
Ammonium    0.20    0.13   0.08    0.21    0.20   0.12
Sulfate     0.51    0.21   0.29    0.31    0.34   0.23
Nitrate     0.15    0.51   0.13    0.09    0.08   0.03
each SI (Rs and Rt) and each variable. Results are presented in Table 4 and are also displayed on single plots in Fig. 4.
In Table 4, the Imv values larger than 0.4 are highlighted. This threshold is clearly subjective but mentioned here to better highlight the variables being well modeled and with a correct variability from one year to another. As discussed in detail, the best scores are obtained for the meteorological variables and are better for the temporal variability than for the spatial variability.
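As a small worked example of this thresholding, the sketch below lists the variables whose Imv exceeds 0.4. The Imv values are transcribed from Table 4; the dictionary layout and the helper name `above_threshold` are illustrative assumptions, not part of the study.

```python
# Imv values transcribed from Table 4: variable -> (Imv for Rs, Imv for Rt).
IMV_TABLE4 = {
    "T2 m": (0.04, 0.82),
    "u10 m": (0.05, 0.53),
    "Precip.": (0.49, 0.02),
    "AOD": (0.09, 0.33),
    "ANG": (0.14, 0.49),
    "O3": (0.29, 0.21),
    "NO2": (0.58, 0.13),
    "PM2.5": (0.07, 0.20),
    "PM10": (0.27, 0.07),
    "Ammonium": (0.08, 0.12),
    "Sulfate": (0.29, 0.23),
    "Nitrate": (0.13, 0.03),
}

THRESHOLD = 0.4  # subjective threshold quoted in the text


def above_threshold(table, column, threshold=THRESHOLD):
    """Return the variables whose Imv in the given column exceeds the threshold.

    column: 0 for the spatial correlation (Rs), 1 for the temporal one (Rt).
    """
    return [name for name, imv in table.items() if imv[column] > threshold]


highlighted_rs = above_threshold(IMV_TABLE4, 0)
highlighted_rt = above_threshold(IMV_TABLE4, 1)
print("Rs:", highlighted_rs)  # Rs: ['Precip.', 'NO2']
print("Rt:", highlighted_rt)  # Rt: ['T2 m', 'u10 m', 'ANG']
```

Because the threshold is subjective, keeping it as a single named constant makes it easy to see how the list of highlighted variables changes with another choice.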
In Fig. 4, the x axis represents the correlation (spatial or temporal) and the y axis represents the differences between
very fast and integrated vision of the strengths and weaknesses of a model with very little calculation. This methodology can also be deployed in intercomparison exercises.
To answer the questions presented in the introduction, for this particular model and simulated period, the following conclusions can be drawn. The model always simulates the studied year better than any other meteorological year and it is able to reproduce the day-to-day variability for high concentrations of pollutants.
The spatial correlation is good for 2 m temperature and precipitation rate but not for wind speed: this highlights the fact that the modeled domain is large and the resolution is not optimized for small-scale processes. The spatial correlation is also very good for the long-range transport of particles, as demonstrated with Rs = 0.97 and 0.91 for AOD and ANG, respectively. However, since this feature occurs every year, this leads to low Imv values. This means that, for a large domain, the main spatial patterns of particle concentrations are recurrent and well modeled. The chemical species that are best modeled are either species with a long atmospheric lifetime (PM10) or species spatially well constrained on the domain (such as NO2, mainly due to anthropogenic emissions). For particles, the results depend on the size distribution: the coarse particles are better simulated than the fine ones.
The conclusions are different for the temporal correlation. The scores are calculated using daily observations and modeled outputs. Thus, these scores reflect the ability of the model to retrieve the day-to-day variability. As for the spatial correlation, scores are good for the meteorological variables. For the aerosol, and mainly for the long-lived species (such as mineral dust), the temporal correlation is also correct, as are the Imv values: Imv = 0.33 and 0.49 for AOD and ANG, respectively. However, for the short-lived species, the temporal correlation and the Imv values are low. This means that improving the day-to-day variability should take priority over improving the locations of emissions. This is probably due to the atmospheric transport, with the spatial variability of 10 m wind speed being poorly simulated. However, overall, the temporal correlation is better for the studied year than for the others, showing that the problem is highly variable from year to year, but the model is able to capture the evolution of atmospheric composition.
Code and data availability. This study presents a methodology using existing data and models; all required information is already included in this article.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. This study is partly funded by the French Ministry of Ecology. The authors thank the British Atmospheric Data Centre, which is part of the NERC National Centre for Atmospheric Science (NCAS), for making the meteorological data available; the EMEP network for providing atmospheric composition measurements; and the investigators and staff who maintained and provided the AERONET data.
Edited by: S. Bekki
Reviewed by: two anonymous referees
References
Appel, K. W., Gilliam, R. C., Davis, N., Zubrow, A., and Howard, S. C.: Overview of the atmospheric model evaluation tool (AMET) v1.1 for evaluating meteorological and air quality models, Environ. Modell. Softw., 26, 434–443, doi:10.1016/j.envsoft.2010.09.007, 2011.
Baldridge, K. and Cox, W.: Evaluating air quality model performance, Environ. Softw., 1, 182–187, doi:10.1016/0266-9838(86)90023-7, 1986.
Bessagnet, B., Pirovano, G., Mircea, M., Cuvelier, C., Aulinger, A., Calori, G., Ciarelli, G., Manders, A., Stern, R., Tsyro, S., García Vivanco, M., Thunis, P., Pay, M.-T., Colette, A., Couvidat, F., Meleux, F., Rouïl, L., Ung, A., Aksoyoglu, S., Baldasano, J. M., Bieser, J., Briganti, G., Cappelletti, A., D'Isidoro, M., Finardi, S., Kranenburg, R., Silibello, C., Carnevale, C., Aas, W., Dupont, J.-C., Fagerli, H., Gonzalez, L., Menut, L., Prévôt, A. S. H., Roberts, P., and White, L.: Presentation of the EURODELTA III intercomparison exercise - evaluation of the chemistry transport models' performance on criteria pollutants and joint analysis with meteorology, Atmos. Chem. Phys., 16, 12667–12701, doi:10.5194/acp-16-12667-2016, 2016.
Campbell, P., Zhang, Y., Yahya, K., Wang, K., Hogrefe, C., Pouliot, G., Knote, C., Hodzic, A., Jose, R. S., Perez, J. L., Guerrero, P. J., Baro, R., and Makar, P.: A multi-model assessment for the 2006 and 2010 simulations under the Air Quality Model Evaluation International Initiative (AQMEII) phase 2 over North America: Part I. Indicators of the sensitivity of O3 and PM2.5 formation regimes, Atmos. Environ., 115, 569–586, doi:10.1016/j.atmosenv.2014.12.026, 2015.
Chang, J. and Hanna, S.: Air quality model performance evaluation, Meteorol. Atmos. Phys., 87, 167–196, doi:10.1007/s00703-003-0070-7, 2004.
Cox, W. M. and Tikvart, J. A.: A statistical procedure for determining the best performing air quality simulation model, Atmos. Environ. A-Gen., 24, 2387–2395, doi:10.1016/0960-1686(90)90331-G, 1990.
Galmarini, S., Bianconi, R., Appel, W., Solazzo, E., Mosca, S., Grossi, P., Moran, M., Schere, K., and Rao, S.: ENSEMBLE and AMET: Two systems and approaches to a harmonized, simplified and efficient facility for air quality models development and evaluation, Atmos. Environ., 53, 51–59, doi:10.1016/j.atmosenv.2011.08.076, 2012.
Menut, L., Mailler, S., Siour, G., Bessagnet, B., Turquety, S., Rea, G., Briant, R., Mallet, M., Sciare, J., Formenti, P., and Meleux, F.: Ozone and aerosol tropospheric concentrations variability analyzed using the ADRIMED measurements and the WRF and CHIMERE models, Atmos. Chem. Phys., 15, 6159–6182, doi:10.5194/acp-15-6159-2015, 2015.
Menut, L., Siour, G., Mailler, S., Couvidat, F., and Bessagnet, B.: Observations and regional modeling of aerosol optical properties, speciation and size distribution over Northern Africa and western Europe, Atmos. Chem. Phys., 16, 12961–12982, doi:10.5194/acp-16-12961-2016, 2016.
Porter, P. S., Rao, S. T., Hogrefe, C., Gego, E., and Mathur, R.: Methods for reducing biases and errors in regional photochemical model outputs for use in emission reduction and exposure assessments, Atmos. Environ., 112, 178–188, doi:10.1016/j.atmosenv.2015.04.039, 2015.
Prank, M., Sofiev, M., Tsyro, S., Hendriks, C., Semeena, V., Vazhappilly Francis, X., Butler, T., Denier van der Gon, H., Friedrich, R., Hendricks, J., Kong, X., Lawrence, M., Righi, M., Samaras, Z., Sausen, R., Kukkonen, J., and Sokhi, R.: Evaluation of the performance of four chemical transport models in predicting the aerosol chemical composition in Europe in 2005, Atmos. Chem. Phys., 16, 6041–6070, doi:10.5194/acp-16-6041-2016, 2016.
Simon, H., Baker, K., and Phillips, S.: Compilation and interpretation of photochemical model performance statistics published between 2006 and 2012, Atmos. Environ., 61, 124–139, doi:10.1016/j.atmosenv.2012.07.012, 2012.
Solazzo, E. and Galmarini, S.: Comparing apples with apples: Using spatially distributed time series of monitoring data for model evaluation, Atmos. Environ., 112, 234–245, doi:10.1016/j.atmosenv.2015.04.037, 2015.
Solazzo, E. and Galmarini, S.: Error apportionment for atmospheric chemistry-transport models - a new approach to model evaluation, Atmos. Chem. Phys., 16, 6263–6283, doi:10.5194/acp-16-6263-2016, 2016.
Thunis, P., Pederzoli, A., and Pernigotti, D.: Performance criteria to evaluate air quality modeling applications, Atmos. Environ., 59, 476–482, doi:10.1016/j.atmosenv.2012.05.043, 2012.
Tørseth, K., Aas, W., Breivik, K., Fjæraa, A. M., Fiebig, M., Hjellbrekke, A. G., Lund Myhre, C., Solberg, S., and Yttri, K. E.: Introduction to the European Monitoring and Evaluation Programme (EMEP) and observed atmospheric composition change during 1972–2009, Atmos. Chem. Phys., 12, 5447–5481, doi:10.5194/acp-12-5447-2012, 2012.
Valari, M. and Menut, L.: Does increase in air quality models resolution bring surface ozone concentrations closer to reality?, J. Atmos. Ocean. Tech., 25, 1955–1968, doi:10.1175/2008JTECHA1123.1, 2008.
Vautard, R., Moran, M. D., Solazzo, E., Gilliam, R. C., Matthias, V., Bianconi, R., Chemel, C., Ferreira, J., Geyer, B., Hansen, A. B., Jericevic, A., Prank, M., Segers, A., Silver, J. D., Werhahn, J., Wolke, R., Rao, S., and Galmarini, S.: Evaluation of the meteorological forcing used for the Air Quality Model Evaluation International Initiative (AQMEII) air quality simulations, Atmos. Environ., 53, 15–37, doi:10.1016/j.atmosenv.2011.10.065, 2012.
Copyright Copernicus GmbH 2017