1 Introduction
Desert (or mineral) dust is the most abundant aerosol by mass in the global atmosphere and plays a key role in the Earth system . It is emitted from the surface by aeolian processes an doriginates predominantly – but not only – from desert regions. Dust affects weather and climate by perturbing the radiative balance directly through scattering and absorption of solar and thermal radiation and indirectly by altering cloud formation and cloud chemistry . It also contributes to the fertilization of the ocean and the land through the deposition of iron and phosphorus, thus affecting the global carbon cycle. All in all, the amount of, spatial distribution of and variability in desert dust have implications on climate, the environment, air quality and human health , and a variety of socio-economic sectors such as aviation and solar energy production . Due to the nature of its emission and transport and its relatively short lifetime , dust varies strongly in space and time, which requires continuous monitoring both in situ and remotely by satellite, airborne and ground-based sensors . A major challenge in studying desert dust along with its impacts is the paucity of direct in situ measurements in the regions most affected by dust storms. There are some operational visibility observations providing qualitative estimates of dust presence , but there is a severe lack of routine surface aerosol concentration measurements . In addition to the lack of in situ observations, there is limited information on aerosol speciation, which is essential to distinguish dust from other aerosol types . Satellites mostly provide column-integrated aerosol information, but spatially and temporally resolved surface dust concentration and deposition estimates are needed to enable detailed impact assessments. Dust observations or retrievals are therefore best exploited in combination with model simulations either to provide optimal initial conditions (analyses) to forecast models or to monitor current and past states of the atmosphere through the production of reanalyses, i.e. complete and consistent four-dimensional reconstructions of the atmosphere.
There are several available global aerosol reanalyses that include desert dust, such as MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, Version 2; ) and CAMSRA (Copernicus Atmosphere Monitoring Service Reanalysis; ) along with their predecessors MERRAero (Modern-Era Retrospective analysis for Research and Applications Aerosol Reanalysis; ) and MACC-II (Monitoring Atmospheric Composition and Climate-II; ), respectively, and the JRAero (Japanese Reanalysis for Aerosol; ) and the NAAPS (Navy Aerosol Analysis and Prediction System; ) reanalyses. These global data sets have been produced at relatively coarse spatial resolution and by assimilating total aerosol optical depth (AOD). MERRA-2 is NASA's latest reanalysis. It has been produced at a spatial resolution of 0.58 latitude 0.6258 longitude, with 72 hybrid eta layers and by assimilating bias-corrected, neural-network-retrieved AOD from the Moderate Resolution Imaging Spectroradiometer (MODIS) and from the Advanced Very High Resolution Radiometer (AVHRR; over ocean only), as well as AOD from the Multi-angle Imaging SpectroRadiometer (MISR; over bright surfaces only) and from the Aerosol Robotic Network (AERONET) of Sun photometers. The latest reanalysis for atmospheric composition produced by the Copernicus Atmosphere Monitoring Service (CAMS) CAMSRA covers the period January 2003 to 2020 and is extended by adding 1 year each year. It has been produced at a spatial resolution of 80 and with 60 hybrid sigma–pressure levels in the vertical, by assimilating Collection 6 MODIS AOD produced with Deep Blue (DB; over land) and Dark Target (over land and ocean) algorithms and by additionally assimilating the Advanced Along-Track Scanning Radiometer (AATSR) AOD from 2003 to March 2012. JRAero is a global 5-year (2011–2015) reanalysis product constructed by the Meteorological Research Institute of the Japan Meteorological Agency. It has been produced assimilating the MODIS 6-hourly Level 3 AOD product provided by the US Naval Research Laboratory (NRL) and the University of North Dakota (UND) for the purpose of aerosol data assimilation and is based on the NASA operational MODIS Level 2 Collection 5 (Dark Target) AOD data set. This same data set has been previously used, together with MISR AOD, by NRL to produce the NAAPS 11-year (2003–2013) global gridded aerosol reanalysis product at a resolution of 1 latitude 1 longitude.
At the European level, air quality regional reanalyses (including dust) are produced by nine different operational systems and the associated multi-model ensemble through the CAMS regional services of the Copernicus programme. These models assimilate surface observations of , , and and particulate matter ( and ) operationally, and one of the models additionally assimilates AOD in research mode. These products are restricted to an extended European domain, which excludes major desert dust sources in Northern Africa and the Middle East. These reanalyses are produced as an improved product compared to the daily CAMS analyses, by using the latest validated observations, but we note they may not be consistent over the different production periods as they are not necessarily produced with the same model version.
We present here a regional reanalysis focusing specifically on desert dust aerosols that overcomes some of the potential limitations of existing global and regional reanalysis products. The data set was obtained by combining satellite remote sensing dust retrievals with a dynamical model. It spans a 10-year period, from 2007 to 2016; has a horizontal resolution of 0.1 latitude 0.1 longitude in a rotated grid; and has 3-hourly output. It provides a regional reconstruction of past dust conditions across Northern Africa, the Middle East and Europe, including the Mediterranean Sea and parts of central Asia, and the Atlantic and Indian oceans. The reanalysis consists of a set of dust geophysical variables (and their uncertainties) produced with a consistent model and data assimilation scheme, i.e. a frozen version of the code used during the whole simulation period, including emission schemes, input data sets and the retrieval algorithm for the assimilated observations. This ensures the production of a consistent data set avoiding the introduction of spurious trends that could be associated with model or assimilation changes.
We have adopted an ensemble-based data assimilation scheme for the estimation of the dust analysis. The use of ensemble model simulations has allowed for the estimation of flow-dependent background uncertainty, which is otherwise difficult to estimate due to the highly varying nature of dust concentrations. Assimilating AOD may not necessarily constrain individual aerosol components because the aerosol attribution in the analysis increments is typically determined by the model first guess . To at least partly overcome this limitation, we have directly assimilated dust retrievals, namely satellite-derived coarse-mode dust optical depth () at 550 over land surfaces, including bright surfaces such as desert areas. The assimilated retrievals are based on the MODIS DB algorithm , which uses measurements at different wavelengths with a different contrast between the surface and atmospheric aerosols. In particular, the algorithm capitalizes on the much lower surface reflectance at ultraviolet wavelengths than at longer wavelengths.
This new reanalysis data set can be used to support the provision of climate services and monitoring. It can also contribute to the development of dust impact mitigation strategies. For instance, the design of the reanalysis output fields has been tailored to the specific needs in three socio-economic sectors affected by mineral dust, which are air quality and health, energy production, and transport. In addition to the 3D fields of dust mass concentration, the reanalysis data set includes dust extinction and deposition variables, along with other variables associated with meteorology and radiation. In summary, we present here a regional dust reanalysis at an unprecedented resolution using for the first time specific dust retrievals over dust source regions and including grid-level uncertainty estimates.
The following sections describe the different aspects related to the production of the reanalysis: the dust modelling aspect, including the dust sources and emission schemes is outlined in Sect. ; the generation of ensemble perturbations to best characterize model uncertainty is explained in Sect. ; the assimilated dust retrievals and the data assimilation scheme are described in Sects. and , respectively. Additionally, Sect. describes the details of the reanalysis simulation settings, while Sect. describes the content and structure of the reanalysis data set. Section provides an evaluation of the column-integrated dust optical depth (DOD) and in terms of geographical distribution, the study of analysis increments, data assimilation inner diagnostics and comparison against independent observations. Information about the data set availability is provided in Sect. . Finally, conclusions are drawn in Sect. .
Table 1
Overview of the characteristics of the reanalysis.
Reanalysis configuration | |
---|---|
Domain, resolution and output | |
Data set length | 10 years (2007–1016) |
Output frequency | 3 (starting at 03:00 UTC) |
Geographical domain | regional |
Horizontal resolution | 0.1 latitude 0.1 longitude in a rotated grid |
Vertical resolution | 40 hybrid pressure–sigma layers interpolated to 15 standard pressure levels (1000–100 ) |
Top pressure | 50 |
Output variables | 6 (surface), 3 (total column), 3 (upper air) |
Uncertainty estimation | based on the spread in the MONARCH ensemble (12 members) |
Data assimilation (DA) | |
Assimilation algorithm | ensemble-based DA (4D-LETKF; ) |
Control vector | 3D mixing ratio of dust coarse bins (ranging from 1.2 to 20 in dust particle diameter) |
Assimilated observations | MODIS DB at 550 |
Observation satellite platform | NASA Aqua (EOS PM-1) |
Observational coverage | clear sky, snow-free, land and daytime |
Length of the assimilation window | 24 |
Chemical weather system | |
Aerosol model | MONARCH (Multiscale Online Nonhydrostatic AtmospheRe CHemistry model v1.0, |
with improvements; ) | |
Dust emission scheme | MB95 , G01 , K14 |
Particle size bins | eight bins with ranges 0.2–0.36, 0.36–0.6, 0.6–1.2, 1.2–2, 2–3.6, 3.6–6, 6–12 and 12–20 in particle diameter |
Particle size distribution at emission(before perturbation) | PSD as in |
Meteorological model | NMMB |
Meteorological initialization | ERA-Interim and MERRA-2 |
with ERA5 soil information | |
Radiation scheme | RRTM |
LW: OPAC RIs ; SW: mineralogy-based RIs | |
spherical particle shape | |
Microphysics scheme | Ferrier |
Surface layer | NMMB similarity theory |
Land surface scheme | Noah |
Turbulence scheme | Mellor–Yamada–Janjić |
Convection scheme | Betts–Miller–Janjić |
Ensemble generation | multi-parameter, multi-physics source perturbations, and multi-meteorological initial |
and boundary conditions |
The reanalysis has been produced using the Multiscale Online Nonhydrostatic AtmospheRe CHemistry model (MONARCH; ), which consists of advanced chemistry and aerosol packages coupled online with the Nonhydrostatic Multiscale Model on the B grid (NMMB; ). MONARCH is able to work across a wide range of spatial scales thanks to its unified non-hydrostatic dynamical core. In the global setup, MONARCH is run on a latitude–longitude grid, while the regional version used in this work runs on a rotated latitude–longitude grid. Different physics schemes are available in the NMMB to resolve turbulence, convection, soil, radiation and clouds. The exact configuration used in this work is reported in Table , where the key configuration settings are summarized for both modelling and data assimilation aspects.
MONARCH represents the atmospheric dust cycle including emission, transport and deposition along with dust–radiation interactions. A variety of dust emission schemes and configurations are available as described in , ranging from strongly simplified to physics-based parameterizations. Dust transport is produced by horizontal advection, solved with the Adams–Bashforth scheme; vertical advection, solved with the Crank–Nicolson scheme; and lateral diffusion, which follows the Smagorinsky non-linear approach. Furthermore, dust is vertically mixed by turbulent diffusion and deep and shallow convection. Sinks include gravitational settling, dry deposition through turbulent diffusion, and in-cloud and below-cloud scavenging from both stratiform and convective clouds. MONARCH follows a sectional approach for dust, i.e. the size distribution is decomposed into small size bins that range from 0.2 to 20 in diameter. The particle size distribution (PSD) at emission either can be chosen from a set of pre-defined PSDs or is calculated online, depending on the selected emission scheme. In this work, we have used a PSD of emitted dust over sources derived from .
A more detailed description of the dust module of MONARCH can be found in and , with the latter work including also advances developed after the start of the dust reanalysis production. Those recent developments were therefore not yet used in the present work for which a frozen model version is important. Below we provide further details on the configuration of the emission and radiation schemes used in this work.
2.1 Dust emission schemes
MONARCH contains multiple dust emission schemes, of which we used the following three to generate ensemble perturbations for the production of the reanalysis: (i) a scheme based on , hereafter called MB95, which is based on saltation flux and soil texture and was combined with the topographic source mask from as described in ; (ii) the GOCART dust emission scheme from based mainly on a topographic source function, hereafter called G01; (iii) a scheme based on brittle fragmentation by saltation as in , hereafter called K14. The location of dust sources is identified by a climatology of frequency of occurrence (FoO) of DOD greater than 0.2 derived from MODIS DB Collection 6 at the resolution of 0.1 latitude 0.1 longitude ( – see their Sect. 4.3.1) with a minimal threshold for FoO equal to 0.05, below which there is no emission. Surface roughness is accounted for in the dust emission calculation using the drag partition parameterization from with input from MODIS Collection 5 monthly leaf area index for the specific year of simulation from 2007 to 2015 and from a climatology for 2016, combined with a static roughness length for arid regions as described in . The parameter in the drag partition follows . The USGS climatological database for vegetation is used by the meteorology and land surface scheme. A soil moisture correction is used for MB95 and K14 as in with a revised scaling factor as in and . G01 uses the default GOCART soil moisture correction, which is based on as described in , and a threshold friction velocity as described in .
2.2 Radiation and dust optical properties
In MONARCH, dust is coupled online with the RRTMG radiation scheme, which accounts for short-wave (SW) absorption and scattering and long-wave (LW) absorption . The input dust optical properties (extinction efficiency, single-scattering albedo and asymmetry factor) for each particle size bin and wavelength are based on refractive indices (RIs) that account for the variation in mineralogical composition by size in the SW and derived from the OPAC data set in the LW. Optical properties are calculated using Mie scattering theory assuming that dust is spherical despite its well-known non-sphericity . Although MONARCH now allows accounting for the effect of dust non-sphericity upon the optical properties , this option was not ready by the start of the reanalysis production.
Figure 1
Example of assimilated observations for 9 July 2012: retrieved from the Aqua MODIS DB Level 2 products (Collection 6; a) and the associated observation uncertainty used in the assimilation algorithm (b).
[Figure omitted. See PDF]
To calculate the mineralogy-based size-dependent RIs in the SW, we applied the multi-component Maxwell Garnett theory to internal mixtures of eight dominant dust minerals derived from the soil mineralogical atlas of . The single-mineral RIs were taken from . The mineral fractions in each size bin are estimated for each of the 28 soil types considered in the atlas based on brittle fragmentation theory . For each size bin and wavelength, we finally retain the median real and imaginary RIs across the 28 soil types. In the visible band, the obtained median RIs compare well with recent chamber-based retrievals and in situ aircraft measurements , as shown in .
The dust–radiation coupling allows the computation of the direct radiative effect at each radiation time step with a simple double-call approach. We also calculate direct normal irradiance (DNI) and global horizontal irradiance (GHI) at the surface, under all-sky conditions, from downward fluxes in ultraviolet–visible–near-infrared bands of the model. While GHI includes direct and diffuse beams collected by a horizontal unit surface, DNI accounts for the direct beam hitting a normal surface. These variables are useful for applications in the context of solar energy production.
3 Generation of ensemble perturbationsWe adopted an ensemble-based data assimilation scheme to estimate dust. Hence model uncertainty, expressed as background error covariance in the data assimilation algorithm, is estimated from the realizations of the dust fields in an ensemble of MONARCH model calculations. The use of an ensemble of model simulations allows the estimation of a flow-dependent background uncertainty that would otherwise be difficult to estimate due to the highly variable nature of dust concentrations. We generated a 12-member ensemble using different meteorological initial and boundary conditions and dust emission schemes, along with additional perturbations in the model emission parameters. Such perturbations aim at representing the model uncertainty, mainly in dust emission, which is one of the major contributors to model error , but also in other aspects of the dust cycle where meteorology has a role, such as transport and deposition. The characteristics of each ensemble member are listed in Table S1 in the Supplement and described below.
Figure 2
Maps of counts of assimilated observations for the whole period (2007–2016; top row) and for the different seasons (DJF, MAM, JJA, SON; rows 2 to 5) of the 10-year period.
[Figure omitted. See PDF]
The benefit of combining meteorological and aerosol source perturbations is shown in and . The meteorology in our reanalysis is re-initialized every day using global reanalyses. Our ensemble uses two different meteorological reanalyses as initial conditions at the start of every daily run (at 00:00 UTC) and as boundary conditions every 6 . ERA-Interim is used in six ensemble members, and MERRA-2 together with ERA5 soil information is used in the remaining six members.
Experiments conducted in showed that using different dust emission schemes provides a better characterization of the background covariance than a single scheme with parameter perturbations due to the large variability in the modelled emissions. The ensemble uses three different emission schemes briefly introduced in Sect. , namely MB95
We have used for assimilation an innovative DOD data set derived from the MODIS DB aerosol products (Collection 6), which covers all cloud-free and snow-free land surfaces. DB aerosol retrievals are available over areas not easily covered by other observational data sets, e.g. very bright reflective surfaces such as deserts, and are therefore particularly relevant for dust applications. The MODIS Dark Target product, for example, has a limited coverage over land since the retrieval algorithm assumes low surface albedo. The DB algorithm uses top-of-the-atmosphere reflectances at 412 and 470 , and, in the presence of a heavy dust load, also at 650 . It exploits the fact that, over most surfaces, a darker surface and stronger aerosol signal are seen in the blue wavelength range than at longer wavelengths. The quality of the MODIS DB AOD product is improved in Collection 6 compared to previous releases, as shown by the work of and , based on Level 2 and Level 3 retrievals, respectively. Furthermore, a recent study by showed that DB AOD from MODIS (on board the Aqua satellite) is one of the best products when compared to other satellite products.
More specifically, we have assimilated retrieved from MODIS DB Level 2 aerosol products as described in and . The generation of the dust retrievals includes the different steps of formatting, dust filtering and retrieval. First, aerosol products such as AOD, single-scattering albedo and the Ångström exponent are interpolated to a regular grid of 0.1 latitude 0.1 longitude using the algorithm described by . The DOD is then derived from AOD following the methods of with adaptions to MODIS Collection 6 aerosol products. To separate dust from other aerosols, two variables are used: the Ångström exponent, which is highly sensitive to particle size , and a single-scattering albedo at 412 less than 0.95 for dust due to its absorption of solar radiation . Subsequently, an empirical continuous function relating the Ångström exponent to fine-mode AOD
Since the retrievals are based on visible reflectances, their availability is limited to the daytime only. The MODIS instrument is on board two NASA polar-orbiting satellites, namely Aqua and Terra. However, we have considered for assimilation only retrievals based on measurements from MODIS on board the Aqua platform. The equatorial crossing local time of the Aqua satellite is at 13:30 in an ascending orbit. In our 3-hourly discretization of the assimilation window, the assimilated observations are associated with the time slot (or interval) centred at 12:00 UTC and, due to the 4D extension of the implemented LETKF scheme, affect the whole assimilation window.
Figure 3
Schematic of the 24 assimilation window for the production of the reanalysis. The ensemble member analyses are used to initialize the corresponding ensemble member first guess in the subsequent simulation/state estimation window.
[Figure omitted. See PDF]
We have used 0.07 0.075 to characterize the observation uncertainty in the assimilated observations, following the linear model of previous studies with the coefficients adjusted for our application by inflating the uncertainty for low values, which were otherwise detrimental for the analysis. We have assumed a diagonal observation error covariance matrix, i.e. uncorrelated error between the different retrievals. Observation coordinates were pre-processed to be mapped on the rotated longitude–latitude regional grid of MONARCH. Figure shows an example of the extent of the daily observational coverage on a given date (9 July 2012) together with the associated observational uncertainty.
Maps of observation counts are shown in Fig. for the whole reanalysis period (top row of Fig. ) and for the different seasons (rows 2 to 5 of Fig. ), namely the winter seasons represented by December, January and February (DJF); the spring season represented by March, April and May (MAM); the summer season represented by June, July and August (JJA); and the autumn season represented by September, October and November (SON). As expected, there is a higher number of dust retrievals closer to sources than far from them. The total number of retrievals is bigger in the SON and JJA seasons than in the other seasons. During the boreal winter the number of retrievals inland from the Gulf of Guinea increases compared to other times of the year due to transport of dust by northeasterly harmattan winds. The number of dust retrievals decreases in the north of Europe and Asia in the DJF season as MODIS DB covers only snow-free surfaces. Yearly observation counts are consistent throughout the whole period (see Fig. S1 in the Supplement).
5 Data assimilation algorithmThe reanalysis was produced using a local ensemble transform Kalman filter (LETKF) data assimilation scheme (; ; ; ) coupled to the MONARCH ensemble. We have used an implementation of the LETKF scheme with four-dimensional extension (4D-LETKF) as described in in order to estimate the dust analysis over a 24 assimilation window. The overall scheme implements an iterative approach consisting of a forward simulation of the MONARCH ensemble for 24 and a state estimation step. The two steps are coupled at each iteration. The state estimation step is an execution of the LETKF which combines information from the dust observations and the model ensemble simulations. The forward simulation of the MONARCH ensemble is named first guess (or background) to indicate a simulation initialized from an analysis and thus incorporates information from past observations. As a result of the estimation step, the analysis is estimated at each assimilation window using both concurrent and past observations.
The LETKF is well suited to computationally demanding calculations such as the estimation of a high-resolution analysis carried out in this work. The analysis at each model grid point can be calculated independently, and at each grid point only observations within a certain distance are assimilated. Furthermore, the use of a dynamic characterization of model background uncertainty, through ensemble forward simulations, is well suited for highly varying dust fields. A detailed description of the scheme can be found in . Below we discuss the basic concepts behind the LETKF algorithm.
Consider a state vector of the dynamic variables of a system, in our case the dust mass mixing ratio. The mean analysis increment at a grid point is estimated as a linear combination of the background ensemble perturbations :
1 where we use the superscripts a and b to denote the analysis and background state vector, respectively, and where the th column of the matrix is , with ensemble members (12 in our case), i.e. the difference between the th ensemble member and the ensemble mean . is termed the “weight” vector, specifying what linear combination of the background ensemble perturbations is added to the background mean to obtain the analysis ensemble. The weight vector is given by 2 where is the background ensemble perturbation matrix in observation space (or background observation ensemble perturbation matrix); is the observation error covariance matrix, which we assume is diagonal; is the identity matrix; is the vector of observations; and is the mean background observation ensemble. The background observation ensemble is obtained applying the observation operator to the ensemble members ; i.e. .
The 4D extension of the algorithm is coded such that background observation means and perturbation matrices are formed at the various time slots when the observations are available; then they are concatenated to form a combined background observation mean and perturbation matrix , where the time slots are the time intervals into which the assimilation window is split. and are used for the calculation of a weight vector using the standard LETKF; i.e. we calculate a single based on all innovations throughout the day. This same is then applied to the state vector at different times throughout the assimilation window.
Spatial covariance localization can be applied in the LETKF algorithm through localization; i.e. the localization is performed in the observation error covariance matrix, making the influence of an observation on the analysis decay gradually towards zero as the distance from the analysis location increases. The use of spatial localization reduces the effect of spurious long-range covariances due to sampling errors produced by a low dimensionality of the ensemble. To achieve this, the observation error is divided by a distance-dependent function that decays to zero with increasing distance: , where dist is the distance in the grid space between an observation and the model grid and is a horizontal localization factor. The localization factor was set to 15; hence the observation influence practically fades to zero before 30 model grid points away from the observation location (in the horizontal plane).
The control variable is formulated in terms of the total mixing ratio over the five model prognostic variables (corresponding to different dust particle size bins) used to simulate coarse dust in MONARCH. Therefore an observation operator is needed to map the ensemble mean control vector into the observation space. The observation operator has two components: (i) a spatial interpolation of the model simulation to the observation location, which is done at the observation longitude and latitude, and (ii) the calculation of simulated at the wavelength of 550 which is calculated using the five coarse model size bins ranging from 1.2 to 20 in dust particle diameter. The analysis of the model's fine dust fraction (i.e. the three model size bins from 0.2 to 1.2 in dust particle diameter) is estimated proportionally to the change (due to observation assimilation) of the coarse fraction. This choice is motivated by the fact that observations do not carry information about either fine dust particles or particle size distribution. Hereafter, DOD and refer to the wavelength of 550 .
6 Domain, resolution and other simulation settingsThis section presents the key settings for the modelling, observational and data assimilation aspects that have been described in Sects. to and that are summarized in Table . The reanalysis extends over the period 2007–2016 and covers a regional domain centred around Northern Africa, the Middle East and Europe (hereafter called the NAMEE region) that also includes parts of central Asia and the Atlantic and Indian oceans. The domain has a horizontal resolution of 0.1 latitude 0.1 longitude in a rotated grid and 40 hybrid pressure–sigma model layers in the vertical. The model top was set to 5000 . This domain configuration is used operationally to deliver daily forecasts at the World Meteorological Organization Barcelona Dust Regional Center (
Table 2
List of reanalysis variables. For each variable the following ensemble statistics are calculated and archived: ensemble mean, standard deviation, max and median. n/a – not applicable.
Variable description (name in archive) | Unit | Spatial dimension | Description ofdust particle size | First guess | Analysis | |
---|---|---|---|---|---|---|
Dust concentration (concdubin1-8) | 3D | eight bins | ✓ | ✓ | ||
Direct normal irradiance (dni) | 2D | n/a | ✓ | |||
Accumulated dry deposition over the previous 3 (drydu) | 2D | eight bins | ✓ | |||
Dust extinction coefficient at 550 (ec550du) | 3D | total | ✓ | ✓ | ||
Global horizontal irradiance (ghi) | 2D | n/a | ✓ | |||
Dust load (loaddu) | 2D | eight bins | ✓ | ✓ | ||
Dust optical depth at 550 (od550du) | unitless | 2D | total | ✓ | ✓ | |
Coarse dust optical depth at 550 (od550ducoarse) | unitless | 2D | total | ✓ | ✓ | |
Dust surface concentration (sconcdubin1-8) | 2D | eight bins | ✓ | ✓ | ||
Dust surface extinction coefficient (sec550du) | 2D | total | ✓ | ✓ | ||
Accumulated wet deposition over the previous 3 (wetdu) | 2D | eight bins | ✓ | |||
Height of pressure level above sea level () | 3D | n/a | ✓ |
Averaged DOD of first guess (fg), analysis (an) and analysis increments (an-fg) for the full period (2007–2016), for different seasons (DJF, MAM, JJA, SON) and for individual years.
Period | Mean fg DOD | Mean an DOD | Mean analysis |
---|---|---|---|
increments | |||
2007–2016 | 0.1066 | 0.1 | 0.0066 |
DJF | 0.0806 | 0.0781 | 0.0025 |
MAM | 0.1353 | 0.1268 | 0.0084 |
JJA | 0.1364 | 0.1261 | 0.0103 |
SON | 0.0734 | 0.0681 | 0.0053 |
2007 | 0.107 | 0.0994 | 0.0076 |
2008 | 0.1185 | 0.1117 | 0.0068 |
2009 | 0.1049 | 0.0993 | 0.0056 |
2010 | 0.1091 | 0.1041 | 0.005 |
2011 | 0.1056 | 0.0994 | 0.0063 |
2012 | 0.1113 | 0.1055 | 0.0058 |
2013 | 0.0986 | 0.091 | 0.0076 |
2014 | 0.0943 | 0.0879 | 0.0064 |
2015 | 0.1125 | 0.1046 | 0.0079 |
2016 | 0.1044 | 0.0968 | 0.0075 |
The model runs were conducted using a dynamics time step of 20 . Lateral diffusion is called every time step; advection every 2 time steps; turbulence, surface layer, dust emission, sedimentation and dry deposition routines every 4 time steps; moist convection, microphysics and wet deposition every eight time steps; and short- and long-wave radiation routines every 180 time steps. The MONARCH ensemble of forward simulations was run daily at 00:00 UTC during 24 , which was used as the first guess for the data assimilation. Simulation outputs are provided every 3 (03:00, 06:00, 09:00, 12:00, 15:00, 18:00, 21:00 and 00:00 UTC), which is also the time resolution of the reanalysis product. Figure shows the scheme of the 24 assimilation window for the production of the reanalysis where each ensemble member forward simulation is initialized at 00:00 UTC using the dust analysis produced in the previous window.
Simulations were run without inflating the background or analysis covariance errors during the assimilation cycle. A quality control has been applied as in that rejects observations by a first-guess departure check (observations further than 1.4, in , from the first guess are rejected). This quality control is applied since the observations have not been corrected before assimilation for possible systematic biases. After the estimation of total dust coarse mixing ratio analysis, the analysis increments are partitioned among the dust coarse size bins according to their fractional contribution to the total coarse mixing ratio in the forward simulation step (i.e. before assimilation).
A spin-up period was necessary for the soil variables that need a longer period to adjust. We have run a 1-year spin-up with a two-member experiment, each of them initialized using either MERRA-2 or ERA-Interim meteorology with ERA5 soil information. Furthermore, a 2-month spin-up period was needed for the ensemble without data assimilation, to have a good representation of the ensemble spread everywhere in the atmospheric domain.
6.1 Ensemble calibrationMONARCH uses a globally uniform, empirically constrained tuning (or calibration) factor for the total emitted dust mass, referred to as . This factor varies according to the specified configuration settings for the simulation. In particular, it depends on the emission scheme and the meteorological initial and boundary conditions used to initialize the simulation. We calibrated six free-running experiments, which cover all the different combinations between the emission scheme and meteorological conditions. The calibration factors were obtained by rescaling initial values for the calibration factors, namely , by the ratio between the MODIS DB mean and the ensemble free-run mean calculated over the whole domain; i.e.
3 where indicates an ensemble member. We have repeated the estimation twice where the second simulation re-run has used the calibration factors estimated from the first run. The final estimated calibration factors for each of six ensemble members are reported in Table S2.
7 Reanalysis product descriptionThe reanalysis data set consists of three-dimensional (3D) and two-dimensional (2D) variables (see Table ). The 3D, or upper-air, variables include dust mass concentration [] for each dust size bin, the dust extinction coefficient at 550 [] integrated over all size bins and the height of the pressure level above sea level []. The 2D variables are either surface fields or total column fields. The 2D variables for each dust size bin include dry and wet accumulated dust deposition over the previous 3 [] and instantaneous total column dust load [], dust mass surface concentration [], DOD [unitless] and [unitless] at 550 . The set of archived 2D variables is completed by the surface extinction coefficient at 550 [], direct normal irradiance [] and global horizontal irradiance []. These variables have been used to produce dust-relevant information for different sectors and related validation exercises . For example, a dust- field has been derived from the 2D, bin-resolved dust mass surface concentration for air quality applications. This field will be used to evaluate the ability of the reanalysis to reproduce dust concentration values at the ground . Over Europe, the latter will be extracted from measured values following a procedure similar to that described . Furthermore, visibility data from 3D dust-extinction coefficient fields have been used for aviation applications , while soiling index based on wet and dry dust deposition has been used to develop products for solar energy production .
Figure 4
Maps of mean 3-hourly first guess of wet and dry accumulated (over the previous 3 ) dust deposition [] and analysis of total column dust load [] and of the dust surface extinction coefficient at 550 [] calculated for the whole period (2007–2016). Model fields are the ensemble mean.
[Figure omitted. See PDF]
Both analysis and first-guess fields are available for the variables that are diagnosed from the state vector. As mentioned earlier, the first guesses are model forward simulations initialized with an analysis. When available, the analysis field is the recommended output for that variable. A set of ensemble statistics is calculated and archived for each output variable, namely the ensemble mean, standard deviation, maximum and median. The spread among the ensemble members, represented by the standard deviation with respect to the ensemble mean, can be interpreted as a measure for the uncertainty in the mean estimates. Figure shows the ensemble mean over the whole reanalysis period for the analysis or first guess of some of the 2D variables. While model fields have been produced on 40 vertical levels, the data are stored on 15 standard pressure levels between 1000 and 100 (i.e. 1000, 975, 900, 850, 750, 700, 600, 500, 400, 350, 300, 250, 175, 150, 100 ), which were defined taking into account regulatory standards in the aviation sector
The reanalysis data set is structured into individual Network Common Data Form (NetCDF) files per variable and type of ensemble statistics. Further details on the file structure of the data set are reported in Sect. , while the naming convention for the data set files and folders is explained in Appendix .
Figure 5
A number of desert, arid and semi-arid regions of interest for the description of the dust reanalysis. The underlying dust field is the mean 3-hourly DOD analysis calculated for the whole reanalysis period (2007–2016).
[Figure omitted. See PDF]
Figure 6
Maps of mean 3-hourly DOD first guess (first column), analysis (second column) and analysis increments (third column) calculated for the whole period (2007–2016; top row) and for different seasons (DJF, MAM, JJA, SON; rows 2 to 5). Model fields are the ensemble mean.
[Figure omitted. See PDF]
Figure 7
Maps of mean 3-hourly first guess (first column), analysis (second column) and MODIS DB assimilated observations (third column) calculated for the whole period (2007–2016; top row) and for different seasons (DJF, MAM, JJA, SON; rows 2 to 5). Model fields are the ensemble mean and are collocated with the observations.
[Figure omitted. See PDF]
8 DOD evaluationIn this section we validate the reanalysis DOD or in terms of data assimilation inner diagnostics (analysis increments and statistics of departures from assimilated observations) and verify it against independent ground-based observations. We also discuss the DOD spatial and temporal patterns over the reanalysis domain and period. Figure highlights the location of major dust source areas that will be used in the discussion. The verification of DOD and against long-term ground-based observations across the domain is a first step towards a more comprehensive evaluation of the reanalysis data set that is planned in follow-up papers , which include the comparison against independent sets of in situ, column-based and profile retrievals.
Figure 8
Maps of mean 3-hourly for first-guess departures (first column), analysis departures (second column), standard deviation of first-guess departures (third column) and standard deviation of analysis departures (fourth column) calculated for the whole period (2007–2016; top row) and for different seasons (DJF, MAM, JJA, SON; rows 2 to 5). Model fields are the ensemble mean and are collocated with the observations.
[Figure omitted. See PDF]
8.1 DOD geographical distributionFigure shows the ensemble annual and seasonal mean DOD for the first guess (left column) and the analysis (central column) during the whole reanalysis period. In agreement with observations, the highest DOD values are placed over the major emission areas of the domain, in particular in the Bodélé Depression in Chad, the Erg Chech in Algeria, and the El Djouf between Mauritania and Mali, followed by the Arabian Desert; the Taklamakan Desert in northwest China; and the smaller areas of the Grand Erg Oriental in Algeria, the Grand Sand Sea between Libya and Egypt; and the Kharan Desert in southwestern Pakistan. Table reports the averaged DOD of first guess, analysis and analysis minus first guess (analysis increments) when calculated for the whole domain for the full period, for different seasons (DJF, MAM, JJA, SON) and for individual years.
The decadal mean analysis DOD (top row of Fig. ) is generally smaller than the first-guess DOD except in the Taklamakan and Thar deserts and in areas where the mean DOD is below 0.3. Therefore, on average, MONARCH emissions are likely too strong for the configurations used, although a potentially too weak deposition cannot be discarded. The latter is strongly dependent upon the emitted size distribution that evolves during transport.
Seasonal changes in the geographical distribution of the analysis mean DOD (rows 2 to 5 of Fig. ) are consistent with well-known patterns : (i) dust peaks everywhere during spring and summer, in particular, across the Taklamakan Desert during spring when more dust-generating cold fronts arrive in the area; (ii) dust from the south Sahara and Sahel is preferentially transported by northeasterly harmattan winds towards the Gulf of Guinea in winter and spring; (iii) the dust plume that originated in western Africa and is transported across the tropical North Atlantic is shifted towards northern latitudes in summer along with the Intertropical Convergence Zone (ITCZ; ); (iv) dust is strongly mobilized on the Arabian Peninsula and in the Tigris–Euphrates Basin in summer by the north-northwesterly shamal winds; (v) the lowest overall DOD is simulated everywhere in autumn.
8.2 DOD analysis increments
Figure also shows the difference between DOD analysis and first guess (namely analysis increments; right column) averaged over the full reanalysis period (top row). Non-zero systematic analysis increments are to be interpreted as systematic corrections to the model simulations and can serve as a proxy for model bias. By applying these corrections, the analysis improves the underlying model. The patterns of these systematic corrections vary with season and geographical location. While over the entire domain the mean analysis and first guess are comparable, the biggest systematic negative corrections (removing mass from the atmosphere) are linked to overestimation of sources' strengths in the Bodélé Depression in Chad; in the Saudi Arabia lowlands; and in the Balochistan region of southwestern Asia that extends over Iran, Afghanistan and Pakistan and contains, for example, the Kharan Desert. Negative mean increments are also present but to a lesser extent in other arid and semi-arid areas such as the Erg Chech in Algeria, the Great Sand Sea in Libya, the Nubian Desert in Sudan and eastwards of the Caspian Sea. Positive mean increments calculated for the whole reanalysis period are less widespread than the negative increments. The strongest values are over the Thar Desert, in the northern part of Syria, over a long stretch inland from the Mediterranean Sea in the north of Africa, and in the desert of El Djouf between Mauritania and Mali. All in all, as expected, the largest positive or negative analysis increments correspond to areas with more dust load, i.e. to source regions and their vicinity.
The patterns of the mean increments depend upon the season (see rows 2 to 5 of Fig. ). These patterns are clearly linked to the seasonal changes in dust activities in the different regions, as mean increments are, in absolute value, higher in the presence of high mean DODs compared to low DOD values. The areas that show the strongest seasonality with respect to the analysis increments are the Bodélé Depression and the Arabian and Taklamakan deserts. The overestimation of the Bodélé source strength in the first guess is more pronounced in winter and spring. In spring the emissions from the Taklamakan Desert, Syria and the northern part of the Arabian Desert are clearly underestimated, while in summer strong negative increments are present all over the Arabian Desert. Wide areas in the Sahara are affected by negative increments in the spring and summer. The Balochistan region and the Thar Desert show negative and positive increments, respectively, throughout the year, but their magnitudes are greater in spring and summer.
The patterns of the increments are consistent among the different years (Figs. S2 and S3) and vary mostly in the amplitude of the mean corrections, although there are some exceptions. Positive increments over the Thar Desert, northern Syria and the north of the Arabian Desert mainly appear in the first part of the reanalysis, between 2007 and 2012, in contrast to the small positive or even negative increments in the case of the Arabian Desert in the subsequent years. Strong negative increments east of the Caspian Sea are applied mainly through 2007 to 2010. Those yearly differences suggest changes, for example in land use, that are not captured by the model. Negative corrections in the west of the Sahara are more widespread in 2007 and 2008 than in other years due to the higher mean DOD during those 2 years.
8.3 Statistics of departures from assimilated observations
We compare here the reanalysis with the assimilated observations. Figure shows the for the observation-collocated ensemble mean first guess and analysis and for the assimilated observations averaged over the full reanalysis period (top row of Fig. ) and over the DJF, MAM, JJA and SON seasons (from the second to the fifth row of Fig. ). Table reports the corresponding values averaged over the whole domain for the full period, for different seasons and for individual years. By visual inspection, the analysis is closer to the assimilated observations in all the time periods considered, which constitutes a good sanity check for the assimilation scheme. This is also confirmed when the averages are calculated for individual years of the reanalysis period (Figs. S4 and S5). The seasonality in the model simulations closely resembles that in the observations, with MAM and JJA being the most active dust seasons.
Table 4
Averaged of observation-collocated first guess (fg), observation-collocated analysis (an) and assimilated MODIS DB retrievals for the full period (2007–2016), for different seasons (DJF, MAM, JJA, SON) and for individual years.
Period | Mean fg | Mean an | Mean MODIS DB |
---|---|---|---|
2007–2016 | 0.1914 | 0.1685 | 0.1912 |
DJF | 0.1445 | 0.1374 | 0.1573 |
MAM | 0.2323 | 0.2074 | 0.2337 |
JJA | 0.2452 | 0.2073 | 0.2356 |
SON | 0.1427 | 0.1228 | 0.1394 |
2007 | 0.1905 | 0.1644 | 0.1858 |
2008 | 0.2108 | 0.1877 | 0.2114 |
2009 | 0.1892 | 0.1682 | 0.1921 |
2010 | 0.1947 | 0.1772 | 0.202 |
2011 | 0.1894 | 0.1679 | 0.192 |
2012 | 0.195 | 0.1752 | 0.1989 |
2013 | 0.1832 | 0.1577 | 0.1771 |
2014 | 0.1768 | 0.1546 | 0.176 |
2015 | 0.1948 | 0.1697 | 0.1929 |
2016 | 0.1896 | 0.1621 | 0.1836 |
Statistics (mean and SD) of departures of first guess (fg) and analysis (an) from assimilated observations calculated for the full period (2007–2016), for different seasons (DJF, MAM, JJA, SON) and for individual years. The number of observation counts is also reported.
Period | Observation | Mean of fg | Mean of an | SD of fg | SD of an |
---|---|---|---|---|---|
counts | departures | departures | departures | departures | |
2007–2016 | 4.373 10 | 0.0002 | 0.0227 | 0.1841 | 0.0985 |
DJF | 1.008 10 | 0.0128 | 0.0198 | 0.1502 | 0.0827 |
MAM | 1.033 10 | 0.0014 | 0.0262 | 0.212 | 0.1125 |
JJA | 1.157 10 | 0.0096 | 0.0283 | 0.2159 | 0.112 |
SON | 1.176 10 | 0.0033 | 0.0166 | 0.1357 | 0.0744 |
2007 | 4.522 10 | 0.0047 | 0.0214 | 0.1863 | 0.0961 |
2008 | 4.398 10 | 0.0006 | 0.0237 | 0.1928 | 0.1023 |
2009 | 4.274 10 | 0.0029 | 0.0239 | 0.1858 | 0.0984 |
2010 | 4.464 10 | 0.0073 | 0.0248 | 0.1828 | 0.1011 |
2011 | 4.43 10 | 0.0026 | 0.0240 | 0.1825 | 0.1008 |
2012 | 4.417 10 | 0.0039 | 0.0237 | 0.1913 | 0.1554 |
2013 | 4.299 10 | 0.006 | 0.0194 | 0.1767 | 0.0969 |
2014 | 4.41 10 | 0.0008 | 0.0214 | 0.1762 | 0.0924 |
2015 | 4.348 10 | 0.0019 | 0.0232 | 0.1924 | 0.1047 |
2016 | 4.17 10 | 0.0059 | 0.0215 | 0.1823 | 0.0952 |
Figure shows the mean (first and second column of Fig. ) and standard deviation (third and fourth column of Fig. ) of the first-guess and analysis departures (respectively) from assimilated observations averaged over the full reanalysis period (top row of Fig. ) and over the different four seasons (from the second to the fifth row of Fig. ). The corresponding values averaged over the whole domain are reported in Table , together with the number of observation counts and statistics calculated for individual years. The departure statistics, in particular the reduction in the standard deviation of the analysis departures compared to the first guess everywhere in the domain of interest, prove the consistency of our assimilation procedure. This is also the case on a seasonal and yearly basis (Figs. S6 and S7). With respect to the mean departures, the positive mean departures (model simulation minus observations) decrease considerably in the analysis compared to the first guess, while some of the negative mean departures remain unchanged in specific regions or seasons. The latter is the case for example in Europe and Russia, when considering the full reanalysis period or the different seasons. The aforementioned regions see on average much lower values than the rest of the domain and are analysed less efficiently. This is likely due to the ensemble not having a sufficient spread for low simulated concentrations. A similar issue was previously identified in other assimilation systems
Verification of DOD and against AERONET
We compare here the reanalysis DOD and with independent observations that have not been used in the assimilation process. We employed products from the Aerosol Robotic Network (AERONET) of ground-based Sun photometers .
Table 6Dust filters applied to the AERONET retrievals.
Filter name | Dust condition | Dust presence | Non-dust presence |
---|---|---|---|
DOD-dust1 | Pure dust | DOD AOD when AE 0.40 | – |
DOD-dust2 | Pure dust | DOD AOD when AE 0.60 | – |
DOD-mixed1 | Mixed dust | DOD AOD when AE 0.75 | – |
DOD-mixed2 | Mixed dust | DOD AOD when AE 0.75 | DOD 0 when AE 1.2 |
Pure dust | – |
Figure 9
Spatial distribution of the AERONET sites used in this study. The different colours indicate the sub-regions considered in the discussion of the results: northwestern Africa (green), the Middle East (red) and the Mediterranean (blue). When validating the NAMEE region, all the sites are considered, including the ones labelled “Others” (black). © OpenStreetMap contributors 2022. Distributed under the Open Data Commons Open Database License (ODbL) v1.0; © CARTO (
[Figure omitted. See PDF]
Figure 10
Density scatter plots of the reanalysis (MOD) DOD and versus AERONET (OBS) dust-filtered AOD, DOD-mixed2 (a–c) and (d–f), during the whole reanalysis period (2007–2016). The results are calculated for the NAMEE domain and for different time basis: 3-hourly (a, d), daily (b, e) and monthly (c, f). The dust filters applied to the AERONET observations are described in Table . The bin size is 0.01.
[Figure omitted. See PDF]
Figure 11
Density scatter plots of the reanalysis (MOD) DOD and versus AERONET (OBS) dust-filtered AOD, DOD-mixed2 (a–c) and (d–f), during the whole reanalysis period (2007–2016) and on a 3-hourly basis. The results are calculated for three different sub-regions of the reanalysis domain: northwestern Africa (a, d), the Middle East (b, e) and the Mediterranean (c, f). The dust filters applied to the AERONET observations are described in Table . The bin size is 0.01.
[Figure omitted. See PDF]
Table 7Verification statistics (, RMSE, MB and MFB) and number of samples (NDATA) for the reanalysis versus AERONET AODs for the entire period (2007–2016) and NAMEE region and for northwestern Africa, the Middle East and Mediterranean regions. AERONET version 3, cloud-screened, 3-hourly, dust-filtered AOD and comprise the reference. The definition of each of the DOD filters is in Table .
DOD-dust1 | DOD-dust2 | DOD-mixed1 | DOD-mixed2 | ||
---|---|---|---|---|---|
NAMEE region | |||||
NDATA | 68 493 | 99 821 | 122 145 | 260 622 | 242 582 |
0.74 | 0.75 | 0.76 | 0.82 | 0.81 | |
RMSE | 0.25 | 0.22 | 0.21 | 0.15 | 0.09 |
MB | 0.10 | 0.10 | 0.10 | 0.04 | 0.03 |
MFB | 0.43 | 0.51 | 0.58 | 0.79 | 0.91 |
Northwestern Africa | |||||
NDATA | 40 240 | 51 299 | 56 882 | 59 295 | 31 553 |
0.76 | 0.77 | 0.77 | 0.78 | 0.81 | |
RMSE | 0.26 | 0.25 | 0.24 | 0.23 | 0.15 |
MB | 0.10 | 0.10 | 0.10 | 0.09 | 0.05 |
MFB | 0.35 | 0.40 | 0.43 | 0.33 | 0.50 |
Middle East | |||||
NDATA | 15 281 | 23 659 | 29 826 | 41 123 | 49 526 |
0.66 | 0.67 | 0.68 | 0.72 | 0.73 | |
RMSE | 0.24 | 0.22 | 0.21 | 0.19 | 0.13 |
MB | 0.10 | 0.09 | 0.09 | 0.04 | 0.01 |
MFB | 0.27 | 0.29 | 0.30 | 0.33 | 0.04 |
Mediterranean Basin | |||||
NDATA | 9415 | 17 593 | 24 487 | 89 887 | 90 451 |
0.57 | 0.61 | 0.64 | 0.75 | 0.72 | |
RMSE | 0.21 | 0.18 | 0.17 | 0.09 | 0.07 |
MB | 0.12 | 0.11 | 0.10 | 0.02 | 0.03 |
MFB | 0.68 | 0.78 | 0.88 | 1.22 | 1.14 |
We used AERONET version 3 quality-assured data. On the one hand, the modelled at 550 is compared with coarse-mode AOD retrievals at 500 from the spectral de-convolution algorithm
We focus our verification on the NAMEE region. We used data from 140 AERONET stations (Fig. ; see also Tables S3 to S8 for the list of AERONET sites used), which include all the available AERONET sites in the NAMEE domain providing observations during the period of 2007–2016, with the exception of those sites that are at high altitudes (i.e. altitudes greater than about 1850 above sea level). Results are presented for different sub-regions, namely the Middle East, northwestern Africa and the Mediterranean, and for all available AERONET stations including those sites outside the three above-mentioned regions, and they are depicted in Fig. . Model values are ensemble mean analysis fields.
AERONET measurements are nominally taken at 15 intervals. Here we average observations within 30 of the 3-hourly model output times. These averaged observations are used to evaluate the model on a 3-hourly, daily and monthly basis. For the daily and monthly average evaluation, only coincident 3-hourly model output and observations are used. We use verification statistics such as the Pearson correlation coefficient (), mean bias (MB), root mean square error (RMSE) and mean fractional bias (MFB) (see Appendix ) to measure the skill of the model when performing diagnostic analyses of DOD and where AERONET sites are located.
8.4.2 Comparison with 3-hourly, daily and monthly reference data
Overall, the dust reanalysis can reproduce the 3-hourly, daily and monthly observed variability with Pearson correlation coefficients ranging from 0.74 and 0.82, depending on the dust filter, for 3-hourly DOD to up to 0.92 for monthly . The reanalysis tends to underestimate the DOD and compared to AERONET observations (see Fig. ). The model results are dominated by the results in northwestern Africa, and the largest relative underestimations are observed in the Mediterranean and the Middle East (Fig. ), likely because of marine aerosols at these sites. Therefore some model underestimation is expected, in particular in proximity to coastal stations or when mixtures of aerosols are present .
Table 8
Same as Table but using daily reanalysis values and AERONET dust-filtered AOD and averaged on a daily basis as reference.
DOD-dust1 | DOD-dust2 | DOD-mixed1 | DOD-mixed2 | ||
---|---|---|---|---|---|
NAMEE region | |||||
NDATA | 28 662 | 40 767 | 49 551 | 49 551 | 87 007 |
0.77 | 0.79 | 0.79 | 0.79 | 0.84 | |
RMSE | 0.23 | 0.21 | 0.20 | 0.20 | 0.09 |
MB | 0.11 | 0.10 | 0.10 | 0.10 | 0.03 |
MFB | 0.50 | 0.59 | 0.67 | 0.67 | 0.94 |
Northwestern Africa | |||||
NDATA | 15 996 | 19 702 | 21 590 | 21 590 | 11 563 |
0.78 | 0.79 | 0.79 | 0.79 | 0.83 | |
RMSE | 0.25 | 0.24 | 0.23 | 0.23 | 0.15 |
MB | 0.10 | 0.10 | 0.10 | 0.10 | 0.06 |
MFB | 0.37 | 0.42 | 0.46 | 0.46 | 0.54 |
Middle East | |||||
NDATA | 5930 | 8822 | 10 866 | 10 866 | 17 080 |
0.70 | 0.72 | 0.72 | 0.72 | 0.78 | |
RMSE | 0.22 | 0.20 | 0.19 | 0.19 | 0.12 |
MB | 0.11 | 0.10 | 0.09 | 0.09 | 0.02 |
MFB | 0.28 | 0.29 | 0.31 | 0.31 | 0.06 |
Mediterranean Basin | |||||
NDATA | 4637 | 8199 | 11 192 | 11 192 | 32 439 |
0.62 | 0.65 | 0.67 | 0.67 | 0.75 | |
RMSE | 0.20 | 0.17 | 0.16 | 0.16 | 0.06 |
MB | 0.12 | 0.11 | 0.10 | 0.10 | 0.03 |
MFB | 0.78 | 0.90 | 1.00 | 1.00 | 1.15 |
Same as Table but using monthly reanalysis values and AERONET dust-filtered AOD and averaged on a monthly basis as reference.
DOD-dust1 | DOD-dust2 | DOD-mixed1 | DOD-mixed2 | ||
---|---|---|---|---|---|
NAMEE region | |||||
NDATA | 4097 | 5011 | 5439 | 5439 | 4667 |
0.77 | 0.78 | 0.80 | 0.80 | 0.92 | |
RMSE | 0.19 | 0.17 | 0.15 | 0.15 | 0.05 |
MB | 0.12 | 0.11 | 0.11 | 0.11 | 0.03 |
MFB | 0.77 | 0.83 | 0.88 | 0.88 | 0.80 |
Northwestern Africa | |||||
NDATA | 1181 | 1219 | 1231 | 1231 | 611 |
0.81 | 0.84 | 0.85 | 0.85 | 0.92 | |
RMSE | 0.19 | 0.17 | 0.16 | 0.16 | 0.09 |
MB | 0.11 | 0.11 | 0.10 | 0.10 | 0.06 |
MFB | 0.39 | 0.41 | 0.43 | 0.43 | 0.41 |
Middle East | |||||
NDATA | 691 | 798 | 837 | 837 | 783 |
0.73 | 0.72 | 0.75 | 0.75 | 0.86 | |
RMSE | 0.17 | 0.15 | 0.13 | 0.13 | 0.07 |
MB | 0.10 | 0.09 | 0.08 | 0.08 | 0.02 |
MFB | 0.30 | 0.32 | 0.33 | 0.33 | 0.04 |
Mediterranean Basin | |||||
NDATA | 1328 | 1750 | 1942 | 1942 | 1774 |
0.63 | 0.62 | 0.67 | 0.67 | 0.84 | |
RMSE | 0.18 | 0.16 | 0.14 | 0.14 | 0.04 |
MB | 0.13 | 0.12 | 0.11 | 0.11 | 0.03 |
MFB | 0.89 | 0.95 | 1.00 | 1.00 | 0.91 |
Tables to present the verification statistics on a 3-hourly, daily and monthly basis when calculated using the five dust-filtered reference data sets. The stricter the dust filter, the lower the correlation coefficient.
Figure 12
Maps of verification statistics (, MB, RMSE, MFB, from top to bottom) of the analysis DOD (first and second columns) and (third column) versus AERONET dust-filtered AOD (Table ): DOD-mixed1 (left), DOD-mixed2 (middle) and (right) calculated for the whole period (2007–2016). The results are obtained using 3-hourly collocated reanalysis and observation values (see also Table ).
[Figure omitted. See PDF]
The verification results calculated using the DOD-mixed2 dust filter are comparable to those obtained with the reference data set in terms of correlation (0.82 versus 0.81 for the entire region) and MB (0.04 versus 0.03). When considering regional results, the use of the DOD-mixed2 dust filter shows a reduction in the MB together with an increase in MFB in the Mediterranean region (Fig. ). This is directly related to the assumption DOD 0 for AE 1.2 (Table ), which increases the number of collocations particularly in the Mediterranean, where the presence of dust is sporadic. This is confirmed by the comparison with the results obtained with the DOD-mixed1 filter where this condition is neglected (see Fig. ). The RMSE obtained with DOD-mixed2 and reference data shows a clear north-to-south gradient that scales with dust concentrations with maximum values over sources (in northwestern Africa and the Middle East, RMSE 0.12) and minimum values in the Mediterranean (RMSE 0.12).
Figure 13
Time series of monthly verification statistics (, MB, RMSE, MFB) and number of samples (NDATA) for the reanalysis DOD and versus dust-filtered AERONET observations for the period 2007–2016 for the NAMEE domain. Different colours are associated with the results obtained with the different dust filters: DOD-dust1, DOD-dust2, DOD-mixed1, DOD-mixed2 and . The definition of each dust filter is reported in Table . The results are obtained using 3-hourly collocated model and observation values.
[Figure omitted. See PDF]
Monthly DOD and verification statistics are sensitive to the number of AERONET observations, as shown in Fig. . A clear seasonal trend is identified with lower performance in the cloudy winter season than in summer when clear skies are more frequent. Time series of the verification statistics for show a change after 2011, with reductions in MB and RMSE in comparison to previous years. Also the MFB is closer to the MFB from the different dust filters (see Fig. ). This change is associated with a decrease in in the Mediterranean region (not shown here) that is captured by the reanalysis. Underestimations are observed in northwestern Africa and the Mediterranean regions when the DOD-mixed2 and data are used as reference. In summertime in northwestern Africa, we find the largest underestimations (monthly MB 0.10 for DOD-mixed2 and ). These underestimations are likely related to strong dust outbreaks associated with mesoscale convective systems (called haboobs) that the model is not able to capture. In the Middle East, the model shows a systematic underestimation when compared to DOD-mixed2 and reference data, although some overestimation in particular years (2011–2012) is observed. The observed DOD underestimations in comparison with AERONET in the Middle East can be partly attributed to a poor representation of small-scale emission processes such as the wind peak associated with the breakdown of the nocturnal low-level jet, the meteorological effects of orography, sea breezes and cold pools .
Overall, the comparison with AERONET observations shows a good performance of the reanalysis in reproducing the spatial and temporal distribution of mineral dust aerosols over the entire domain and for the 10-year period.
9 Data availabilityThe reanalysis data set is distributed via Thematic Real-time Environmental Distributed Data Services (THREDDS) at BSC and is made freely available at
10 Conclusions and further perspectives
A regional dust reanalysis has been produced using the MONARCH chemical weather prediction system and satellite retrievals of based on MODIS Aqua DB AOD at 550 . The reanalysis data set spans the period 2007–2016 at a horizontal resolution of 0.1 latitude 0.1 longitude in a rotated grid and a temporal resolution of 3 . The reanalysis covers a regional domain centred around Northern Africa, the Middle East and Europe (NAMEE region) that also includes parts of central Asia and the Atlantic and Indian oceans. This paper describes the modelling, observational and assimilation aspects related to the production of the reanalysis, whose unprecedentedly high resolution has required the use of advanced archiving and computing strategies, which are also described in the paper (see Appendix ). The assimilated observations have provided a total column optical constraint on the coarse fraction of dust particles over land, in cloud- and snow-free conditions, and in the daytime, with one satellite overpass per day. Analysis increments were estimated over the whole assimilation window through the 4D implementation of the assimilation algorithm and, to a certain extent (limited by the observation radius of influence), also over sea according to the model background spatial covariance. Re-partitions of analysis increments in the vertical dimension of the model control vector and across the individual model coarse size bins have relied on the model background.
The seasonal changes in the spatial distribution of the dust reanalysis are well characterized and follow well-known, region-dependent dust cycle features controlled by seasonal changes in meteorology (mainly surface winds but also precipitation) and in vegetation cover. The most prominent seasonal features that stand out in the reanalysis are the mobilization of dust during the so-called Asian dust events in the Taklamakan region in spring and by north-northwesterly shamal winds on the Arabian Peninsula and in the Tigris–Euphrates Basin during summer, the transport of south Saharan dust southwest towards the Gulf of Guinea by northeasterly harmattan trade winds during winter and spring, and the northward shift of the plume extending from western Africa over the tropical Atlantic during summer due to movements of the ITCZ.
Diagnostics based on departures of first guess and analysis from assimilated observations provided a sanity check for the quality of our assimilation procedure. As expected, the analysis is statistically closer to the assimilated observations than the first guess. The mean departures are larger in the analysis than in the first guess only in specific regions and seasons, which can be explained by the contamination of aerosols other than dust in the observational data set (for example, biomass burning aerosols produced by fires in central Africa that are advected further north during summer) or by the presence of fairly low values (mainly over Europe and western Russia) that are not analysed as efficiently by the assimilation scheme as the higher values. Mean analysis increments suggest seasonally dependent model biases that follow seasonal dust changes. By applying these corrections, the analysis improves the underlying model. Overall, the spatial distribution of the analysis increments over source regions, as well as in their proximity, highlights the pivotal role of the MODIS DB retrievals in providing an observational constraint over the most critical regions, confirming what previous studies have shown .
The reanalysis DOD and have been validated with highly accurate ground-truth measurements from AERONET on a 3-hourly, daily and monthly basis and with the application of specific dust filters to the reference products or the use of the coarse-mode AOD product. When the latter is used as reference, a Pearson correlation coefficient as high as 0.81 with a MB of 0.05 and a RMSE of 0.12 are estimated when considering the whole reanalysis period and 3-hourly AERONET retrievals. This confirms the good accuracy of the reanalysis data set and its suitability to be used in specific air quality/health and climate service applications. By extending the existing observation-based information intended for mineral dust monitoring, this reanalysis will allow a better quantification of dust impacts upon key sectors of society and the economy. This makes the data set a potentially useful tool in support of climate research and service, including the support to operational early warning systems and to the development of mitigation strategies.
This desert dust reanalysis data set is intended to be the first major endeavour towards the production of BSC aerosol reanalyses over regional or global domains. Extensions of the data set are planned for the near future. A series of companion papers will provide a more comprehensive evaluation of the reanalysis, an analysis of inter-annual variability and trends, and a description of the data set's application in dust-tailored services.
Appendix A Folder and file naming convention of the reanalysis data set
As described in Sect. , the reanalysis data set is structured into individual files per variable and type of ensemble statistics (e.g. an individual file contains the ensemble mean analysis of DOD at 550 ). The filenames include the following terms separated by the underscore sign: the short name of the variable (as reported in Table ); the initial date and time of the data included in the file; a suffix from among av, max, median and std indicating the ensemble mean, max, median and standard deviation, respectively, for that variable and optionally the label an for the variables for which an analysis field is produced. When the latter label is not present, the fields are model first guess. The filenames end with the extension suffix nc identifying NetCDF files. Each individual file contains eight time steps at a 3-hourly time frequency starting at 03:00 UTC. Therefore, for example, the filename for the ensemble mean analysis of DOD at 550 for a given date is od550du_YYYYMMDDHH_av_an.nc, where od550du is the variable short name and YYYYMMDDHH can take values from 2007010103 to 2016123103. The files are organized into folders containing the whole 10-year period for a given variable and type of statistics. Each folder is named with the variable short name followed by the hyphen sign and the suffix indicating the type of ensemble statistics and optionally by the label an preceded by the underscore sign for the variables for which there is an analysis field. Hence the folder containing the files of the example above is named od550du-av_an, while the corresponding ensemble mean first guess data are stored in the folder named od550du-av. To follow on with the same example, the ensemble mean analysis of DOD at 550 for 9 July 2012 can be found in the file path od550du-av_an/od550du_2012070903_av_an.nc.
Appendix B Verification metrics
The definitions of the verification metrics used in this study are reported in Table .
Table B1
Definitions of the verification statistics used in the study. and are the observed and the modelled concentrations at time and location , respectively; and are their averages; is the number of data.
Statistic parameter | Formula |
---|---|
Pearson correlationcoefficient () | |
Mean bias (MB) | |
Root mean squareerror (RMSE) | |
Mean fractionalbias (MFB) |
The reanalysis has been run on the BSC high-performance computing (HPC) infrastructure using the Autosubmit workflow manager , a Python-based tool to create, manage and monitor experiments running on one or multiple remote computing clusters or HPC via the Secure Shell protocol. Scripts and templates to use Autosubmit were developed specifically for the reanalysis to be able to easily run and monitor long simulations by using the BSC HPC resources and store their results in the BSC archive. Autosubmit handles the job submission of the different workflow steps automatically, taking into account interruptions and failures. A functionality to wrap the 12 daily model simulations (each using 768 computing cores) and the data assimilation calculations (using 576 cores) was used to minimize the queuing times. This allows processing a number of days in a row and increasing the parallelism since a single job allocates the total sum of computing nodes required (i.e. 9792 cores were reserved by each wrapper). The number of computing cores for each job has been estimated to balance the execution time of a model simulation and of the assimilation for these jobs to share the computing resources. Additionally, the jobs of two post-processing steps were also wrapped together with the simulation and data assimilation jobs. The post-processing steps are needed to compress and reduce the original model and assimilation output and for the calculation of some basic ensemble statistics (to be provided to the users as main output of the reanalysis) before the final output is transferred to a long-term archive. The job wrapper has been designed with a crossing-date strategy to run two different starting dates within the same experiment. This is done so that the model ensemble simulations from the first starting date can be run in parallel with the assimilation job of the second starting date. This choice of design was made since model ensemble simulations and data assimilation calculations from the same date cannot be run simultaneously due to the obvious dependency of the data assimilation job on the model output. This novel workflow design was developed specifically for the production of the dust reanalysis and has proven successful for such highly computationally intensive calculations.
The supplement related to this article is available online at:
Author contributions
CPGP, SB, EDT, OJ and JE designed the study and discussed the main results. FBa, EC, PF, LM, MM, AV, EW, MK and MG contributed to the discussions. EDT, JE, FM and NS developed and maintained the data assimilation code. PG prepared the assimilated observations. OJ led the MONARCH developments with contributions from CPGP, MG, MK, VO, JE and EDT. SB and CPGP carried out the independent validation, while EDT and JE performed the validation against assimilated observations. FM, MC, GMP, EDT and MO contributed to the development of the computing workflow. FM, AB, GMP and EDT ran the simulations. AB, SB, EDT, PAB and FBe performed the data set quality check and storage. EDT, CPGP and SB wrote the manuscript. All authors commented on the manuscript.
Competing interests
The contact author has declared that neither they nor their co-authors have any competing interests.
Disclaimer
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Acknowledgements
The BSC co-authors acknowledge PRACE (eDUST, eFRAGMENT1 and eFRAGMENT2) and RES (AECT-2019-3-0001, AECT-2020-1-0007, AECT-2020-3-0013) for awarding access to MareNostrum at the BSC and for providing technical support. The authors thank all the principal investigators and their staff for establishing and maintaining the NASA and PHOTONS AERONET sites and the MODIS mission scientists and associated NASA personnel for the production of the data used in this study. The authors also thank John P. Dunne, Julie Letertre-Danczak and the anonymous referee, who provided constructive comments that improved the manuscript.
Financial support
This research has been supported by the DustClim project, which is part of ERA4CS, an ERA-NET programme co-funded by the European Union's Horizon 2020 research and innovation programme (grant no. 690462); the European Research Council (FRAGMENT (grant no. 773051)); grant no. RYC-2015-18690 funded by MCIN/AEI/10.13039/501100011033 and ESF Investing in your future; grant no. CGL2017-88911-R funded by MCIN/AEI/10.13039/501100011033 and ERDF A way of making Europe; the AXA Research Fund (AXA Chair on Sand and Dust Storms); the European Commission, Horizon 2020 Framework Programme (grant no. 792103 (SOLWARIS)); and ATMO-ACCESS (Access to Atmospheric Research Facilities) funded in the frame of the programme H2020-EU.1.4.1.2 (grant no. 101008004, 1 April 2021–31 March 2025). Jerónimo Escribano and Martina Klose have received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreements H2020-MSCA-COFUND-2016-754433 and H2020-MSCA-IF-2017-789630, respectively. Martina Klose received further support through the Helmholtz Association's Initiative and Networking Fund (grant no. VH-NG-1533). This work has been partially funded by the contribution agreement between AEMET and BSC to carry out development and improvement activities of the products and services supplied by the World Meteorological Organization (WMO) Barcelona Dust Regional Center (i.e. the WMO Sand and Dust Storm Warning Advisory and Assessment System (SDS-WAS) Regional Center for Northern Africa, the Middle East and Europe).
Review statement
This paper was edited by Nellie Elguindi and reviewed by Julie Letertre-Danczak and one anonymous referee.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2022. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
One of the challenges in studying desert dust aerosol along with its numerous interactions and impacts is the paucity of direct in situ measurements, particularly in the areas most affected by dust storms. Satellites typically provide column-integrated aerosol measurements, but observationally constrained continuous 3D dust fields are needed to assess dust variability, climate effects and impacts upon a variety of socio-economic sectors. Here, we present a high-resolution regional reanalysis data set of desert dust aerosols that covers Northern Africa, the Middle East and Europe along with the Mediterranean Sea and parts of central Asia and the Atlantic and Indian oceans between 2007 and 2016. The horizontal resolution is 0.1
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details
















1 Barcelona Supercomputing Center (BSC), Barcelona, Spain
2 NOAA Geophysical Fluid Dynamics Laboratory, Princeton, New Jersey, USA
3 Consiglio Nazionale delle Ricerche–Istituto di Scienze dell'Atmosfera e del Clima (CNR–ISAC), Rome, Italy
4 Izaña Atmospheric Research Center (IARC), Agencia Estatal de Meteorología (AEMET), Santa Cruz de Tenerife, Spain
5 Université Paris Cité and Univ Paris-Est Créteil, CNRS, LISA, 75013 Paris, France
6 Barcelona Supercomputing Center (BSC), Barcelona, Spain; Department of Project and Construction Engineering, Universitat Politècnica de Catalunya – BarcelonaTech (UPC), Terrassa, Spain
7 Barcelona Supercomputing Center (BSC), Barcelona, Spain; Department Troposphere Research, Institute of Meteorology and Climate Research (IMK-TRO), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
8 Consiglio Nazionale delle Ricerche–Istituto di Metodologie per l'Analisi Ambientale (CNR–IMAA), Tito Scalo (PZ), Italy
9 Barcelona Supercomputing Center (BSC), Barcelona, Spain; now at: NASA Goddard Institute for Space Studies (GISS), New York, New York, USA
10 Department of Earth Sciences, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, the Netherlands
11 Section of Governance and Technology for Sustainability (BMS-CSTM), University of Twente, Enschede, the Netherlands; Weather and Climate Change Impact Research, Finnish Meteorological Institute (FMI), Helsinki, Finland
12 Agencia Estatal de Meteorología (AEMET), Barcelona, Spain
13 Barcelona Supercomputing Center (BSC), Barcelona, Spain; ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, Spain