The MONARCH high-resolution reanalysis of desert

Full text

Turn on search term navigation

1 Introduction

Desert (or mineral) dust is the most abundant aerosol by mass in the global atmosphere and plays a key role in the Earth system . It is emitted from the surface by aeolian processes an doriginates predominantly – but not only – from desert regions. Dust affects weather and climate by perturbing the radiative balance directly through scattering and absorption of solar and thermal radiation and indirectly by altering cloud formation and cloud chemistry . It also contributes to the fertilization of the ocean and the land through the deposition of iron and phosphorus, thus affecting the global carbon cycle. All in all, the amount of, spatial distribution of and variability in desert dust have implications on climate, the environment, air quality and human health , and a variety of socio-economic sectors such as aviation and solar energy production . Due to the nature of its emission and transport and its relatively short lifetime , dust varies strongly in space and time, which requires continuous monitoring both in situ and remotely by satellite, airborne and ground-based sensors . A major challenge in studying desert dust along with its impacts is the paucity of direct in situ measurements in the regions most affected by dust storms. There are some operational visibility observations providing qualitative estimates of dust presence , but there is a severe lack of routine surface aerosol concentration measurements . In addition to the lack of in situ observations, there is limited information on aerosol speciation, which is essential to distinguish dust from other aerosol types . Satellites mostly provide column-integrated aerosol information, but spatially and temporally resolved surface dust concentration and deposition estimates are needed to enable detailed impact assessments. Dust observations or retrievals are therefore best exploited in combination with model simulations either to provide optimal initial conditions (analyses) to forecast models or to monitor current and past states of the atmosphere through the production of reanalyses, i.e. complete and consistent four-dimensional reconstructions of the atmosphere.

There are several available global aerosol reanalyses that include desert dust, such as MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, Version 2; ) and CAMSRA (Copernicus Atmosphere Monitoring Service Reanalysis; ) along with their predecessors MERRAero (Modern-Era Retrospective analysis for Research and Applications Aerosol Reanalysis; ) and MACC-II (Monitoring Atmospheric Composition and Climate-II; ), respectively, and the JRAero (Japanese Reanalysis for Aerosol; ) and the NAAPS (Navy Aerosol Analysis and Prediction System; ) reanalyses. These global data sets have been produced at relatively coarse spatial resolution and by assimilating total aerosol optical depth (AOD). MERRA-2 is NASA's latest reanalysis. It has been produced at a spatial resolution of 0.58 $^{\circ}$ latitude $\times$ 0.6258 $^{\circ}$ longitude, with 72 hybrid eta layers and by assimilating bias-corrected, neural-network-retrieved AOD from the Moderate Resolution Imaging Spectroradiometer (MODIS) and from the Advanced Very High Resolution Radiometer (AVHRR; over ocean only), as well as AOD from the Multi-angle Imaging SpectroRadiometer (MISR; over bright surfaces only) and from the Aerosol Robotic Network (AERONET) of Sun photometers. The latest reanalysis for atmospheric composition produced by the Copernicus Atmosphere Monitoring Service (CAMS) CAMSRA covers the period January 2003 to 2020 and is extended by adding 1 year each year. It has been produced at a spatial resolution of $\sim$ 80 $km$ and with 60 hybrid sigma–pressure levels in the vertical, by assimilating Collection 6 MODIS AOD produced with Deep Blue (DB; over land) and Dark Target (over land and ocean) algorithms and by additionally assimilating the Advanced Along-Track Scanning Radiometer (AATSR) AOD from 2003 to March 2012. JRAero is a global 5-year (2011–2015) reanalysis product constructed by the Meteorological Research Institute of the Japan Meteorological Agency. It has been produced assimilating the MODIS 6-hourly Level 3 AOD product provided by the US Naval Research Laboratory (NRL) and the University of North Dakota (UND) for the purpose of aerosol data assimilation and is based on the NASA operational MODIS Level 2 Collection 5 (Dark Target) AOD data set. This same data set has been previously used, together with MISR AOD, by NRL to produce the NAAPS 11-year (2003–2013) global gridded aerosol reanalysis product at a resolution of 1 $^{\circ}$ latitude $\times$ 1 $^{\circ}$ longitude.

At the European level, air quality regional reanalyses (including dust) are produced by nine different operational systems and the associated multi-model ensemble through the CAMS regional services of the Copernicus programme. These models assimilate surface observations of $O_{3}$ , ${SO}_{2}$ , ${NO}_{2}$ and $CO$ and particulate matter ( ${PM}_{2.5}$ and ${PM}_{10}$ ) operationally, and one of the models additionally assimilates AOD in research mode. These products are restricted to an extended European domain, which excludes major desert dust sources in Northern Africa and the Middle East. These reanalyses are produced as an improved product compared to the daily CAMS analyses, by using the latest validated observations, but we note they may not be consistent over the different production periods as they are not necessarily produced with the same model version.

We present here a regional reanalysis focusing specifically on desert dust aerosols that overcomes some of the potential limitations of existing global and regional reanalysis products. The data set was obtained by combining satellite remote sensing dust retrievals with a dynamical model. It spans a 10-year period, from 2007 to 2016; has a horizontal resolution of 0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude in a rotated grid; and has 3-hourly output. It provides a regional reconstruction of past dust conditions across Northern Africa, the Middle East and Europe, including the Mediterranean Sea and parts of central Asia, and the Atlantic and Indian oceans. The reanalysis consists of a set of dust geophysical variables (and their uncertainties) produced with a consistent model and data assimilation scheme, i.e. a frozen version of the code used during the whole simulation period, including emission schemes, input data sets and the retrieval algorithm for the assimilated observations. This ensures the production of a consistent data set avoiding the introduction of spurious trends that could be associated with model or assimilation changes.

We have adopted an ensemble-based data assimilation scheme for the estimation of the dust analysis. The use of ensemble model simulations has allowed for the estimation of flow-dependent background uncertainty, which is otherwise difficult to estimate due to the highly varying nature of dust concentrations. Assimilating AOD may not necessarily constrain individual aerosol components because the aerosol attribution in the analysis increments is typically determined by the model first guess . To at least partly overcome this limitation, we have directly assimilated dust retrievals, namely satellite-derived coarse-mode dust optical depth ( ${DOD}_{coarse}$ ) at 550 $nm$ over land surfaces, including bright surfaces such as desert areas. The assimilated retrievals are based on the MODIS DB algorithm , which uses measurements at different wavelengths with a different contrast between the surface and atmospheric aerosols. In particular, the algorithm capitalizes on the much lower surface reflectance at ultraviolet wavelengths than at longer wavelengths.

This new reanalysis data set can be used to support the provision of climate services and monitoring. It can also contribute to the development of dust impact mitigation strategies. For instance, the design of the reanalysis output fields has been tailored to the specific needs in three socio-economic sectors affected by mineral dust, which are air quality and health, energy production, and transport. In addition to the 3D fields of dust mass concentration, the reanalysis data set includes dust extinction and deposition variables, along with other variables associated with meteorology and radiation. In summary, we present here a regional dust reanalysis at an unprecedented resolution using for the first time specific dust retrievals over dust source regions and including grid-level uncertainty estimates.

The following sections describe the different aspects related to the production of the reanalysis: the dust modelling aspect, including the dust sources and emission schemes is outlined in Sect. ; the generation of ensemble perturbations to best characterize model uncertainty is explained in Sect. ; the assimilated dust retrievals and the data assimilation scheme are described in Sects. and , respectively. Additionally, Sect. describes the details of the reanalysis simulation settings, while Sect. describes the content and structure of the reanalysis data set. Section provides an evaluation of the column-integrated dust optical depth (DOD) and ${DOD}_{coarse}$ in terms of geographical distribution, the study of analysis increments, data assimilation inner diagnostics and comparison against independent observations. Information about the data set availability is provided in Sect. . Finally, conclusions are drawn in Sect. .

Table 1

Overview of the characteristics of the reanalysis.

	Reanalysis configuration
Domain, resolution and output
Data set length	10 years (2007–1016)
Output frequency	3 $h$ (starting at 03:00 UTC)
Geographical domain	regional
Horizontal resolution	0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude in a rotated grid
Vertical resolution	40 hybrid pressure–sigma layers interpolated to 15 standard pressure levels (1000–100 $hPa$ )
Top pressure	50 $hPa$
Output variables	6 (surface), 3 (total column), 3 (upper air)
Uncertainty estimation	based on the spread in the MONARCH ensemble (12 members)
Data assimilation (DA)
Assimilation algorithm	ensemble-based DA (4D-LETKF; )
Control vector	3D mixing ratio of dust coarse bins (ranging from 1.2 to 20 $µ m$ in dust particle diameter)
Assimilated observations	MODIS DB ${DOD}_{coarse}$ at 550 $nm$
Observation satellite platform	NASA Aqua (EOS PM-1)
Observational coverage	clear sky, snow-free, land and daytime
Length of the assimilation window	24 $h$
Chemical weather system
Aerosol model	MONARCH (Multiscale Online Nonhydrostatic AtmospheRe CHemistry model v1.0,
	with improvements; )
Dust emission scheme	MB95 , G01 , K14
Particle size bins	eight bins with ranges 0.2–0.36, 0.36–0.6, 0.6–1.2, 1.2–2, 2–3.6, 3.6–6, 6–12 and 12–20 $µ m$ in particle diameter
Particle size distribution at emission(before perturbation)	PSD as in
Meteorological model	NMMB Nonhydrostatic Multi-scale Model on the B grid;
Meteorological initialization	ERA-Interim and MERRA-2
	with ERA5 soil information
Radiation scheme	RRTM
	LW: OPAC RIs ; SW: mineralogy-based RIs
	spherical particle shape
Microphysics scheme	Ferrier
Surface layer	NMMB similarity theory
Land surface scheme	Noah
Turbulence scheme	Mellor–Yamada–Janjić
Convection scheme	Betts–Miller–Janjić
Ensemble generation	multi-parameter, multi-physics source perturbations, and multi-meteorological initial
	and boundary conditions

2 MONARCH modelling system

The reanalysis has been produced using the Multiscale Online Nonhydrostatic AtmospheRe CHemistry model (MONARCH; ), which consists of advanced chemistry and aerosol packages coupled online with the Nonhydrostatic Multiscale Model on the B grid (NMMB; ). MONARCH is able to work across a wide range of spatial scales thanks to its unified non-hydrostatic dynamical core. In the global setup, MONARCH is run on a latitude–longitude grid, while the regional version used in this work runs on a rotated latitude–longitude grid. Different physics schemes are available in the NMMB to resolve turbulence, convection, soil, radiation and clouds. The exact configuration used in this work is reported in Table , where the key configuration settings are summarized for both modelling and data assimilation aspects.

MONARCH represents the atmospheric dust cycle including emission, transport and deposition along with dust–radiation interactions. A variety of dust emission schemes and configurations are available as described in , ranging from strongly simplified to physics-based parameterizations. Dust transport is produced by horizontal advection, solved with the Adams–Bashforth scheme; vertical advection, solved with the Crank–Nicolson scheme; and lateral diffusion, which follows the Smagorinsky non-linear approach. Furthermore, dust is vertically mixed by turbulent diffusion and deep and shallow convection. Sinks include gravitational settling, dry deposition through turbulent diffusion, and in-cloud and below-cloud scavenging from both stratiform and convective clouds. MONARCH follows a sectional approach for dust, i.e. the size distribution is decomposed into small size bins that range from 0.2 to 20 $µ m$ in diameter. The particle size distribution (PSD) at emission either can be chosen from a set of pre-defined PSDs or is calculated online, depending on the selected emission scheme. In this work, we have used a PSD of emitted dust over sources derived from .

A more detailed description of the dust module of MONARCH can be found in and , with the latter work including also advances developed after the start of the dust reanalysis production. Those recent developments were therefore not yet used in the present work for which a frozen model version is important. Below we provide further details on the configuration of the emission and radiation schemes used in this work.

2.1 Dust emission schemes

MONARCH contains multiple dust emission schemes, of which we used the following three to generate ensemble perturbations for the production of the reanalysis: (i) a scheme based on , hereafter called MB95, which is based on saltation flux and soil texture and was combined with the topographic source mask from as described in ; (ii) the GOCART dust emission scheme from based mainly on a topographic source function, hereafter called G01; (iii) a scheme based on brittle fragmentation by saltation as in , hereafter called K14. The location of dust sources is identified by a climatology of frequency of occurrence (FoO) of DOD greater than 0.2 derived from MODIS DB Collection 6 at the resolution of 0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude ( – see their Sect. 4.3.1) with a minimal threshold for FoO equal to 0.05, below which there is no emission. Surface roughness is accounted for in the dust emission calculation using the drag partition parameterization from with input from MODIS Collection 5 monthly leaf area index for the specific year of simulation from 2007 to 2015 and from a climatology for 2016, combined with a static roughness length for arid regions as described in . The $X$ parameter in the drag partition follows . The USGS climatological database for vegetation is used by the meteorology and land surface scheme. A soil moisture correction is used for MB95 and K14 as in with a revised scaling factor as in and . G01 uses the default GOCART soil moisture correction, which is based on as described in , and a threshold friction velocity as described in .

2.2 Radiation and dust optical properties

In MONARCH, dust is coupled online with the RRTMG radiation scheme, which accounts for short-wave (SW) absorption and scattering and long-wave (LW) absorption . The input dust optical properties (extinction efficiency, single-scattering albedo and asymmetry factor) for each particle size bin and wavelength are based on refractive indices (RIs) that account for the variation in mineralogical composition by size in the SW and derived from the OPAC data set in the LW. Optical properties are calculated using Mie scattering theory assuming that dust is spherical despite its well-known non-sphericity . Although MONARCH now allows accounting for the effect of dust non-sphericity upon the optical properties , this option was not ready by the start of the reanalysis production.

Figure 1

Example of assimilated observations for 9 July 2012: ${DOD}_{coarse}$ retrieved from the Aqua MODIS DB Level 2 products (Collection 6; a) and the associated observation uncertainty used in the assimilation algorithm (b).

[Figure omitted. See PDF]

To calculate the mineralogy-based size-dependent RIs in the SW, we applied the multi-component Maxwell Garnett theory to internal mixtures of eight dominant dust minerals derived from the soil mineralogical atlas of . The single-mineral RIs were taken from . The mineral fractions in each size bin are estimated for each of the 28 soil types considered in the atlas based on brittle fragmentation theory . For each size bin and wavelength, we finally retain the median real and imaginary RIs across the 28 soil types. In the visible band, the obtained median RIs compare well with recent chamber-based retrievals and in situ aircraft measurements , as shown in .

The dust–radiation coupling allows the computation of the direct radiative effect at each radiation time step with a simple double-call approach. We also calculate direct normal irradiance (DNI) and global horizontal irradiance (GHI) at the surface, under all-sky conditions, from downward fluxes in ultraviolet–visible–near-infrared bands of the model. While GHI includes direct and diffuse beams collected by a horizontal unit surface, DNI accounts for the direct beam hitting a normal surface. These variables are useful for applications in the context of solar energy production.

3 Generation of ensemble perturbations

We adopted an ensemble-based data assimilation scheme to estimate dust. Hence model uncertainty, expressed as background error covariance in the data assimilation algorithm, is estimated from the realizations of the dust fields in an ensemble of MONARCH model calculations. The use of an ensemble of model simulations allows the estimation of a flow-dependent background uncertainty that would otherwise be difficult to estimate due to the highly variable nature of dust concentrations. We generated a 12-member ensemble using different meteorological initial and boundary conditions and dust emission schemes, along with additional perturbations in the model emission parameters. Such perturbations aim at representing the model uncertainty, mainly in dust emission, which is one of the major contributors to model error , but also in other aspects of the dust cycle where meteorology has a role, such as transport and deposition. The characteristics of each ensemble member are listed in Table S1 in the Supplement and described below.

Figure 2

Maps of counts of assimilated observations for the whole period (2007–2016; top row) and for the different seasons (DJF, MAM, JJA, SON; rows 2 to 5) of the 10-year period.

[Figure omitted. See PDF]

The benefit of combining meteorological and aerosol source perturbations is shown in and . The meteorology in our reanalysis is re-initialized every day using global reanalyses. Our ensemble uses two different meteorological reanalyses as initial conditions at the start of every daily run (at 00:00 UTC) and as boundary conditions every 6 $h$ . ERA-Interim is used in six ensemble members, and MERRA-2 together with ERA5 soil information is used in the remaining six members.

Experiments conducted in showed that using different dust emission schemes provides a better characterization of the background covariance than a single scheme with parameter perturbations due to the large variability in the modelled emissions. The ensemble uses three different emission schemes briefly introduced in Sect. , namely MB95 as in, G01 as in and K14 as in. Each emission scheme was used four times (twice in each of the two six-member groups driven by the different meteorological reanalyses). In addition, each of the 12 ensemble members was run with a different value for one or more parameters in the corresponding emission scheme following . Specifically, we perturbed the threshold friction (or threshold wind velocity for one of the emission schemes), which is soil-moisture-dependent and determines the friction or wind velocity above which soil particles begin to move in saltation, and the dust emission flux across each of the eight dust model bins. The threshold friction or wind velocity was perturbed by drawing a multiplicative random factor from a normal distribution with mean 1 and spread 0.4. The dust emission flux was perturbed imposing a physical constraint. Correlated noise was used across the bins so that noise correlation decreases with increased difference in the normalized cubic radius between the bins; the noise has mean 1 and a standard deviation of 30 % of the unperturbed value in each bin. These emitted size distribution perturbations used here are analogous to those in Fig. 1 in but departing from instead of . The structure of the emission parameter perturbations is temporally and spatially constant.

4 Assimilated observations

We have used for assimilation an innovative DOD data set derived from the MODIS DB aerosol products (Collection 6), which covers all cloud-free and snow-free land surfaces. DB aerosol retrievals are available over areas not easily covered by other observational data sets, e.g. very bright reflective surfaces such as deserts, and are therefore particularly relevant for dust applications. The MODIS Dark Target product, for example, has a limited coverage over land since the retrieval algorithm assumes low surface albedo. The DB algorithm uses top-of-the-atmosphere reflectances at 412 and 470 $nm$ , and, in the presence of a heavy dust load, also at 650 $nm$ . It exploits the fact that, over most surfaces, a darker surface and stronger aerosol signal are seen in the blue wavelength range than at longer wavelengths. The quality of the MODIS DB AOD product is improved in Collection 6 compared to previous releases, as shown by the work of and , based on Level 2 and Level 3 retrievals, respectively. Furthermore, a recent study by showed that DB AOD from MODIS (on board the Aqua satellite) is one of the best products when compared to other satellite products.

More specifically, we have assimilated ${DOD}_{coarse}$ retrieved from MODIS DB Level 2 aerosol products as described in and . The generation of the dust retrievals includes the different steps of formatting, dust filtering and retrieval. First, aerosol products such as AOD, single-scattering albedo and the Ångström exponent are interpolated to a regular grid of 0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude using the algorithm described by . The DOD is then derived from AOD following the methods of with adaptions to MODIS Collection 6 aerosol products. To separate dust from other aerosols, two variables are used: the Ångström exponent, which is highly sensitive to particle size , and a single-scattering albedo at 412 $nm$ less than 0.95 for dust due to its absorption of solar radiation . Subsequently, an empirical continuous function relating the Ångström exponent to fine-mode AOD their Eq. 5 is applied to retrieve the dust fine-mode fraction of optical depth.

Since the retrievals are based on visible reflectances, their availability is limited to the daytime only. The MODIS instrument is on board two NASA polar-orbiting satellites, namely Aqua and Terra. However, we have considered for assimilation only ${DOD}_{coarse}$ retrievals based on measurements from MODIS on board the Aqua platform. The equatorial crossing local time of the Aqua satellite is at 13:30 in an ascending orbit. In our 3-hourly discretization of the assimilation window, the assimilated observations are associated with the time slot (or interval) centred at 12:00 UTC and, due to the 4D extension of the implemented LETKF scheme, affect the whole assimilation window.

Figure 3

Schematic of the 24 $h$ assimilation window for the production of the reanalysis. The ensemble member analyses are used to initialize the corresponding ensemble member first guess in the subsequent simulation/state estimation window.

[Figure omitted. See PDF]

We have used 0.07 $+$ 0.075 ${DOD}_{coarse}$ to characterize the observation uncertainty in the assimilated observations, following the linear model of previous studies with the coefficients adjusted for our application by inflating the uncertainty for low ${DOD}_{coarse}$ values, which were otherwise detrimental for the analysis. We have assumed a diagonal observation error covariance matrix, i.e. uncorrelated error between the different retrievals. Observation coordinates were pre-processed to be mapped on the rotated longitude–latitude regional grid of MONARCH. Figure shows an example of the extent of the daily observational coverage on a given date (9 July 2012) together with the associated observational uncertainty.

Maps of observation counts are shown in Fig. for the whole reanalysis period (top row of Fig. ) and for the different seasons (rows 2 to 5 of Fig. ), namely the winter seasons represented by December, January and February (DJF); the spring season represented by March, April and May (MAM); the summer season represented by June, July and August (JJA); and the autumn season represented by September, October and November (SON). As expected, there is a higher number of dust retrievals closer to sources than far from them. The total number of retrievals is bigger in the SON and JJA seasons than in the other seasons. During the boreal winter the number of retrievals inland from the Gulf of Guinea increases compared to other times of the year due to transport of dust by northeasterly harmattan winds. The number of dust retrievals decreases in the north of Europe and Asia in the DJF season as MODIS DB covers only snow-free surfaces. Yearly observation counts are consistent throughout the whole period (see Fig. S1 in the Supplement).

5 Data assimilation algorithm

The reanalysis was produced using a local ensemble transform Kalman filter (LETKF) data assimilation scheme (; ; ; ) coupled to the MONARCH ensemble. We have used an implementation of the LETKF scheme with four-dimensional extension (4D-LETKF) as described in in order to estimate the dust analysis over a 24 $h$ assimilation window. The overall scheme implements an iterative approach consisting of a forward simulation of the MONARCH ensemble for 24 $h$ and a state estimation step. The two steps are coupled at each iteration. The state estimation step is an execution of the LETKF which combines information from the dust observations and the model ensemble simulations. The forward simulation of the MONARCH ensemble is named first guess (or background) to indicate a simulation initialized from an analysis and thus incorporates information from past observations. As a result of the estimation step, the analysis is estimated at each assimilation window using both concurrent and past observations.

The LETKF is well suited to computationally demanding calculations such as the estimation of a high-resolution analysis carried out in this work. The analysis at each model grid point can be calculated independently, and at each grid point only observations within a certain distance are assimilated. Furthermore, the use of a dynamic characterization of model background uncertainty, through ensemble forward simulations, is well suited for highly varying dust fields. A detailed description of the scheme can be found in . Below we discuss the basic concepts behind the LETKF algorithm.

Consider a state vector $x$ of the dynamic variables of a system, in our case the dust mass mixing ratio. The mean analysis increment at a grid point is estimated as a linear combination of the background ensemble perturbations $X^{b}$ :

1 ${\overline{x}}^{a} = {\overline{x}}^{b} + X^{b} w,$ where we use the superscripts a and b to denote the analysis and background state vector, respectively, and where the $i$ th column of the matrix $X^{b}$ is $x^{b (i)} - {\overline{x}}^{b}$ , ${i = 1, 2, \dots, k}$ with $k$ ensemble members (12 in our case), i.e. the difference between the $i$ th ensemble member $x^{b (i)}$ and the ensemble mean ${\overline{x}}^{b}$ . $w$ is termed the “weight” vector, specifying what linear combination of the background ensemble perturbations is added to the background mean to obtain the analysis ensemble. The weight vector is given by 2 $w = [Y^{b} R^{- 1} Y^{b} + (k - 1) I]^{- 1} Y^{b} R^{- 1} (y^{o} - {\overline{y}}^{b}),$ where $Y^{b}$ is the background ensemble perturbation matrix in observation space (or background observation ensemble perturbation matrix); $R$ is the observation error covariance matrix, which we assume is diagonal; $I$ is the identity matrix; $y^{o}$ is the vector of observations; and ${\overline{y}}^{b}$ is the mean background observation ensemble. The background observation ensemble is obtained applying the observation operator $h (\cdot)$ to the ensemble members $x^{b (i)}$ ; i.e. $y^{b (i)} = h (x^{b (i)})$ .

The 4D extension of the algorithm is coded such that background observation means ${\overline{y}}_{j}$ and perturbation matrices $Y_{j}$ are formed at the various time slots $j$ when the observations are available; then they are concatenated to form a combined background observation mean $\overline{y}$ and perturbation matrix $Y$ , where the time slots are the time intervals into which the assimilation window is split. $\overline{y}$ and $Y$ are used for the calculation of a weight vector $w$ using the standard LETKF; i.e. we calculate a single $w$ based on all innovations throughout the day. This same $w$ is then applied to the state vector at different times throughout the assimilation window.

Spatial covariance localization can be applied in the LETKF algorithm through $R$ localization; i.e. the localization is performed in the observation error covariance matrix, making the influence of an observation on the analysis decay gradually towards zero as the distance from the analysis location increases. The use of spatial localization reduces the effect of spurious long-range covariances due to sampling errors produced by a low dimensionality of the ensemble. To achieve this, the observation error is divided by a distance-dependent function that decays to zero with increasing distance: $e^{- \frac{{dist}^{2}}{l^{2}}}$ , where dist is the distance in the grid space between an observation and the model grid and $l$ is a horizontal localization factor. The localization factor was set to 15; hence the observation influence practically fades to zero before 30 model grid points away from the observation location (in the horizontal plane).

The control variable is formulated in terms of the total mixing ratio over the five model prognostic variables (corresponding to different dust particle size bins) used to simulate coarse dust in MONARCH. Therefore an observation operator is needed to map the ensemble mean control vector into the observation space. The observation operator has two components: (i) a spatial interpolation of the model simulation to the observation location, which is done at the observation longitude and latitude, and (ii) the calculation of simulated ${DOD}_{coarse}$ at the wavelength of 550 $nm$ which is calculated using the five coarse model size bins ranging from 1.2 to 20 $µ m$ in dust particle diameter. The analysis of the model's fine dust fraction (i.e. the three model size bins from 0.2 to 1.2 $µ m$ in dust particle diameter) is estimated proportionally to the change (due to observation assimilation) of the coarse fraction. This choice is motivated by the fact that observations do not carry information about either fine dust particles or particle size distribution. Hereafter, DOD and ${DOD}_{coarse}$ refer to the wavelength of 550 $nm$ .

6 Domain, resolution and other simulation settings

This section presents the key settings for the modelling, observational and data assimilation aspects that have been described in Sects. to and that are summarized in Table . The reanalysis extends over the period 2007–2016 and covers a regional domain centred around Northern Africa, the Middle East and Europe (hereafter called the NAMEE region) that also includes parts of central Asia and the Atlantic and Indian oceans. The domain has a horizontal resolution of 0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude in a rotated grid and 40 hybrid pressure–sigma model layers in the vertical. The model top was set to 5000 $Pa$ . This domain configuration is used operationally to deliver daily forecasts at the World Meteorological Organization Barcelona Dust Regional Center (https://dust.aemet.es/, last access: 8 April 2022).

Table 2

List of reanalysis variables. For each variable the following ensemble statistics are calculated and archived: ensemble mean, standard deviation, max and median. n/a – not applicable.

Variable description (name in archive)	Unit	Spatial dimension	Description ofdust particle size	First guess	Analysis
Dust concentration (concdubin1-8)	$kg m^{- 3}$	3D	eight bins	✓	✓
Direct normal irradiance (dni)	$W m^{- 2}$	2D	n/a	✓
Accumulated dry deposition over the previous 3 $h$ (drydu)	$kg m^{- 2} (3 h)^{- 1}$	2D	eight bins	✓
Dust extinction coefficient at 550 $nm$ (ec550du)	$m^{- 1}$	3D	total	✓	✓
Global horizontal irradiance (ghi)	$W m^{- 2}$	2D	n/a	✓
Dust load (loaddu)	$kg m^{- 2}$	2D	eight bins	✓	✓
Dust optical depth at 550 $nm$ (od550du)	unitless	2D	total	✓	✓
Coarse dust optical depth at 550 $nm$ (od550ducoarse)	unitless	2D	total	✓	✓
Dust surface concentration (sconcdubin1-8)	$kg m^{- 3}$	2D	eight bins	✓	✓
Dust surface extinction coefficient (sec550du)	$m^{- 1}$	2D	total	✓	✓
Accumulated wet deposition over the previous 3 $h$ (wetdu)	$kg m^{- 2} (3 h)^{- 1}$	2D	eight bins	✓
Height of pressure level above sea level ( $z$ )	$m$	3D	n/a	✓

Table 3

Averaged DOD of first guess (fg), analysis (an) and analysis increments (an-fg) for the full period (2007–2016), for different seasons (DJF, MAM, JJA, SON) and for individual years.

Period	Mean fg DOD	Mean an DOD	Mean analysis
			increments
2007–2016	0.1066	0.1	$-$ 0.0066
DJF	0.0806	0.0781	$-$ 0.0025
MAM	0.1353	0.1268	$-$ 0.0084
JJA	0.1364	0.1261	$-$ 0.0103
SON	0.0734	0.0681	$-$ 0.0053
2007	0.107	0.0994	$-$ 0.0076
2008	0.1185	0.1117	$-$ 0.0068
2009	0.1049	0.0993	$-$ 0.0056
2010	0.1091	0.1041	$-$ 0.005
2011	0.1056	0.0994	$-$ 0.0063
2012	0.1113	0.1055	$-$ 0.0058
2013	0.0986	0.091	$-$ 0.0076
2014	0.0943	0.0879	$-$ 0.0064
2015	0.1125	0.1046	$-$ 0.0079
2016	0.1044	0.0968	$-$ 0.0075

The model runs were conducted using a dynamics time step of 20 $s$ . Lateral diffusion is called every time step; advection every 2 time steps; turbulence, surface layer, dust emission, sedimentation and dry deposition routines every 4 time steps; moist convection, microphysics and wet deposition every eight time steps; and short- and long-wave radiation routines every 180 time steps. The MONARCH ensemble of forward simulations was run daily at 00:00 UTC during 24 $h$ , which was used as the first guess for the data assimilation. Simulation outputs are provided every 3 $h$ (03:00, 06:00, 09:00, 12:00, 15:00, 18:00, 21:00 and 00:00 UTC), which is also the time resolution of the reanalysis product. Figure shows the scheme of the 24 $h$ assimilation window for the production of the reanalysis where each ensemble member forward simulation is initialized at 00:00 UTC using the dust analysis produced in the previous window.

Simulations were run without inflating the background or analysis covariance errors during the assimilation cycle. A quality control has been applied as in that rejects observations by a first-guess departure check (observations further than 1.4, in ${DOD}_{coarse}$ , from the first guess are rejected). This quality control is applied since the observations have not been corrected before assimilation for possible systematic biases. After the estimation of total dust coarse mixing ratio analysis, the analysis increments are partitioned among the dust coarse size bins according to their fractional contribution to the total coarse mixing ratio in the forward simulation step (i.e. before assimilation).

A spin-up period was necessary for the soil variables that need a longer period to adjust. We have run a 1-year spin-up with a two-member experiment, each of them initialized using either MERRA-2 or ERA-Interim meteorology with ERA5 soil information. Furthermore, a 2-month spin-up period was needed for the ensemble without data assimilation, to have a good representation of the ensemble spread everywhere in the atmospheric domain.

6.1 Ensemble calibration

MONARCH uses a globally uniform, empirically constrained tuning (or calibration) factor for the total emitted dust mass, referred to as $d_{cal}$ . This factor varies according to the specified configuration settings for the simulation. In particular, it depends on the emission scheme and the meteorological initial and boundary conditions used to initialize the simulation. We calibrated six free-running experiments, which cover all the different combinations between the emission scheme and meteorological conditions. The calibration factors were obtained by rescaling initial values for the calibration factors, namely $d_{cal} (m)_{old}$ , by the ratio between the MODIS DB mean ${DOD}_{coarse}$ and the ensemble free-run mean ${DOD}_{coarse}$ calculated over the whole domain; i.e.

3 $\begin{aligned} d_{cal} (m)_{new} & = d_{cal} (m)_{old} \frac{{DOD}_{coarse, MODIS}}{{DOD}_{coarse, model}} \\ with m = 1, \dots, 6, \end{aligned}$ where $m$ indicates an ensemble member. We have repeated the estimation twice where the second simulation re-run has used the calibration factors estimated from the first run. The final estimated calibration factors for each of six ensemble members are reported in Table S2.

7 Reanalysis product description

The reanalysis data set consists of three-dimensional (3D) and two-dimensional (2D) variables (see Table ). The 3D, or upper-air, variables include dust mass concentration [ $kg m^{- 3}$ ] for each dust size bin, the dust extinction coefficient at 550 $nm$ [ $m^{- 1}$ ] integrated over all size bins and the height of the pressure level above sea level [ $m$ ]. The 2D variables are either surface fields or total column fields. The 2D variables for each dust size bin include dry and wet accumulated dust deposition over the previous 3 $h$ [ $kg m^{- 2} (3 h)^{- 1}$ ] and instantaneous total column dust load [ $kg m^{- 2}$ ], dust mass surface concentration [ $kg m^{- 3}$ ], DOD [unitless] and ${DOD}_{coarse}$ [unitless] at 550 $nm$ . The set of archived 2D variables is completed by the surface extinction coefficient at 550 $nm$ [ $m^{- 1}$ ], direct normal irradiance [ $W m^{- 2}$ ] and global horizontal irradiance [ $W m^{- 2}$ ]. These variables have been used to produce dust-relevant information for different sectors and related validation exercises . For example, a dust- ${PM}_{10}$ field has been derived from the 2D, bin-resolved dust mass surface concentration for air quality applications. This field will be used to evaluate the ability of the reanalysis to reproduce dust concentration values at the ground . Over Europe, the latter will be extracted from measured ${PM}_{10}$ values following a procedure similar to that described . Furthermore, visibility data from 3D dust-extinction coefficient fields have been used for aviation applications , while soiling index based on wet and dry dust deposition has been used to develop products for solar energy production .

Figure 4

Maps of mean 3-hourly first guess of wet and dry accumulated (over the previous 3 $h$ ) dust deposition [ $kg m^{- 2} (3 h)^{- 1}$ ] and analysis of total column dust load [ $kg m^{- 2}$ ] and of the dust surface extinction coefficient at 550 $nm$ [ $m^{- 1}$ ] calculated for the whole period (2007–2016). Model fields are the ensemble mean.

[Figure omitted. See PDF]

Both analysis and first-guess fields are available for the variables that are diagnosed from the state vector. As mentioned earlier, the first guesses are model forward simulations initialized with an analysis. When available, the analysis field is the recommended output for that variable. A set of ensemble statistics is calculated and archived for each output variable, namely the ensemble mean, standard deviation, maximum and median. The spread among the ensemble members, represented by the standard deviation with respect to the ensemble mean, can be interpreted as a measure for the uncertainty in the mean estimates. Figure shows the ensemble mean over the whole reanalysis period for the analysis or first guess of some of the 2D variables. While model fields have been produced on 40 vertical levels, the data are stored on 15 standard pressure levels between 1000 and 100 $hPa$ (i.e. 1000, 975, 900, 850, 750, 700, 600, 500, 400, 350, 300, 250, 175, 150, 100 $hPa$ ), which were defined taking into account regulatory standards in the aviation sector in view of end-user products developed from the reanalysis in this sector;. In that way we reduced storage space while easing the use of the vertical information.

The reanalysis data set is structured into individual Network Common Data Form (NetCDF) files per variable and type of ensemble statistics. Further details on the file structure of the data set are reported in Sect. , while the naming convention for the data set files and folders is explained in Appendix .

Figure 5

A number of desert, arid and semi-arid regions of interest for the description of the dust reanalysis. The underlying dust field is the mean 3-hourly DOD analysis calculated for the whole reanalysis period (2007–2016).

[Figure omitted. See PDF]

Figure 6

Maps of mean 3-hourly DOD first guess (first column), analysis (second column) and analysis increments (third column) calculated for the whole period (2007–2016; top row) and for different seasons (DJF, MAM, JJA, SON; rows 2 to 5). Model fields are the ensemble mean.

[Figure omitted. See PDF]

Figure 7

Maps of mean 3-hourly ${DOD}_{coarse}$ first guess (first column), analysis (second column) and MODIS DB assimilated observations (third column) calculated for the whole period (2007–2016; top row) and for different seasons (DJF, MAM, JJA, SON; rows 2 to 5). Model fields are the ensemble mean and are collocated with the observations.

[Figure omitted. See PDF]

8 DOD evaluation

In this section we validate the reanalysis DOD or ${DOD}_{coarse}$ in terms of data assimilation inner diagnostics (analysis increments and statistics of departures from assimilated observations) and verify it against independent ground-based observations. We also discuss the DOD spatial and temporal patterns over the reanalysis domain and period. Figure highlights the location of major dust source areas that will be used in the discussion. The verification of DOD and ${DOD}_{coarse}$ against long-term ground-based observations across the domain is a first step towards a more comprehensive evaluation of the reanalysis data set that is planned in follow-up papers , which include the comparison against independent sets of in situ, column-based and profile retrievals.

Figure 8

Maps of mean 3-hourly ${DOD}_{coarse}$ for first-guess departures (first column), analysis departures (second column), standard deviation of first-guess departures (third column) and standard deviation of analysis departures (fourth column) calculated for the whole period (2007–2016; top row) and for different seasons (DJF, MAM, JJA, SON; rows 2 to 5). Model fields are the ensemble mean and are collocated with the observations.

[Figure omitted. See PDF]

8.1 DOD geographical distribution

Figure shows the ensemble annual and seasonal mean DOD for the first guess (left column) and the analysis (central column) during the whole reanalysis period. In agreement with observations, the highest DOD values are placed over the major emission areas of the domain, in particular in the Bodélé Depression in Chad, the Erg Chech in Algeria, and the El Djouf between Mauritania and Mali, followed by the Arabian Desert; the Taklamakan Desert in northwest China; and the smaller areas of the Grand Erg Oriental in Algeria, the Grand Sand Sea between Libya and Egypt; and the Kharan Desert in southwestern Pakistan. Table reports the averaged DOD of first guess, analysis and analysis minus first guess (analysis increments) when calculated for the whole domain for the full period, for different seasons (DJF, MAM, JJA, SON) and for individual years.

The decadal mean analysis DOD (top row of Fig. ) is generally smaller than the first-guess DOD except in the Taklamakan and Thar deserts and in areas where the mean DOD is below 0.3. Therefore, on average, MONARCH emissions are likely too strong for the configurations used, although a potentially too weak deposition cannot be discarded. The latter is strongly dependent upon the emitted size distribution that evolves during transport.

Seasonal changes in the geographical distribution of the analysis mean DOD (rows 2 to 5 of Fig. ) are consistent with well-known patterns : (i) dust peaks everywhere during spring and summer, in particular, across the Taklamakan Desert during spring when more dust-generating cold fronts arrive in the area; (ii) dust from the south Sahara and Sahel is preferentially transported by northeasterly harmattan winds towards the Gulf of Guinea in winter and spring; (iii) the dust plume that originated in western Africa and is transported across the tropical North Atlantic is shifted towards northern latitudes in summer along with the Intertropical Convergence Zone (ITCZ; ); (iv) dust is strongly mobilized on the Arabian Peninsula and in the Tigris–Euphrates Basin in summer by the north-northwesterly shamal winds; (v) the lowest overall DOD is simulated everywhere in autumn.

8.2 DOD analysis increments

Figure also shows the difference between DOD analysis and first guess (namely analysis increments; right column) averaged over the full reanalysis period (top row). Non-zero systematic analysis increments are to be interpreted as systematic corrections to the model simulations and can serve as a proxy for model bias. By applying these corrections, the analysis improves the underlying model. The patterns of these systematic corrections vary with season and geographical location. While over the entire domain the mean analysis and first guess are comparable, the biggest systematic negative corrections (removing mass from the atmosphere) are linked to overestimation of sources' strengths in the Bodélé Depression in Chad; in the Saudi Arabia lowlands; and in the Balochistan region of southwestern Asia that extends over Iran, Afghanistan and Pakistan and contains, for example, the Kharan Desert. Negative mean increments are also present but to a lesser extent in other arid and semi-arid areas such as the Erg Chech in Algeria, the Great Sand Sea in Libya, the Nubian Desert in Sudan and eastwards of the Caspian Sea. Positive mean increments calculated for the whole reanalysis period are less widespread than the negative increments. The strongest values are over the Thar Desert, in the northern part of Syria, over a long stretch inland from the Mediterranean Sea in the north of Africa, and in the desert of El Djouf between Mauritania and Mali. All in all, as expected, the largest positive or negative analysis increments correspond to areas with more dust load, i.e. to source regions and their vicinity.

The patterns of the mean increments depend upon the season (see rows 2 to 5 of Fig. ). These patterns are clearly linked to the seasonal changes in dust activities in the different regions, as mean increments are, in absolute value, higher in the presence of high mean DODs compared to low DOD values. The areas that show the strongest seasonality with respect to the analysis increments are the Bodélé Depression and the Arabian and Taklamakan deserts. The overestimation of the Bodélé source strength in the first guess is more pronounced in winter and spring. In spring the emissions from the Taklamakan Desert, Syria and the northern part of the Arabian Desert are clearly underestimated, while in summer strong negative increments are present all over the Arabian Desert. Wide areas in the Sahara are affected by negative increments in the spring and summer. The Balochistan region and the Thar Desert show negative and positive increments, respectively, throughout the year, but their magnitudes are greater in spring and summer.

The patterns of the increments are consistent among the different years (Figs. S2 and S3) and vary mostly in the amplitude of the mean corrections, although there are some exceptions. Positive increments over the Thar Desert, northern Syria and the north of the Arabian Desert mainly appear in the first part of the reanalysis, between 2007 and 2012, in contrast to the small positive or even negative increments in the case of the Arabian Desert in the subsequent years. Strong negative increments east of the Caspian Sea are applied mainly through 2007 to 2010. Those yearly differences suggest changes, for example in land use, that are not captured by the model. Negative corrections in the west of the Sahara are more widespread in 2007 and 2008 than in other years due to the higher mean DOD during those 2 years.

8.3 Statistics of departures from assimilated observations

We compare here the reanalysis ${DOD}_{coarse}$ with the assimilated observations. Figure shows the ${DOD}_{coarse}$ for the observation-collocated ensemble mean first guess and analysis and for the assimilated observations averaged over the full reanalysis period (top row of Fig. ) and over the DJF, MAM, JJA and SON seasons (from the second to the fifth row of Fig. ). Table reports the corresponding values averaged over the whole domain for the full period, for different seasons and for individual years. By visual inspection, the analysis is closer to the assimilated observations in all the time periods considered, which constitutes a good sanity check for the assimilation scheme. This is also confirmed when the averages are calculated for individual years of the reanalysis period (Figs. S4 and S5). The seasonality in the model simulations closely resembles that in the observations, with MAM and JJA being the most active dust seasons.

Table 4

Averaged ${DOD}_{coarse}$ of observation-collocated first guess (fg), observation-collocated analysis (an) and assimilated MODIS DB retrievals for the full period (2007–2016), for different seasons (DJF, MAM, JJA, SON) and for individual years.

Period	Mean fg	Mean an	Mean MODIS DB
	${DOD}_{coarse}$	${DOD}_{coarse}$	${DOD}_{coarse}$
2007–2016	0.1914	0.1685	0.1912
DJF	0.1445	0.1374	0.1573
MAM	0.2323	0.2074	0.2337
JJA	0.2452	0.2073	0.2356
SON	0.1427	0.1228	0.1394
2007	0.1905	0.1644	0.1858
2008	0.2108	0.1877	0.2114
2009	0.1892	0.1682	0.1921
2010	0.1947	0.1772	0.202
2011	0.1894	0.1679	0.192
2012	0.195	0.1752	0.1989
2013	0.1832	0.1577	0.1771
2014	0.1768	0.1546	0.176
2015	0.1948	0.1697	0.1929
2016	0.1896	0.1621	0.1836

Table 5

Statistics (mean and SD) of departures of ${DOD}_{coarse}$ first guess (fg) and analysis (an) from assimilated observations calculated for the full period (2007–2016), for different seasons (DJF, MAM, JJA, SON) and for individual years. The number of observation counts is also reported.

Period	Observation	Mean of fg	Mean of an	SD of fg	SD of an
	counts	departures	departures	departures	departures
2007–2016	4.373 $\times$ 10 $^{8}$	0.0002	$-$ 0.0227	0.1841	0.0985
DJF	1.008 $\times$ 10 $^{8}$	$-$ 0.0128	$-$ 0.0198	0.1502	0.0827
MAM	1.033 $\times$ 10 $^{8}$	$-$ 0.0014	$-$ 0.0262	0.212	0.1125
JJA	1.157 $\times$ 10 $^{8}$	0.0096	$-$ 0.0283	0.2159	0.112
SON	1.176 $\times$ 10 $^{8}$	0.0033	$-$ 0.0166	0.1357	0.0744
2007	4.522 $\times$ 10 $^{7}$	0.0047	$-$ 0.0214	0.1863	0.0961
2008	4.398 $\times$ 10 $^{7}$	$-$ 0.0006	$-$ 0.0237	0.1928	0.1023
2009	4.274 $\times$ 10 $^{7}$	$-$ 0.0029	$-$ 0.0239	0.1858	0.0984
2010	4.464 $\times$ 10 $^{7}$	$-$ 0.0073	$-$ 0.0248	0.1828	0.1011
2011	4.43 $\times$ 10 $^{7}$	$-$ 0.0026	$-$ 0.0240	0.1825	0.1008
2012	4.417 $\times$ 10 $^{7}$	$-$ 0.0039	$-$ 0.0237	0.1913	0.1554
2013	4.299 $\times$ 10 $^{7}$	0.006	$-$ 0.0194	0.1767	0.0969
2014	4.41 $\times$ 10 $^{7}$	0.0008	$-$ 0.0214	0.1762	0.0924
2015	4.348 $\times$ 10 $^{7}$	0.0019	$-$ 0.0232	0.1924	0.1047
2016	4.17 $\times$ 10 $^{7}$	0.0059	$-$ 0.0215	0.1823	0.0952

Figure shows the mean (first and second column of Fig. ) and standard deviation (third and fourth column of Fig. ) of the first-guess and analysis ${DOD}_{coarse}$ departures (respectively) from assimilated observations averaged over the full reanalysis period (top row of Fig. ) and over the different four seasons (from the second to the fifth row of Fig. ). The corresponding values averaged over the whole domain are reported in Table , together with the number of observation counts and statistics calculated for individual years. The departure statistics, in particular the reduction in the standard deviation of the analysis departures compared to the first guess everywhere in the domain of interest, prove the consistency of our assimilation procedure. This is also the case on a seasonal and yearly basis (Figs. S6 and S7). With respect to the mean departures, the positive mean departures (model simulation minus observations) decrease considerably in the analysis compared to the first guess, while some of the negative mean departures remain unchanged in specific regions or seasons. The latter is the case for example in Europe and Russia, when considering the full reanalysis period or the different seasons. The aforementioned regions see on average much lower ${DOD}_{coarse}$ values than the rest of the domain and are analysed less efficiently. This is likely due to the ensemble not having a sufficient spread for low simulated concentrations. A similar issue was previously identified in other assimilation systems seetheir Sect. 4 and attributed to the fact that aerosol mass is a positive definite variable, which intrinsically deviates from the assumed Gaussian conditions in the prior in the analysis step. Negative mean departures are present in the sub-Sahel region and in particular over a stretch along the Gulf of Guinea in the summer season, with respect to both the first guess and the analysis. This might be due to the contamination of aerosols other than dust in the observational data set, which might be of anthropogenic or natural origin, e.g. aerosol produced by biomass burning in central Africa advected northwards . Due to the above regional or seasonal issues, the total bias (i.e. mean departure from the observations calculated over the entire domain) is higher in the analysis compared to the first guess.

8.4

Verification of DOD and ${DOD}_{coarse}$ against AERONET

We compare here the reanalysis DOD and ${DOD}_{coarse}$ with independent observations that have not been used in the assimilation process. We employed products from the Aerosol Robotic Network (AERONET) of ground-based Sun photometers .

Table 6

Dust filters applied to the AERONET retrievals.

Filter name	Dust condition	Dust presence	Non-dust presence
DOD-dust1	Pure dust	DOD $=$ AOD when AE $<$ 0.40	–
DOD-dust2	Pure dust	DOD $=$ AOD when AE $<$ 0.60	–
DOD-mixed1	Mixed dust	DOD $=$ AOD when AE $<$ 0.75	–
DOD-mixed2	Mixed dust	DOD $=$ AOD when AE $<$ 0.75	DOD $=$ 0 when AE $>$ 1.2
${DOD}_{coarse}$	Pure dust	${DOD}_{coarse}$ $=$ ${AOD}_{coarse}$	–

Figure 9

Spatial distribution of the AERONET sites used in this study. The different colours indicate the sub-regions considered in the discussion of the results: northwestern Africa (green), the Middle East (red) and the Mediterranean (blue). When validating the NAMEE region, all the sites are considered, including the ones labelled “Others” (black). © OpenStreetMap contributors 2022. Distributed under the Open Data Commons Open Database License (ODbL) v1.0; © CARTO (https://carto.com/attribution/, last access: 8 June 2022).

[Figure omitted. See PDF]

Figure 10

Density scatter plots of the reanalysis (MOD) DOD and ${DOD}_{coarse}$ versus AERONET (OBS) dust-filtered AOD, DOD-mixed2 (a–c) and ${DOD}_{coarse}$ (d–f), during the whole reanalysis period (2007–2016). The results are calculated for the NAMEE domain and for different time basis: 3-hourly (a, d), daily (b, e) and monthly (c, f). The dust filters applied to the AERONET observations are described in Table . The bin size is 0.01.

[Figure omitted. See PDF]

Figure 11

Density scatter plots of the reanalysis (MOD) DOD and ${DOD}_{coarse}$ versus AERONET (OBS) dust-filtered AOD, DOD-mixed2 (a–c) and ${DOD}_{coarse}$ (d–f), during the whole reanalysis period (2007–2016) and on a 3-hourly basis. The results are calculated for three different sub-regions of the reanalysis domain: northwestern Africa (a, d), the Middle East (b, e) and the Mediterranean (c, f). The dust filters applied to the AERONET observations are described in Table . The bin size is 0.01.

[Figure omitted. See PDF]

Table 7

Verification statistics ( $r$ , RMSE, MB and MFB) and number of samples (NDATA) for the reanalysis versus AERONET AODs for the entire period (2007–2016) and NAMEE region and for northwestern Africa, the Middle East and Mediterranean regions. AERONET version 3, cloud-screened, 3-hourly, dust-filtered AOD and ${AOD}_{coarse}$ comprise the reference. The definition of each of the DOD filters is in Table .

	DOD-dust1	DOD-dust2	DOD-mixed1	DOD-mixed2	${DOD}_{coarse}$
NAMEE region
NDATA	68 493	99 821	122 145	260 622	242 582
$r$	0.74	0.75	0.76	0.82	0.81
RMSE	0.25	0.22	0.21	0.15	0.09
MB	$-$ 0.10	$-$ 0.10	$-$ 0.10	$-$ 0.04	$-$ 0.03
MFB	$-$ 0.43	$-$ 0.51	$-$ 0.58	0.79	$-$ 0.91
Northwestern Africa
NDATA	40 240	51 299	56 882	59 295	31 553
$r$	0.76	0.77	0.77	0.78	0.81
RMSE	0.26	0.25	0.24	0.23	0.15
MB	$-$ 0.10	$-$ 0.10	$-$ 0.10	$-$ 0.09	$-$ 0.05
MFB	$-$ 0.35	$-$ 0.40	$-$ 0.43	$-$ 0.33	$-$ 0.50
Middle East
NDATA	15 281	23 659	29 826	41 123	49 526
$r$	0.66	0.67	0.68	0.72	0.73
RMSE	0.24	0.22	0.21	0.19	0.13
MB	$-$ 0.10	$-$ 0.09	$-$ 0.09	$-$ 0.04	$-$ 0.01
MFB	$-$ 0.27	$-$ 0.29	$-$ 0.30	0.33	$-$ 0.04
Mediterranean Basin
NDATA	9415	17 593	24 487	89 887	90 451
$r$	0.57	0.61	0.64	0.75	0.72
RMSE	0.21	0.18	0.17	0.09	0.07
MB	$-$ 0.12	$-$ 0.11	$-$ 0.10	$-$ 0.02	$-$ 0.03
MFB	$-$ 0.68	$-$ 0.78	$-$ 0.88	1.22	$-$ 1.14

8.4.1 Verification methodology

We used AERONET version 3 quality-assured data. On the one hand, the modelled ${DOD}_{coarse}$ at 550 $nm$ is compared with coarse-mode AOD retrievals at 500 $nm$ from the spectral de-convolution algorithm SDA;. In general, ${AOD}_{coarse}$ is dominated by maritime/oceanic aerosols and desert dust. However, sea salt is usually associated with low AOD $<$ 0.03; and mainly affects coastal stations, and therefore inland high ${AOD}_{coarse}$ values can be assumed to be mineral dust. On the other hand, the modelled DOD at 550 $nm$ is compared with dust-filtered AOD values from the direct-Sun algorithm . We used direct-Sun AOD retrievals between 440 and 870 $nm$ to obtain the AOD at 550 $nm$ using the Ångström law. Dust-dominated conditions are identified using a specific set of dust filters based on the AERONET Ångström exponent (AE). The AE is inversely related to the average size of the particles: the smaller the particles, the larger the AE. The AE ranges normally from 4, corresponding to pure molecular extinction, down to close to null values, corresponding to extinction dominated by coarse-mode aerosols (sea salt and mineral dust), producing a spectral neutral AOD . Values of AE $>$ 1.2 typically indicate a significant presence of fine-mode particles biomass burning or urban aerosols;. Quantitative evaluations of the modelled DOD are conducted for dust-dominated conditions based on four different AE filters (Table ) where AE ranges from typical desert dust source values (AE $<$ 0.4) to values characteristic of dust long-range transport conditions (AE $<$ 0.75). Additionally, for one of the filter methods (namely DOD-mixed2), DOD is assumed to be 0 when the AE is greater than 1.2. These dust filters roughly represent “pure” desert dust conditions (i.e. DOD-dust1, DOD-dust2 and ${DOD}_{coarse}$ ) and long-range transport (i.e. mixed) dust conditions (i.e. DOD-mixed1 and DOD-mixed2).

We focus our verification on the NAMEE region. We used data from 140 AERONET stations (Fig. ; see also Tables S3 to S8 for the list of AERONET sites used), which include all the available AERONET sites in the NAMEE domain providing observations during the period of 2007–2016, with the exception of those sites that are at high altitudes (i.e. altitudes greater than about 1850 $m$ above sea level). Results are presented for different sub-regions, namely the Middle East, northwestern Africa and the Mediterranean, and for all available AERONET stations including those sites outside the three above-mentioned regions, and they are depicted in Fig. . Model values are ensemble mean analysis fields.

AERONET measurements are nominally taken at 15 $\min$ intervals. Here we average observations within $\pm$ 30 $\min$ of the 3-hourly model output times. These averaged observations are used to evaluate the model on a 3-hourly, daily and monthly basis. For the daily and monthly average evaluation, only coincident 3-hourly model output and observations are used. We use verification statistics such as the Pearson correlation coefficient ( $r$ ), mean bias (MB), root mean square error (RMSE) and mean fractional bias (MFB) (see Appendix ) to measure the skill of the model when performing diagnostic analyses of DOD and ${DOD}_{coarse}$ where AERONET sites are located.

8.4.2 Comparison with 3-hourly, daily and monthly reference data

Overall, the dust reanalysis can reproduce the 3-hourly, daily and monthly observed variability with Pearson correlation coefficients ranging from 0.74 and 0.82, depending on the dust filter, for 3-hourly DOD to up to 0.92 for monthly ${DOD}_{coarse}$ . The reanalysis tends to underestimate the DOD and ${DOD}_{coarse}$ compared to AERONET observations (see Fig. ). The model results are dominated by the results in northwestern Africa, and the largest relative underestimations are observed in the Mediterranean and the Middle East (Fig. ), likely because of marine aerosols at these sites. Therefore some model underestimation is expected, in particular in proximity to coastal stations or when mixtures of aerosols are present .

Table 8

Same as Table but using daily reanalysis values and AERONET dust-filtered AOD and ${AOD}_{coarse}$ averaged on a daily basis as reference.

	DOD-dust1	DOD-dust2	DOD-mixed1	DOD-mixed2	${DOD}_{coarse}$
NAMEE region
NDATA	28 662	40 767	49 551	49 551	87 007
$r$	0.77	0.79	0.79	0.79	0.84
RMSE	0.23	0.21	0.20	0.20	0.09
MB	$-$ 0.11	$-$ 0.10	$-$ 0.10	$-$ 0.10	$-$ 0.03
MFB	$-$ 0.50	$-$ 0.59	$-$ 0.67	$-$ 0.67	$-$ 0.94
Northwestern Africa
NDATA	15 996	19 702	21 590	21 590	11 563
$r$	0.78	0.79	0.79	0.79	0.83
RMSE	0.25	0.24	0.23	0.23	0.15
MB	$-$ 0.10	$-$ 0.10	$-$ 0.10	$-$ 0.10	$-$ 0.06
MFB	$-$ 0.37	$-$ 0.42	$-$ 0.46	$-$ 0.46	$-$ 0.54
Middle East
NDATA	5930	8822	10 866	10 866	17 080
$r$	0.70	0.72	0.72	0.72	0.78
RMSE	0.22	0.20	0.19	0.19	0.12
MB	$-$ 0.11	$-$ 0.10	$-$ 0.09	$-$ 0.09	$-$ 0.02
MFB	$-$ 0.28	$-$ 0.29	$-$ 0.31	$-$ 0.31	$-$ 0.06
Mediterranean Basin
NDATA	4637	8199	11 192	11 192	32 439
$r$	0.62	0.65	0.67	0.67	0.75
RMSE	0.20	0.17	0.16	0.16	0.06
MB	$-$ 0.12	$-$ 0.11	$-$ 0.10	$-$ 0.10	$-$ 0.03
MFB	$-$ 0.78	$-$ 0.90	$-$ 1.00	$-$ 1.00	$-$ 1.15

Table 9

Same as Table but using monthly reanalysis values and AERONET dust-filtered AOD and ${AOD}_{coarse}$ averaged on a monthly basis as reference.

	DOD-dust1	DOD-dust2	DOD-mixed1	DOD-mixed2	${DOD}_{coarse}$
NAMEE region
NDATA	4097	5011	5439	5439	4667
$r$	0.77	0.78	0.80	0.80	0.92
RMSE	0.19	0.17	0.15	0.15	0.05
MB	$-$ 0.12	$-$ 0.11	$-$ 0.11	$-$ 0.11	$-$ 0.03
MFB	$-$ 0.77	$-$ 0.83	$-$ 0.88	$-$ 0.88	$-$ 0.80
Northwestern Africa
NDATA	1181	1219	1231	1231	611
$r$	0.81	0.84	0.85	0.85	0.92
RMSE	0.19	0.17	0.16	0.16	0.09
MB	$-$ 0.11	$-$ 0.11	$-$ 0.10	$-$ 0.10	$-$ 0.06
MFB	$-$ 0.39	$-$ 0.41	$-$ 0.43	$-$ 0.43	$-$ 0.41
Middle East
NDATA	691	798	837	837	783
$r$	0.73	0.72	0.75	0.75	0.86
RMSE	0.17	0.15	0.13	0.13	0.07
MB	$-$ 0.10	$-$ 0.09	$-$ 0.08	$-$ 0.08	$-$ 0.02
MFB	$-$ 0.30	$-$ 0.32	$-$ 0.33	$-$ 0.33	$-$ 0.04
Mediterranean Basin
NDATA	1328	1750	1942	1942	1774
$r$	0.63	0.62	0.67	0.67	0.84
RMSE	0.18	0.16	0.14	0.14	0.04
MB	$-$ 0.13	$-$ 0.12	$-$ 0.11	$-$ 0.11	$-$ 0.03
MFB	$-$ 0.89	$-$ 0.95	$-$ 1.00	$-$ 1.00	$-$ 0.91

Tables to present the verification statistics on a 3-hourly, daily and monthly basis when calculated using the five dust-filtered reference data sets. The stricter the dust filter, the lower the correlation coefficient.

Figure 12

Maps of verification statistics ( $r$ , MB, RMSE, MFB, from top to bottom) of the analysis DOD (first and second columns) and ${DOD}_{coarse}$ (third column) versus AERONET dust-filtered AOD (Table ): DOD-mixed1 (left), DOD-mixed2 (middle) and ${DOD}_{coarse}$ (right) calculated for the whole period (2007–2016). The results are obtained using 3-hourly collocated reanalysis and observation values (see also Table ).

[Figure omitted. See PDF]

The verification results calculated using the DOD-mixed2 dust filter are comparable to those obtained with the ${DOD}_{coarse}$ reference data set in terms of correlation (0.82 versus 0.81 for the entire region) and MB ( $-$ 0.04 versus $-$ 0.03). When considering regional results, the use of the DOD-mixed2 dust filter shows a reduction in the MB together with an increase in MFB in the Mediterranean region (Fig. ). This is directly related to the assumption DOD $=$ 0 for AE $>$ 1.2 (Table ), which increases the number of collocations particularly in the Mediterranean, where the presence of dust is sporadic. This is confirmed by the comparison with the results obtained with the DOD-mixed1 filter where this condition is neglected (see Fig. ). The RMSE obtained with DOD-mixed2 and ${DOD}_{coarse}$ reference data shows a clear north-to-south gradient that scales with dust concentrations with maximum values over sources (in northwestern Africa and the Middle East, RMSE $>$ 0.12) and minimum values in the Mediterranean (RMSE $<$ 0.12).

Figure 13

Time series of monthly verification statistics ( $r$ , MB, RMSE, MFB) and number of samples (NDATA) for the reanalysis DOD and ${DOD}_{coarse}$ versus dust-filtered AERONET observations for the period 2007–2016 for the NAMEE domain. Different colours are associated with the results obtained with the different dust filters: DOD-dust1, DOD-dust2, DOD-mixed1, DOD-mixed2 and ${DOD}_{coarse}$ . The definition of each dust filter is reported in Table . The results are obtained using 3-hourly collocated model and observation values.

[Figure omitted. See PDF]

Monthly DOD and ${DOD}_{coarse}$ verification statistics are sensitive to the number of AERONET observations, as shown in Fig. . A clear seasonal trend is identified with lower performance in the cloudy winter season than in summer when clear skies are more frequent. Time series of the verification statistics for ${DOD}_{coarse}$ show a change after 2011, with reductions in MB and RMSE in comparison to previous years. Also the MFB is closer to the MFB from the different dust filters (see Fig. ). This change is associated with a decrease in ${DOD}_{coarse}$ in the Mediterranean region (not shown here) that is captured by the reanalysis. Underestimations are observed in northwestern Africa and the Mediterranean regions when the DOD-mixed2 and ${DOD}_{coarse}$ data are used as reference. In summertime in northwestern Africa, we find the largest underestimations (monthly MB $<$ $-$ 0.10 for DOD-mixed2 and ${DOD}_{coarse}$ ). These underestimations are likely related to strong dust outbreaks associated with mesoscale convective systems (called haboobs) that the model is not able to capture. In the Middle East, the model shows a systematic underestimation when compared to DOD-mixed2 and ${DOD}_{coarse}$ reference data, although some overestimation in particular years (2011–2012) is observed. The observed DOD underestimations in comparison with AERONET in the Middle East can be partly attributed to a poor representation of small-scale emission processes such as the wind peak associated with the breakdown of the nocturnal low-level jet, the meteorological effects of orography, sea breezes and cold pools .

Overall, the comparison with AERONET observations shows a good performance of the reanalysis in reproducing the spatial and temporal distribution of mineral dust aerosols over the entire domain and for the 10-year period.

9 Data availability

The reanalysis data set is distributed via Thematic Real-time Environmental Distributed Data Services (THREDDS) at BSC and is made freely available at http://hdl.handle.net/21.12146/c6d4a608-5de3-47f6-a004-67cb1d498d98 (last access: 10 June 2022). The data set (78 $TB$ in size) is structured into individual NetCDF files per geophysical variable and type of ensemble statistics (ensemble mean, standard deviation, maximum and median). Each individual file covers the time period of one assimilation window (24 $h$ ) and contains 8 time steps at a 3-hourly time frequency starting at 03:00 UTC. The files are organized into folders, where each folder contains the files relative to the whole reanalysis period (10 years) for a given variable and type of statistics.

10 Conclusions and further perspectives

A regional dust reanalysis has been produced using the MONARCH chemical weather prediction system and satellite retrievals of ${DOD}_{coarse}$ based on MODIS Aqua DB AOD at 550 $nm$ . The reanalysis data set spans the period 2007–2016 at a horizontal resolution of 0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude in a rotated grid and a temporal resolution of 3 $h$ . The reanalysis covers a regional domain centred around Northern Africa, the Middle East and Europe (NAMEE region) that also includes parts of central Asia and the Atlantic and Indian oceans. This paper describes the modelling, observational and assimilation aspects related to the production of the reanalysis, whose unprecedentedly high resolution has required the use of advanced archiving and computing strategies, which are also described in the paper (see Appendix ). The assimilated observations have provided a total column optical constraint on the coarse fraction of dust particles over land, in cloud- and snow-free conditions, and in the daytime, with one satellite overpass per day. Analysis increments were estimated over the whole assimilation window through the 4D implementation of the assimilation algorithm and, to a certain extent (limited by the observation radius of influence), also over sea according to the model background spatial covariance. Re-partitions of analysis increments in the vertical dimension of the model control vector and across the individual model coarse size bins have relied on the model background.

The seasonal changes in the spatial distribution of the dust reanalysis are well characterized and follow well-known, region-dependent dust cycle features controlled by seasonal changes in meteorology (mainly surface winds but also precipitation) and in vegetation cover. The most prominent seasonal features that stand out in the reanalysis are the mobilization of dust during the so-called Asian dust events in the Taklamakan region in spring and by north-northwesterly shamal winds on the Arabian Peninsula and in the Tigris–Euphrates Basin during summer, the transport of south Saharan dust southwest towards the Gulf of Guinea by northeasterly harmattan trade winds during winter and spring, and the northward shift of the plume extending from western Africa over the tropical Atlantic during summer due to movements of the ITCZ.

Diagnostics based on departures of first guess and analysis from assimilated observations provided a sanity check for the quality of our assimilation procedure. As expected, the analysis is statistically closer to the assimilated observations than the first guess. The mean departures are larger in the analysis than in the first guess only in specific regions and seasons, which can be explained by the contamination of aerosols other than dust in the observational data set (for example, biomass burning aerosols produced by fires in central Africa that are advected further north during summer) or by the presence of fairly low ${DOD}_{coarse}$ values (mainly over Europe and western Russia) that are not analysed as efficiently by the assimilation scheme as the higher ${DOD}_{coarse}$ values. Mean analysis increments suggest seasonally dependent model biases that follow seasonal dust changes. By applying these corrections, the analysis improves the underlying model. Overall, the spatial distribution of the analysis increments over source regions, as well as in their proximity, highlights the pivotal role of the MODIS DB retrievals in providing an observational constraint over the most critical regions, confirming what previous studies have shown .

The reanalysis DOD and ${DOD}_{coarse}$ have been validated with highly accurate ground-truth measurements from AERONET on a 3-hourly, daily and monthly basis and with the application of specific dust filters to the reference products or the use of the coarse-mode AOD product. When the latter is used as reference, a Pearson correlation coefficient as high as 0.81 with a MB of $-$ 0.05 and a RMSE of 0.12 are estimated when considering the whole reanalysis period and 3-hourly AERONET retrievals. This confirms the good accuracy of the reanalysis data set and its suitability to be used in specific air quality/health and climate service applications. By extending the existing observation-based information intended for mineral dust monitoring, this reanalysis will allow a better quantification of dust impacts upon key sectors of society and the economy. This makes the data set a potentially useful tool in support of climate research and service, including the support to operational early warning systems and to the development of mitigation strategies.

This desert dust reanalysis data set is intended to be the first major endeavour towards the production of BSC aerosol reanalyses over regional or global domains. Extensions of the data set are planned for the near future. A series of companion papers will provide a more comprehensive evaluation of the reanalysis, an analysis of inter-annual variability and trends, and a description of the data set's application in dust-tailored services.

Appendix A Folder and file naming convention of the reanalysis data set

As described in Sect. , the reanalysis data set is structured into individual files per variable and type of ensemble statistics (e.g. an individual file contains the ensemble mean analysis of DOD at 550 $nm$ ). The filenames include the following terms separated by the underscore sign: the short name of the variable (as reported in Table ); the initial date and time of the data included in the file; a suffix from among av, max, median and std indicating the ensemble mean, max, median and standard deviation, respectively, for that variable and optionally the label an for the variables for which an analysis field is produced. When the latter label is not present, the fields are model first guess. The filenames end with the extension suffix nc identifying NetCDF files. Each individual file contains eight time steps at a 3-hourly time frequency starting at 03:00 UTC. Therefore, for example, the filename for the ensemble mean analysis of DOD at 550 $nm$ for a given date is od550du_YYYYMMDDHH_av_an.nc, where od550du is the variable short name and YYYYMMDDHH can take values from 2007010103 to 2016123103. The files are organized into folders containing the whole 10-year period for a given variable and type of statistics. Each folder is named with the variable short name followed by the hyphen sign and the suffix indicating the type of ensemble statistics and optionally by the label an preceded by the underscore sign for the variables for which there is an analysis field. Hence the folder containing the files of the example above is named od550du-av_an, while the corresponding ensemble mean first guess data are stored in the folder named od550du-av. To follow on with the same example, the ensemble mean analysis of DOD at 550 $nm$ for 9 July 2012 can be found in the file path od550du-av_an/od550du_2012070903_av_an.nc.

Appendix B Verification metrics

The definitions of the verification metrics used in this study are reported in Table .

Table B1

Definitions of the verification statistics used in the study. $o_{i}$ and $c_{i}$ are the observed and the modelled concentrations at time and location $i$ , respectively; $\overline{o}$ and $\overline{c}$ are their averages; $n$ is the number of data.

Statistic parameter	Formula
Pearson correlationcoefficient ( $r$ )	$r = \frac{\sum_{i = 1}^{n} (c_{i} - \overline{c}) \cdot (o_{i} - \overline{o})}{\sqrt{\sum_{i = 1}^{n} (c_{i} - \overline{c})^{2}} \cdot \sqrt{\sum_{i = 1}^{n} (o_{i} - \overline{o})^{2}}}$
Mean bias (MB)	$MB = \frac{1}{n} \sum_{i = 1}^{n} (c_{i} - o_{i})$
Root mean squareerror (RMSE)	$RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (c_{i} - o_{i})^{2}}$
Mean fractionalbias (MFB)	$MFB = \frac{2}{n} \sum_{i = 1}^{n} (\frac{c_{i} - o_{i}}{c_{i} + o_{i}})$

Appendix C Simulation workflow

The reanalysis has been run on the BSC high-performance computing (HPC) infrastructure using the Autosubmit workflow manager , a Python-based tool to create, manage and monitor experiments running on one or multiple remote computing clusters or HPC via the Secure Shell protocol. Scripts and templates to use Autosubmit were developed specifically for the reanalysis to be able to easily run and monitor long simulations by using the BSC HPC resources and store their results in the BSC archive. Autosubmit handles the job submission of the different workflow steps automatically, taking into account interruptions and failures. A functionality to wrap the 12 daily model simulations (each using 768 computing cores) and the data assimilation calculations (using 576 cores) was used to minimize the queuing times. This allows processing a number of days in a row and increasing the parallelism since a single job allocates the total sum of computing nodes required (i.e. 9792 cores were reserved by each wrapper). The number of computing cores for each job has been estimated to balance the execution time of a model simulation and of the assimilation for these jobs to share the computing resources. Additionally, the jobs of two post-processing steps were also wrapped together with the simulation and data assimilation jobs. The post-processing steps are needed to compress and reduce the original model and assimilation output and for the calculation of some basic ensemble statistics (to be provided to the users as main output of the reanalysis) before the final output is transferred to a long-term archive. The job wrapper has been designed with a crossing-date strategy to run two different starting dates within the same experiment. This is done so that the model ensemble simulations from the first starting date can be run in parallel with the assimilation job of the second starting date. This choice of design was made since model ensemble simulations and data assimilation calculations from the same date cannot be run simultaneously due to the obvious dependency of the data assimilation job on the model output. This novel workflow design was developed specifically for the production of the dust reanalysis and has proven successful for such highly computationally intensive calculations.

The supplement related to this article is available online at: https://doi.org/10.5194/essd-14-2785-2022-supplement.

Author contributions

CPGP, SB, EDT, OJ and JE designed the study and discussed the main results. FBa, EC, PF, LM, MM, AV, EW, MK and MG contributed to the discussions. EDT, JE, FM and NS developed and maintained the data assimilation code. PG prepared the assimilated observations. OJ led the MONARCH developments with contributions from CPGP, MG, MK, VO, JE and EDT. SB and CPGP carried out the independent validation, while EDT and JE performed the validation against assimilated observations. FM, MC, GMP, EDT and MO contributed to the development of the computing workflow. FM, AB, GMP and EDT ran the simulations. AB, SB, EDT, PAB and FBe performed the data set quality check and storage. EDT, CPGP and SB wrote the manuscript. All authors commented on the manuscript.

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

The BSC co-authors acknowledge PRACE (eDUST, eFRAGMENT1 and eFRAGMENT2) and RES (AECT-2019-3-0001, AECT-2020-1-0007, AECT-2020-3-0013) for awarding access to MareNostrum at the BSC and for providing technical support. The authors thank all the principal investigators and their staff for establishing and maintaining the NASA and PHOTONS AERONET sites and the MODIS mission scientists and associated NASA personnel for the production of the data used in this study. The authors also thank John P. Dunne, Julie Letertre-Danczak and the anonymous referee, who provided constructive comments that improved the manuscript.

Financial support

This research has been supported by the DustClim project, which is part of ERA4CS, an ERA-NET programme co-funded by the European Union's Horizon 2020 research and innovation programme (grant no. 690462); the European Research Council (FRAGMENT (grant no. 773051)); grant no. RYC-2015-18690 funded by MCIN/AEI/10.13039/501100011033 and ESF Investing in your future; grant no. CGL2017-88911-R funded by MCIN/AEI/10.13039/501100011033 and ERDF A way of making Europe; the AXA Research Fund (AXA Chair on Sand and Dust Storms); the European Commission, Horizon 2020 Framework Programme (grant no. 792103 (SOLWARIS)); and ATMO-ACCESS (Access to Atmospheric Research Facilities) funded in the frame of the programme H2020-EU.1.4.1.2 (grant no. 101008004, 1 April 2021–31 March 2025). Jerónimo Escribano and Martina Klose have received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreements H2020-MSCA-COFUND-2016-754433 and H2020-MSCA-IF-2017-789630, respectively. Martina Klose received further support through the Helmholtz Association's Initiative and Networking Fund (grant no. VH-NG-1533). This work has been partially funded by the contribution agreement between AEMET and BSC to carry out development and improvement activities of the products and services supplied by the World Meteorological Organization (WMO) Barcelona Dust Regional Center (i.e. the WMO Sand and Dust Storm Warning Advisory and Assessment System (SDS-WAS) Regional Center for Northern Africa, the Middle East and Europe).

Review statement

This paper was edited by Nellie Elguindi and reviewed by Julie Letertre-Danczak and one anonymous referee.

Word count: 12547

Show less

© 2022. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Translate

One of the challenges in studying desert dust aerosol along with its numerous interactions and impacts is the paucity of direct in situ measurements, particularly in the areas most affected by dust storms. Satellites typically provide column-integrated aerosol measurements, but observationally constrained continuous 3D dust fields are needed to assess dust variability, climate effects and impacts upon a variety of socio-economic sectors. Here, we present a high-resolution regional reanalysis data set of desert dust aerosols that covers Northern Africa, the Middle East and Europe along with the Mediterranean Sea and parts of central Asia and the Atlantic and Indian oceans between 2007 and 2016. The horizontal resolution is 0.1 $^{\circ}$ latitude $\times$ 0.1 $^{\circ}$ longitude in a rotated grid, and the temporal resolution is 3 $h$ . The reanalysis was produced using local ensemble transform Kalman filter (LETKF) data assimilation in the Multiscale Online Nonhydrostatic AtmospheRe CHemistry model (MONARCH) developed at the Barcelona Supercomputing Center (BSC). The assimilated data are coarse-mode dust optical depth retrieved from the Moderate Resolution Imaging Spectroradiometer (MODIS) Deep Blue Level 2 products. The reanalysis data set consists of upper-air variables (dust mass concentrations and the extinction coefficient), surface variables (dust deposition and solar irradiance fields among them) and total column variables (e.g. dust optical depth and load). Some dust variables, such as concentrations and wet and dry deposition, are expressed for a binned size distribution that ranges from 0.2 to 20 $µ m$ in particle diameter. Both analysis and first-guess (analysis-initialized simulation) fields are available for the variables that are diagnosed from the state vector. A set of ensemble statistics is archived for each output variable, namely the ensemble mean, standard deviation, maximum and median. The spatial and temporal distribution of the dust fields follows well-known dust cycle features controlled by seasonal changes in meteorology and vegetation cover. The analysis is statistically closer to the assimilated retrievals than the first guess, which proves the consistency of the data assimilation method. Independent evaluation using Aerosol Robotic Network (AERONET) dust-filtered optical depth retrievals indicates that the reanalysis data set is highly accurate (mean bias $=$ $-$ 0.05, RMSE $=$ 0.12 and $r$ $=$ 0.81 when compared to retrievals from the spectral de-convolution algorithm on a 3-hourly basis). Verification statistics are broadly homogeneous in space and time with regional differences that can be partly attributed to model limitations (e.g. poor representation of small-scale emission processes), the presence of aerosols other than dust in the observations used in the evaluation and differences in the number of observations among seasons. Such a reliable high-resolution historical record of atmospheric desert dust will allow a better quantification of dust impacts upon key sectors of society and economy, including health, solar energy production and transportation. The reanalysis data set is distributed via Thematic Real-time Environmental Distributed Data Services (THREDDS) at BSC and is freely available athttp://hdl.handle.net/21.12146/c6d4a608-5de3-47f6-a004-67cb1d498d98 (last access: 10 June 2022).

Details

Title

The MONARCH high-resolution reanalysis of desert dust aerosol over Northern Africa, the Middle East and Europe (2007–2016)

Author

Enza Di Tomaso¹; Escribano, Jerónimo¹

; Basart, Sara¹

; Ginoux, Paul²

; Macchia, Francesca¹

; Barnaba, Francesca³

; Benincasa, Francesco¹; Pierre-Antoine Bretonnière¹

; Buñuel, Arnau¹; Castrillo, Miguel¹

; Cuevas, Emilio⁴

; Formenti, Paola⁵

; Gonçalves, María⁶

; Jorba, Oriol¹

; Klose, Martina⁷

; Lucia, Mona⁸; Gilbert Montané Pinto¹; Mytilinaios, Michail⁸

; Obiso, Vincenzo⁹; Olid, Miriam¹; Schutgens, Nick¹⁰

; Votsis, Athanasios¹¹

; Werner, Ernest¹²; Carlos Pérez García-Pando¹³

¹ Barcelona Supercomputing Center (BSC), Barcelona, Spain
² NOAA Geophysical Fluid Dynamics Laboratory, Princeton, New Jersey, USA
³ Consiglio Nazionale delle Ricerche–Istituto di Scienze dell'Atmosfera e del Clima (CNR–ISAC), Rome, Italy
⁴ Izaña Atmospheric Research Center (IARC), Agencia Estatal de Meteorología (AEMET), Santa Cruz de Tenerife, Spain
⁵ Université Paris Cité and Univ Paris-Est Créteil, CNRS, LISA, 75013 Paris, France
⁶ Barcelona Supercomputing Center (BSC), Barcelona, Spain; Department of Project and Construction Engineering, Universitat Politècnica de Catalunya – BarcelonaTech (UPC), Terrassa, Spain
⁷ Barcelona Supercomputing Center (BSC), Barcelona, Spain; Department Troposphere Research, Institute of Meteorology and Climate Research (IMK-TRO), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
⁸ Consiglio Nazionale delle Ricerche–Istituto di Metodologie per l'Analisi Ambientale (CNR–IMAA), Tito Scalo (PZ), Italy
⁹ Barcelona Supercomputing Center (BSC), Barcelona, Spain; now at: NASA Goddard Institute for Space Studies (GISS), New York, New York, USA
¹⁰ Department of Earth Sciences, Vrije Universiteit Amsterdam, 1081 HV Amsterdam, the Netherlands
¹¹ Section of Governance and Technology for Sustainability (BMS-CSTM), University of Twente, Enschede, the Netherlands; Weather and Climate Change Impact Research, Finnish Meteorological Institute (FMI), Helsinki, Finland
¹² Agencia Estatal de Meteorología (AEMET), Barcelona, Spain
¹³ Barcelona Supercomputing Center (BSC), Barcelona, Spain; ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, Spain

Pages

2785-2816

Publication year

2022

Publication date

2022

Publisher

Copernicus GmbH

ISSN

18663508

e-ISSN

18663516

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5194/essd-14-2785-2022

ProQuest document ID

2678641803

The MONARCH high-resolution reanalysis of desert dust aerosol over Northern Africa, the Middle East and Europe (2007–2016)

Jump to:

Full text

Abstract

Details

Suggested sources