1 Introduction
The atmosphere is essential to the environment and to life on Earth. Any change in its natural composition caused by pollutants, such as gases or aerosols released directly from natural or anthropogenic sources, can therefore dramatically influence the Earth's climate and biosphere, human life and health, and economic activities. Both short-term exposure to very high concentrations and prolonged exposure to lower concentrations are significant risk factors for human health. A major current concern is air quality in urban areas, where prolonged population exposure to gaseous pollutants such as nitrogen dioxide (NO2) and to particulate matter (PM2.5 and PM10) poses significant health risks. Numerous epidemiological studies have related short- and long-term PM10 and NO2 exposure to mortality and morbidity. Short-term exposure to high concentrations of pollutants can be related to both minor discomfort, such as irritation of the eyes, respiratory tract, or skin, and serious conditions, such as asthma, pneumonia, bronchitis, chronic obstructive pulmonary disease and heart problems. Furthermore, years of continuous exposure to PM have been shown to be associated with both newborn mortality and cardiovascular disorders. A PM2.5 concentration increase of 10 µg m−3 was associated with an increase of 0.67 %–1.04 % in all-cause mortality, 0.52 % in cardiovascular hospital admissions and 1.74 % in respiratory admissions. A PM10 concentration increase of 10 µg m−3 was associated with a 43 % increase in fatal coronary heart disease and a 39.31 % increase in deaths from cardiovascular diseases from short-term exposure. A smaller impact is foreseen for short-term exposure to NO2, where a 10 ppb increase in concentration was associated with a 0.19 % increase in all-cause mortality in the USA.
Despite its critical impact, air pollution information in urban areas is not always available, or is not available at an appropriate spatial resolution, hindering effective air quality management. High-resolution air quality maps are pivotal for environmental stewardship and public awareness because they fill gaps in our understanding of urban air quality. These maps can help to identify pollution hotspots, offering new opportunities for pollution mitigation strategies and influencing both policy and individual behavior.
Mapping pollutant concentrations in urban areas requires fine-scale spatial interpolation of data collected at air quality monitoring stations, taking into account known emission sources and sinks to estimate the actual distribution of pollutants at ambient surface level. Moreover, changes in the composition of the atmosphere caused by urban agglomeration are highly variable in space and time, making their spatial variation difficult to assess with air quality monitoring instruments from ground-based networks or on board satellites . Fixed monitoring stations are suitable for recording the temporal variation of air pollution, including long-term trends, but unsuitable to capture the spatial variation of air pollution at the local level .
It has been demonstrated that gradients at the urban scale can be identified by mobile monitoring. High-resolution mapping of air quality can be based on long-term averages of a large number of repeated measurements. However, obtaining reliable small-scale variations (for a street segment or a residential area) from mobile measurements that are subsequently time-averaged into long-term concentrations is time-consuming and resource-intensive, because a large number of co-located data must be collected while the whole relevant area is covered. Several models have been developed to overcome the limited availability of observational data, whether collected at fixed locations, during mobile campaigns, or by instruments on board satellites. Data from ground-based mobile and fixed stations as well as satellite data have been used in land-use regression (LUR) models and dispersion models. Both model types have emerged as very promising and efficient tools for high-resolution mapping of changes in the composition of the atmosphere, as well as for quantifying air quality by long-term averaging at a high spatial resolution.
The LUR model is more widely used in air quality studies compared to dispersion modeling because of the following: (a) it is a multivariate linear regression model built on significant covariates that can be further used to estimate pollutant concentration elsewhere; (b) linear regression is one of the most used fine-scale spatial interpolation methods because it is fast, easy to implement , and does not require high computing power such as computational fluid dynamics based on large-eddy simulation or Reynolds-averaged Navier–Stokes approaches ; and (c) a LUR model does not require detailed information on atmospheric conditions or an emission inventory as input data. A LUR model usually requires measurement data and land-use predictor variables (e.g., CORINE dataset) . Initially, LUR models were developed to estimate the concentration of air pollutants linked with traffic emissions, specifically NO2 and NOx . Lately, LUR models have been successfully expanded to include other air pollutants, such as particulate matter (PM) , ozone (O3) , carbon monoxide (CO) and sulfur dioxide (SO2) . LUR models can now estimate a wide range of air pollutants, including black carbon (BC) , volatile organic compounds (VOCs) and ultrafine particles (UFP) . LUR models can be used for both specific sites, such as highways or neighborhoods , and in detailed studies covering a wide range of land-use types across large city areas . Recent studies report on variations in pollutant levels across different times of the year, based on seasonal measurement campaigns .
In general, LUR models tend to smooth concentration levels over a wide area, leading to underestimation or overestimation of observed concentrations within each pixel. Therefore, one of the most feasible and robust approaches to map air quality at high resolution is to use the mixed-effects modeling framework that combines the advantages of measurement-only mapping and LUR modeling . Mixed-effects modeling is mostly used in scenarios where data are hierarchical or clustered (e.g., ). In air pollution research, mixed-effects models are powerful tools that can account for spatial or temporal clustering inherent in air quality data. They can accommodate factors like geographic regions or repeated measurements over time, providing a nuanced understanding of pollutant distribution. These models can be computationally complex, especially when dealing with large datasets or complicated random-effect structures .
The mixed-effects modeling framework has been used in recent air quality studies for urban areas such as Amsterdam and Copenhagen or Oakland (USA). A similar mixed-effects approach has also been used to estimate NO2 concentrations over Hong Kong SAR. Up to now, no high-resolution mapping of air pollutants has been performed for urban areas in Romania, although many cities face serious atmospheric pollution episodes.
In this paper, we present the development and use of the mixed-effects modeling framework for high-resolution mapping of NO2, PM10, PM2.5 and UFP concentrations in Bucharest, the capital of Romania. Data from two mobile measurement campaigns, representative for warm and cold seasons, were combined with fine-scale land-use parameters to provide the spatiotemporal information necessary to predict seasonal surface concentrations. Results were validated against in situ measurements from the Magurele Center for Atmosphere and Radiation Studies (MARS) and eight fixed observation stations operated by the National Air Quality Monitoring Network (NAQMN) of Romania. A detailed description of the study area, measurements, and data treatment are given in Sect. , and the mixed-effects modeling framework tuning for Bucharest is given in Sect. , together with model performance evaluation and aggregated pollutant maps for the warm and cold seasons in Bucharest.
2 Materials and methods
2.1 Study area
Bucharest is the most populated urban area and the most important industrial and commercial center of Romania. According to the latest census, the population of Bucharest is approximately 2.1 million residents, making it the sixth-largest city in the European Union by population. The city covers an area of about 240 km2 and has a dense urban structure. The land use of Bucharest is diverse (Fig. 1). The central and northern parts of the city are predominantly residential areas, characterized by a mix of old and new housing developments. Most of the production sectors (such as machinery, textile, chemical, and electronics industries, as well as business parks, all contributing significantly to the economic base of Bucharest) are located in the southern and western areas. The surroundings of Bucharest are mostly agricultural areas and rural/peri-urban residential areas.
Figure 1
Land-use distribution in Bucharest, according to CORINE, 2018. The black line overlapping the data represents the mobile measurement route, and the dashed rectangle represents the modeled area.
[Figure omitted. See PDF]
Located in the southeastern part of Romania, in the Romanian Plain, Bucharest has a humid continental climate, characterized by hot summers, cold winters and two short transitional seasons, spring and autumn. Due to atmospheric circulation patterns specific to the northern midlatitude zone, episodes of long-range transport of aerosols from desert regions (Sahara, Arabian Peninsula and Persia) and from wildfires can affect Bucharest's air quality. However, the major pollution sources are local, influenced by the topography, the different local-scale wind regimes and anthropogenic activities.
2.2 Observational data
Data were collected during two intensive mobile measurement campaigns carried out in Bucharest during May–July 2022 and January–February 2023. For each campaign, at least 15 measurement runs, each approximately 100 km long and around 8 h in duration, were carried out under various meteorological conditions. To ensure consistent, high-quality data that highlight the variability of pollutants specific to the warm or cold season, rainy and/or windy days were excluded from the measurement campaigns. Measurements were performed from Monday to Friday, from early morning to the afternoon, and are therefore representative of daytime working hours. The route comprised high-traffic streets; residential, industrial, and commercial districts; and suburban neighborhoods. The mobile measurement route is limited to the areas where the car had access, excluding some urban areas (e.g., parks, agricultural zones and water bodies). Portable instruments measuring UFP, particulate matter fractions (PM1, PM2.5, PM10) and NO2 were employed in both campaigns, with a measurement rate of 1 s. An additional GPS system was used to independently save the precise location. A Nafion dryer was also used during warm periods to reach a humidity below 40 % for UFP measurements.
The mobile data were filtered before being ingested into the model. A moving-average filter with a 3-data-point window was used to remove data points with values exceeding 1.5 times the window mean (above or below).
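The filtering step can be illustrated with a minimal sketch. The exact rejection rule is not spelled out in the text, so the sketch assumes a centred 3-point window and rejects a point when it is larger than 1.5 times the window mean or smaller than the window mean divided by 1.5; the function name `window_mean_filter` and the sample values are hypothetical.

```python
from statistics import mean

def window_mean_filter(values, window=3, factor=1.5):
    """Return (kept_values, keep_mask) for a centred moving-window outlier test."""
    half = window // 2
    keep = []
    for k, x in enumerate(values):
        # Centred window, truncated at both ends of the series.
        neighbours = values[max(0, k - half): k + half + 1]
        m = mean(neighbours)
        # Assumed reading of "1.5 times the window mean (above or below)".
        is_outlier = m > 0 and (x > factor * m or x < m / factor)
        keep.append(not is_outlier)
    kept = [x for x, ok in zip(values, keep) if ok]
    return kept, keep

smooth = [10.0, 11.0, 12.0, 11.0, 10.0]   # hypothetical 1 s readings
spiky = [10.0, 11.0, 40.0, 12.0, 10.0]    # same series with one spike
print(window_mean_filter(smooth)[1])
print(window_mean_filter(spiky)[1])
```

Under this reading, a spike inflates the mean of every window that contains it, so its immediate neighbours are rejected together with the spike itself.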
2.3 High-resolution mapping model
We used the land-use regression and mixed-effects modeling framework to develop high-resolution air quality maps for Bucharest. In a LUR model, the concentration of a pollutant is expressed as a linear combination of variables that approximates the influence of different emission sources and sinks. Usually, LUR models are fixed-effects models because of the following: (a) they use predictor variables that are temporally invariant (e.g., classes from the land cover inventories, statistical values of population density or traffic intensity, and street networks) and, (b) they are applied to the average atmospheric state over the entire observation period. Therefore, LUR models based on fixed variables are not sensitive to unobserved heterogeneity arising from temporal variability in emissions and/or other environmental conditions.
The mixed-effects LUR model considered in this work is similar to the one developed in previous work. The daily mean concentration at a reference point $i$ on day $j$ ($Y_{ij}$) is assumed to be a linear function of the random effect ($A_{ij}$) and the predictor variables ($X_{k,ij}$) computed at the same reference point:

$$Y_{ij} = \beta_0 + \beta_1 A_{ij} + \sum_{k} \gamma_k X_{k,ij} + \varepsilon_{ij} \qquad (1)$$

In Eq. (1), $\beta_0$ is the random intercept, while $\beta_1$ is the random slope of $A_{ij}$. The $\gamma_k$ represent the regression coefficients of the predictor variables $X_{k,ij}$ at reference point $i$ and day $j$. The regression coefficients are the same for all measurement days. The error term of the model is represented by $\varepsilon_{ij}$. Mixed-effects model results were averaged per point, similar to the average of the data-only approach. Mean NO2, PM10, PM2.5 and UFP concentrations from each reference point were used as the dependent variables $Y_{ij}$ in Eq. (1).
The route taken during campaigns was divided into road segments with a length of approximately 250 m, equivalent to the distance traveled by a car with an average speed of 30 km h−1 in a time interval of 60 s. The reference points were considered at the midpoint of each road segment.
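As an illustration of the segmentation, the sketch below assigns route points to consecutive 250 m segments by cumulative along-route distance and reports the midpoint of each segment as an along-route distance. It assumes the GPS track has already been projected to planar coordinates in metres; the function name `segment_route` and the test track are hypothetical.

```python
import math

def segment_route(points, seg_len=250.0):
    """points: list of (x, y) in metres, in driving order.

    Returns the segment index of every point and, for each segment,
    the along-route distance (m) of its midpoint.
    """
    cumulative = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cumulative.append(cumulative[-1] + math.hypot(x1 - x0, y1 - y0))
    seg_ids = [int(d // seg_len) for d in cumulative]
    midpoints = {s: (s + 0.5) * seg_len for s in sorted(set(seg_ids))}
    return seg_ids, midpoints

# Hypothetical straight track sampled every 100 m.
track = [(100.0 * k, 0.0) for k in range(7)]
seg_ids, midpoints = segment_route(track)
print(seg_ids)
print(midpoints)
```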
A temporal correction was applied to the data to synchronize the measurements. The correction factor is calculated for each point as the difference between the daily average value at that point and the whole-campaign average value at the same point. Values measured within a 250 m street segment were averaged for each day. Similar correction and averaging methods were reported in previous studies.
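The daily averaging and the correction factor can be sketched as follows. The sketch assumes that the whole-campaign average of a segment is the mean of its daily averages, which the text does not specify, and it only computes the factor, since the way it is applied is not described here. The function names and sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def daily_segment_means(records):
    """records: iterable of (segment_id, day, value) -> {(segment, day): mean}."""
    buckets = defaultdict(list)
    for seg, day, value in records:
        buckets[(seg, day)].append(value)
    return {key: mean(vals) for key, vals in buckets.items()}

def correction_factors(daily_means):
    """Daily mean minus the campaign mean of the same segment."""
    per_segment = defaultdict(list)
    for (seg, _day), m in daily_means.items():
        per_segment[seg].append(m)
    # Assumption: campaign mean = mean of the daily means of the segment.
    campaign = {seg: mean(ms) for seg, ms in per_segment.items()}
    return {(seg, day): m - campaign[seg] for (seg, day), m in daily_means.items()}

records = [  # hypothetical 1 s readings (ppb)
    ("s1", "d1", 10.0), ("s1", "d1", 14.0),
    ("s1", "d2", 20.0),
    ("s2", "d1", 5.0),
]
daily = daily_segment_means(records)
print(correction_factors(daily))
```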
The performance of the model was evaluated in three steps. First, a subset containing 15 % of the data collected through mobile measurements (and not used to tune the model) was used for cross-validation. This percentage represents the optimal value for which the models developed in this study can recognize the relationship between the attributes of the input data and the output variable with a score greater than 0.75. When selecting this percentage, providing as much quality data as possible (85 %) to the learning process was considered important for increasing model performance, as was avoiding data leakage between the learning process and cross-validation. Second, an independent set of data collected at fixed sites was used for validation. In addition to the MARS site, eight monitoring stations operated by NAQMN in Bucharest were selected for validation, based on data availability, representing all types of environments: two urban-type stations located in the east and northeast of the city, one suburban station located 4 km to the south of the capital city, three industrial-type stations situated in the southern half of the city, and two traffic-type stations located in the center of Bucharest. The evaluation involved direct comparison between model outputs and direct pollutant measurements using statistical metrics (correlation, root-mean-square error, relative differences). Third, the ability of the model to resolve different types of environment (traffic, urban, industrial) was evaluated. Details and results are provided in Sect. 3.1.
2.4 Tuning the mixed-effects land-use regression model for Bucharest
Based on the specificity of the climate in the region of Bucharest, we decided to use the residential areas (predictor variable) as time-dependent variables with major differences between the warm season and the cold season. Residential areas are considered sources of pollution due to household activities, heating being responsible for the major difference between the cold and the warm seasons. However, the time dependence of this variable is not sufficient to describe the day-to-day variability, because the residential heating is generally switched on in late autumn and off in late spring, with no real daily variability. To model NO2, PM10, PM2.5 and UFP concentrations over the Bucharest area, an additional variable was needed in the LUR models to cover the fast time dependencies (the so-called random effects). In this mixed-effects framework, the pollutant concentration can be expressed as a linear relationship between fixed variables and time-dependent variables, where the random effect is modeled by including a discrete dummy variable.
In our work, for each reference point, the random effect was modeled as the difference between the standard deviation calculated for the entire period of mobile measurements and the standard deviation calculated for each day of mobile measurements. Therefore, the magnitude and/or sign of the random effect were not the same over all reference points of measurements. The mixed-effects models were fitted to the observational dataset using the Python modules scipy, sklearn and statsmodels.
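The random-effect variable described above can be sketched in a few lines. The sketch assumes the population standard deviation and takes raw readings per reference point and day as input; neither choice is stated in the text, and the function name and sample values are hypothetical. Fitting the mixed-effects model itself is not shown.

```python
from collections import defaultdict
from statistics import pstdev

def random_effect_variable(records):
    """records: iterable of (point_id, day, value).

    Returns {(point, day): A} with
    A = std over the whole campaign at the point - std on that day at the point.
    """
    by_point = defaultdict(list)
    by_point_day = defaultdict(list)
    for point, day, value in records:
        by_point[point].append(value)
        by_point_day[(point, day)].append(value)
    return {
        (point, day): pstdev(by_point[point]) - pstdev(vals)
        for (point, day), vals in by_point_day.items()
    }

records = [  # hypothetical readings at one reference point
    ("p1", "d1", 10.0), ("p1", "d1", 14.0),
    ("p1", "d2", 20.0), ("p1", "d2", 20.0),
]
print(random_effect_variable(records))
```

Because the daily standard deviation can be larger or smaller than the campaign value, the resulting variable can take either sign, in line with the statement that its magnitude and sign differ between reference points.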
Other predictor variables used for the LUR models include vehicle traffic intensity (calculated separately for the warm and for the cold seasons) as well as aggregated values of spatial predictor variables calculated within circular buffers ranging from 25 m to 2 km in radius.
The mixed-effects LUR models (one model for each pollutant considered in this study) were adjusted and trained for Bucharest to obtain consistent datasets. For the training process, 85 % subsets of mobile measurements were randomly selected. The remaining 15 % of the mobile measurements were used to cross-validate the LUR models. Splitting the dataset into 85 % for training and 15 % for testing was intended, on the one hand, to increase the performance of the models and, on the other hand, to limit overfitting by keeping the difference between the training and testing scores as small as possible. The regression coefficients obtained as an output of the training were further used to generate high-resolution maps of seasonal concentrations of NO2, PM10, PM2.5 and UFP with a resolution of 100 m, over an area of approximately 240 km2 (the entire area of the city of Bucharest).
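The 85 %/15 % split and the comparison of training and testing scores can be sketched as below. For brevity, an ordinary least-squares fit stands in for the mixed-effects model, which is a deliberate simplification; the synthetic data, the function names and the random seeds are hypothetical.

```python
import numpy as np

def r2_score(y, y_hat):
    """Coefficient of determination."""
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - np.mean(y)) ** 2))
    return 1.0 - ss_res / ss_tot

def split_fit_score(X, y, test_fraction=0.15, seed=0):
    """Random split, OLS fit on the training part, R2 on both parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = order[:n_test], order[n_test:]
    # Stand-in model: ordinary least squares with an intercept column.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
    r2_train = r2_score(y[train], design[train] @ coef)
    r2_test = r2_score(y[test], design[test] @ coef)
    return coef, r2_train, r2_test, len(train), len(test)

# Synthetic example: 200 reference points, 3 predictors.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = 15.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=200)
coef, r2_train, r2_test, n_train, n_test = split_fit_score(X, y)
print(n_train, n_test, round(r2_train - r2_test, 3))
```

A small gap between the two scores is the overfitting indicator referred to in the text; the fitted coefficients would then be applied to the predictor values of each map cell.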
2.4.1 Spatial predictor variables
To define the optimal configurations of LUR architectures for Bucharest, for each 250 m street segment, spatial predictor variables were extracted using the following data sources: (i) CORINE for the land cover , (ii) Open Street Map (OSM) for the road network and (iii) the National Institute of Statistics (INS) of Romania for the population density.
There is no recent source quantifying the traffic intensity on road segments in Bucharest; therefore, the value for this predictor was estimated for each direction of the street segment using the following relationship:
[Equation (2) omitted. See PDF]
The quantities entering Eq. (2) are the total number of vehicles per day obtained from INS, the total length of street segments (km), the speed on the street segment (km h−1) and the cost function for the street segment (h), retrieved from the geofabrik.de database.
To tune and train the mixed-effects models for the specifics of Bucharest, the spatial predictor variables were selected from a number of proxies that describe the possible sources and sinks of NO2, PM10, PM2.5, PM1 and UFP emissions. These variables are presented in Table 1. The column “Effect” gives the effect that each predictor variable has on the concentration of the pollutant in the atmosphere. Variables associated with emission sources have a positive effect, while variables associated with sinks, such as vegetation cover, have a negative effect. The circle radii most commonly used for buffering the variables describing the sources and sinks at a given location are given in the column “Buffer sizes”.
Table 1
Description of spatial predictor variables.
Source data | Variable name | Description | Unit | Effect | Buffer sizes (m) |
---|---|---|---|---|---|
INS | POPDENS_X | Population density | no. m−2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | LDRES_X | Low-density residential | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | HDRES_X | High-density residential | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | AIRPORT_X | Airport area | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | INDUSTRY_X | Industrial areas | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | AGRI_X | Agricultural areas | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | FOREST_X | Forest areas | m2 | – | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | GREEN_X | Urban green areas | m2 | – | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | CONSTR_X | Construction sites | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
CORINE 2018 | WATER_X | Water bodies | m2 | | 100, 150, 200, 250, 300, 500, 1000, 2000 |
Road network | LENGTH_X | Length of road segments | m | | 25, 50, 75, 100, 150, 200, 250, 300, 500, 1000, 2000 |
Road network | TRAFFIC_X | Total traffic load | (veh d−1) m | | 25, 50, 75, 100, 150, 200, 250, 300, 500, 1000 |
geofabrik.de, INS | COUNTS | Traffic intensity | veh d−1 | | N/A |
For the selection of variables, we first removed proxies with a percentage of null values greater than 90 %. After applying this filter, proxies that describe the possible sources and sinks for LUR models tuned for Bucharest were associated with the following:
- traffic, with predictor “COUNTS” (number of vehicles per day) and road length variables in buffers of 50 to 1000 m;
- land use, with predictors industry, green, residential lower density (individual residential), residential higher density (collective residential), construction, and water in buffers of 100 to 2000 m;
- population density in buffers of 100 to 2000 m.
In a second step, we calculated the confidence level (p value) and the variance inflation factor (VIF) of each predictor to assess its statistical significance and its contribution to collinearity. Statistically insignificant variables and predictor variables with VIF > 5 were sequentially removed from the model.
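The two screening steps can be sketched as follows. The sketch treats zeros and NaNs as null values and removes, one at a time, the predictor with the largest VIF above 5; both choices are assumptions, the p-value screening is not reproduced, and the buffer suffixes in the column names are hypothetical.

```python
import numpy as np

def drop_sparse(X, names, max_null_fraction=0.9):
    """Drop columns whose share of zero or NaN entries exceeds the threshold."""
    null = np.isnan(X) | (X == 0)
    keep = null.mean(axis=0) <= max_null_fraction
    return X[:, keep], [n for n, k in zip(names, keep) if k]

def vif(X, j):
    """VIF of column j: 1 / (1 - R2) from regressing it on the other columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return 1.0 / max(1.0 - r2, 1e-12)

def prune_by_vif(X, names, threshold=5.0):
    """Sequentially remove the predictor with the largest VIF above threshold."""
    names = list(names)
    while X.shape[1] > 1:
        vifs = [vif(X, j) for j in range(X.shape[1])]
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

# Synthetic predictors: two nearly collinear, one independent, one sparse.
rng = np.random.default_rng(1)
a = rng.normal(size=100)
b = rng.normal(size=100)
c = a + 0.05 * rng.normal(size=100)
d = np.zeros(100)
d[:5] = 1.0
X = np.column_stack([a, b, c, d])
names = ["INDUSTRY_500", "GREEN_300", "INDUSTRY_1000", "AIRPORT_100"]
X1, names1 = drop_sparse(X, names)
X2, names2 = prune_by_vif(X1, names1)
print(names1, names2)
```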
The predictor variables and the size of the buffers used in this work are shown in Table 1. The predictors passing the above conditions are shown in bold. The sizes of the buffers are those established within the ESCAPE project for the development of LUR models. Where no buffer sizes were used, the value is given as N/A. The “number of vehicles” is abbreviated in the table as “veh”. The buffer sizes determine the spatial proximity of different features by defining a distance zone around each feature.
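A simplified picture of buffer aggregation is sketched below: feature values (for example, polygon areas assigned to their centroids) are summed when the feature lies within a given radius of the reference point. Representing features as points rather than clipping polygons to the buffer is a simplification made here for brevity, and the function name, coordinates and values are hypothetical.

```python
import math

def buffer_sums(site, features, radii):
    """Sum feature values located within each circular buffer around a site.

    site: (x, y) in metres; features: list of (x, y, value); radii: metres.
    """
    sx, sy = site
    sums = {}
    for r in radii:
        sums[r] = sum(
            value for x, y, value in features
            if math.hypot(x - sx, y - sy) <= r
        )
    return sums

# Hypothetical industrial patches: (x, y, area in m2).
features = [(30.0, 40.0, 100.0), (0.0, 150.0, 200.0), (600.0, 0.0, 50.0)]
print(buffer_sums((0.0, 0.0), features, [100, 500, 1000]))
```

Each radius yields one predictor value per reference point, which corresponds to the `_X` suffix of the variable names.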
Figure 2
Relative difference between model-predicted values and mobile measurements of NO2 (a, b) and PM10 (c, d) during the warm (a, c) and cold (b, d) seasons.
[Figure omitted. See PDF]
3 Results and discussion
3.1 Evaluation of the model performances
For each individual pollutant, four configurations of LUR models were defined, selected from those with an adjusted R2 greater than 0.5. In the following, only the results and performance of the LUR model with the highest adjusted R2 are discussed.
The mixed-effects model tuned for Bucharest city has been evaluated using mobile measurements and fixed-site measurements. The performance assessment involved not only the averages at some points, but also the overall agreement on specific types of urban areas covering the entire pollutants concentration intervals. The performance of each model (one for each type of pollutant) was tested following three steps described in detail in Sect. 2.3. Moreover, in the first step, the 15 % kept for testing covers all possible situations regarding the spatial distribution of the predictor variables used in the model. Second, the average concentrations calculated by the model at the locations of the fixed observation sites have been validated by comparison with the seasonal-average concentrations measured at those sites. Also, model data clustered based on the type of environment (traffic, industrial, urban) have been compared to similarly clustered data collected at the fixed observation sites.
3.1.1 Cross-validation against mobile measurements
Cross-validation is performed for each model by comparing the model-predicted dataset against the observed data collected by mobile measurements. Agreement between the two datasets is quantified by calculating the adjusted R2 and the root-mean-square error (RMSE). The RMSE of each model was calculated as the square root of the mean of the squared errors. We also present the relative differences between modeled and measured data as a general indicator of the model accuracy. Results show very good correlations for each model as a follow-up to the training process. The cross-validation shows higher correlations for the cold season (R2 = 0.81 for NO2 and R2 = 0.88 for PM10) than for the warm season (R2 = 0.59 for NO2 and R2 = 0.72 for PM10). The weaker performance of the models during the warm season can be explained by the larger variability of NO2 and PM10 concentrations in the warm season compared to the cold season, which is not completely captured by our method. For NO2, this strong variability is also suggested by the RMSE values, which are higher in the warm season (2.94 ppb) than in the cold season (2.64 ppb). In contrast, the RMSE values for PM10 are lower in the warm season (2.18 µg m−3) than in the cold season (3.06 µg m−3). Cross-validated scatterplots are provided in the Supplement.
The relative differences between mobile measurements and model retrievals, computed as “(Model − Observed)/Observed”, show the ability of the models to estimate PM10 and NO2 concentrations (Fig. 2). The model for NO2 tends to overestimate the concentrations in the warm season, especially in urban agglomeration areas (upper left panel), while the model for PM10 tends to underestimate them (lower left panel). However, such differences are acceptable considering the assumptions made, the uncertainty of the data used for training the models and the general performances reported in the literature.
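The three agreement metrics used in the cross-validation can be written compactly. The sketch below uses the standard definitions of RMSE and adjusted R2 together with the relative difference "(Model − Observed)/Observed"; the sample values and the number of predictors are hypothetical.

```python
import math

def rmse(obs, mod):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum((m - o) ** 2 for o, m in zip(obs, mod)) / len(obs))

def adjusted_r2(obs, mod, n_predictors):
    """R2 penalised for the number of predictors in the model."""
    n = len(obs)
    mean_obs = sum(obs) / n
    ss_res = sum((o - m) ** 2 for o, m in zip(obs, mod))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

def relative_difference(obs, mod):
    """(Model - Observed) / Observed for each pair."""
    return [(m - o) / o for o, m in zip(obs, mod)]

obs = [10.0, 20.0, 30.0, 40.0]   # hypothetical observed concentrations
mod = [12.0, 18.0, 33.0, 39.0]   # hypothetical modeled concentrations
print(rmse(obs, mod), adjusted_r2(obs, mod, 1), relative_difference(obs, mod))
```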
3.1.2 Validation against independent measurements at fixed observation sites
The model output was evaluated against the seasonal average of the hourly values of NO2 and PM10 measured at nine long-term in situ stations, eight of which are operated by NAQMN. These sites are considered representative of urban, industrial and suburban areas, as pictured in Fig. 3. The model could not be evaluated for UFP because these data are not available at any of the fixed stations. More information about the variables measured by NAQMN, and a detailed description of the MARS site, where continuous measurements of PM and gas concentrations are performed using optical particle counters and gas analyzers, are available in the literature. Statistical metrics, such as R2, RMSE and relative differences, were calculated in the same way as for the cross-validation. Statistical parameters are summarized in Table 2.
In Fig. 3, the mean mass concentrations measured at the fixed observation sites are represented by the diameter of the circle, and the type of environment is represented by the color of the circle. The missing data in Fig. 3 are due to the fact that no measurements were available for these periods, for various non-scientific reasons (technical problems with the measurement equipment, labor problems, etc.). The variation of the concentrations across the city is relatively low, especially for particulate matter during the cold season. PM10 concentrations range from 21 to 32 µg m−3 in both seasons. The highest PM10 concentrations are observed at the urban, suburban and traffic stations, while the lowest PM10 concentration is measured at the industrial stations. The western part of the city shows the highest PM10 values. NO2 concentrations range from 8 to 20 ppb in the warm season and from 9 to 24 ppb in the cold season. The highest NO2 concentrations correspond to the areas with intense traffic and industrial activities, while the lowest concentrations are observed in the suburban areas, which are less impacted by traffic.
Figure 3
Average concentrations of NO2 (a, b, units ppb) and PM10 (c, d, units µg m−3) during warm (a, c) and cold (b, d) seasons at fixed sites: the diameter of the circle represents the mean mass concentrations measured at the site, and the color of the circle represents the type of environment at the site.
[Figure omitted. See PDF]
The comparison between model-predicted values and the observed values at the nine fixed sites is presented in Table 2. Overall, the model performed well, even if the NO2 values tend to be slightly overestimated and PM10 tends to be slightly underestimated when compared with measured mean concentrations. Mean values of modeled NO2 are within the range of the observed values, as indicated by the standard deviation. The differences can be explained by the local topography and the specifics of the land use. The road system in Bucharest is very dense, so the distance from a street to residential or industrial sectors is often very short, sometimes less than a few meters; therefore, the 100 m grid resolution cannot always resolve the variations. This smoothing effect caused by insufficient spatial resolution of the model is reflected in the lower standard deviations returned by the model in comparison with those returned by observations.
The trends and seasonal differences of both NO2 and PM10 are resolved well by the model, as shown by the comparison with observations. The correlation between observed mean concentrations and modeled mean concentration is above 0.65 for all pollutants for the warm season and 0.75 for the cold season. These values can be attributed to the good performance of the model, in accordance with the values reported for other cities . The lowest correlation value is noted for particulate matter during the warm season. This result can be explained by the fact that, during the warm season, photochemical processes are intensified at the street level, and the model cannot capture the effects properly.
The RMSE is another important parameter for assessing the performance of the model, accounting for the level of absolute error. As in the case of R2, we noticed a better model performance for the cold season. The difficulty of the model in capturing the small variations during the warm period for both NO2 and PM10 is reflected in higher RMSE values. High values of RMSE correspond to low R2 values, demonstrating that the model cannot fully capture the variations of NO2 or PM10 concentrations.
Table 2
Comparison between model output and measured values at fixed sites for NO2 and PM10 in Bucharest (Romania) during the warm period of 2022 and the cold period of 2023, as well as statistical metrics.
Pollutant | Season | Observed mean concentration | Modeled mean concentration | R2 | RMSE |
---|---|---|---|---|---|
NO2 | warm | 12.58 ± 7.71 ppb | 16.38 ± 2.47 ppb | 0.66 | 4.97 ppb |
NO2 | cold | 15.98 ± 9.52 ppb | 17.25 ± 1.17 ppb | 0.75 | 2.27 ppb |
PM10 | warm | 24.64 ± 13.18 µg m−3 | 24.29 ± 4.38 µg m−3 | 0.65 | 2.02 µg m−3 |
PM10 | cold | 26.33 ± 18.50 µg m−3 | 25.64 ± 4.43 µg m−3 | 0.76 | 1.69 µg m−3 |
Figure 4
Model (light color) versus measurement (dark color) mean concentrations of NO2 (a) and PM10 (b) during the warm (red) and cold (blue) seasons, along with the relative difference between the cold and warm seasons from model (grey diamond mark) or measurements (black diamond mark); symbols show the relative difference between model and measurements for the warm (purple circle mark) and cold (dark purple circle mark) seasons.
[Figure omitted. See PDF]
3.1.3 Evaluation of the model performance to resolve different types of environment
The model was tested for its robustness in capturing differences between various types of environment such as urban (including peri-urban), industrial and traffic. Long-term datasets collected at the nine fixed observation sites were used for this purpose. Figure 4 shows mean concentrations of NO2 and PM10 as retrieved by the model and measured at the stations, clustered by the type of environment and separated for the warm and cold seasons. The relative differences between values modeled and measured in different environments are also highlighted.
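The clustering by environment type amounts to a grouped average followed by a relative difference. The sketch below shows one way to do this for station-level values; the station values are hypothetical and do not reproduce the data behind Fig. 4.

```python
from collections import defaultdict
from statistics import mean

def environment_summary(rows):
    """rows: (environment, season, observed, modeled), one tuple per station."""
    groups = defaultdict(lambda: {"obs": [], "mod": []})
    for env, season, obs, mod in rows:
        groups[(env, season)]["obs"].append(obs)
        groups[(env, season)]["mod"].append(mod)
    summary = {}
    for key, g in groups.items():
        obs_mean, mod_mean = mean(g["obs"]), mean(g["mod"])
        summary[key] = {
            "obs_mean": obs_mean,
            "mod_mean": mod_mean,
            # Same convention as in the text: (Model - Observed) / Observed.
            "rel_diff": (mod_mean - obs_mean) / obs_mean,
        }
    return summary

rows = [  # hypothetical station means (ppb)
    ("traffic", "warm", 20.0, 18.0),
    ("traffic", "warm", 24.0, 22.0),
    ("urban", "warm", 10.0, 13.0),
    ("traffic", "cold", 22.0, 21.0),
]
for key, stats in environment_summary(rows).items():
    print(key, round(stats["rel_diff"], 3))
```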
NO2 is the most variable species, with large differences between seasons and between environment types. The lowest NO2 concentrations are found for the urban group, whereas the highest NO2 concentrations are associated with traffic, as anticipated. The traffic group has the lowest variability between seasons and is also better captured by the model. The model shows increased NO2 concentrations during the warm season for the urban and industrial categories, while for traffic areas the model slightly underestimates the measured averages. The overall relative differences between the cold and warm seasons, for both model and observational data within each defined environment, are less than 35 %. Higher relative differences between modeled and measured data are observed for the warm season in all areas, with the lowest difference in the case of traffic areas (around 10 %). Relative differences between modeled and measured data are lower overall during the cold season than during the warm period, with the NO2 average concentration in industrial and traffic areas underestimated by the model, highlighted by relative differences up to %.
PM10 concentrations show an overall lower seasonal variability and also smaller differences between model and observed data, with relative differences of less than 20 %. Particulate matter concentrations for urban, industrial and traffic areas varied only slightly between seasons. Modeled PM10 concentrations are higher than the observational data for the industrial environment in both seasons. Moreover, industrial areas present the lowest PM10 concentrations during the cold season, as shown by both modeled and measured data. The overall relative differences in PM10 concentration between the cold and warm seasons, for both model and observational data within each defined environment, are less than 15 %. Relative differences between modeled and measured PM10 concentrations are small during both seasons. PM10 average concentrations in urban and traffic areas are slightly underestimated by the model, highlighted by relative differences up to %, which can also be influenced by the low number of available fixed stations in each environment.
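The relative differences discussed above can be illustrated with a minimal sketch. It uses the mean concentrations listed in Table 2, which are averages over all fixed sites rather than the per-environment means plotted in Fig. 4, and it assumes that the relative difference is defined as (value − reference) / reference; the section does not state the definition explicitly, and the function names are hypothetical.

```python
def relative_difference(value, reference):
    """Percent difference of `value` with respect to `reference` (assumed definition)."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return 100.0 * (value - reference) / reference


# Mean concentrations from Table 2 as (observed, modeled).
table2 = {
    ("NO2", "warm"): (12.58, 16.38),   # ppb
    ("NO2", "cold"): (15.98, 17.25),   # ppb
    ("PM10", "warm"): (24.64, 24.29),  # µg m-3
    ("PM10", "cold"): (26.33, 25.64),  # µg m-3
}

# Model versus measurement, per pollutant and season.
model_vs_obs = {
    key: relative_difference(modeled, observed)
    for key, (observed, modeled) in table2.items()
}


def seasonal_difference(pollutant, index):
    """Cold versus warm season; index 0 = observed, index 1 = modeled."""
    warm = table2[(pollutant, "warm")][index]
    cold = table2[(pollutant, "cold")][index]
    return relative_difference(cold, warm)


for key, diff in model_vs_obs.items():
    print(key, f"model vs. measurement: {diff:+.1f} %")
for pollutant in ("NO2", "PM10"):
    print(
        pollutant,
        f"cold vs. warm, observed: {seasonal_difference(pollutant, 0):+.1f} %,",
        f"modeled: {seasonal_difference(pollutant, 1):+.1f} %",
    )
```

A positive model-versus-measurement value indicates overestimation by the model and a negative value indicates underestimation, under the sign convention assumed here.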
Figure 5
Near-surface concentration maps for Bucharest, resulting from the model for NO2, PM10 and UFP during warm (a, c, e) and cold (b, d, f) seasons.
[Figure omitted. See PDF]
3.2 Mapping atmospheric pollution in Bucharest
The validated model was further used to produce NO2, PM10 and UFP concentration maps for Bucharest, representative of the warm and cold seasons (Fig. 5). The results are valid for daytime on working days of the week. It should also be taken into account that, in the absence of spatiotemporal emission inventories with a high spatial resolution and traffic data, modeled data were used. In analyzing these results, it must be noted that the near-surface concentrations of atmospheric pollutants are influenced not only by emissions, but also by the height of the planetary boundary layer, transport from other regions, dry and wet deposition, and chemical processes, all in relation to relatively fast-changing meteorological conditions (e.g., air temperature, wind field). Model results show that, overall and regardless of the season, NO2, PM10 and UFP concentrations are higher on the main road sections, with higher values in Bucharest's western area.
Figure 6
Model maps of the PM2.5 / PM10 ratio during the warm and cold seasons.
[Figure omitted. See PDF]
NO2 concentration maps show that this pollutant is highly related to traffic, the road network of Bucharest being clearly visible during both the warm and the cold season. The main roadways, especially the Bucharest ring road, are depicted as the primary NO2 source. Also featured are the city's central routes, where traffic remains heavy throughout the day and across seasons. The highest NO2 concentrations are noted around busy highways due to the presence of a large number of NO-emitting automobiles. Sinks related to green areas and water bodies are identified in dark green colors. Overall, NO2 concentration is higher during the warm period, when concentrations are higher on key roadways (35.79 ± 8.38 ppb) and other sources in the city add up. The conversion of NO to NO2 in the presence of sunlight and ozone is significant. During cold months, the NO2 concentration is lower in absolute values across the city; however, the main roads are still depicted as major sources, followed by several industrial areas such as power plants serving the centralized heating of the city. The distribution of the NO2 concentration across all street segments is almost uniform during the cold season, with a slight increase for the main street segments, including the Bucharest ring road (20.67 ± 3.44 ppb). At the level of the city of Bucharest, the average NO2 concentration estimated by the model is 16.66 ± 4.04 ppb for the cold season and 18.75 ± 1.98 ppb for the warm season.
The spatial variation of PM in the city area is substantial, with an abundance of small particles and a high mass concentration of larger particles in densely populated residential areas. Significant concentrations of particles have been identified, mostly in industrial areas and anthropogenic agglomerations, but also along certain major transportation routes. During the cold periods, the PM fractions (all sizes) have larger loading and lower gradients, as also reported for other cities. This is related to increased emissions from residential heating. A gradient of PM10 concentrations is evident within the city, with higher loading in the western and southern areas. The average PM10 concentration in Bucharest during the cold season is 1.2 times higher than in the warm season. During the warm periods, the PM10 clusters are localized around source areas, while during cold periods the source distributions are more homogeneous.
The average UFP number concentration throughout the mobile route exhibits an important spatial gradient, particularly during the warm season, with variations up to a factor of 2 in the mean, highlighting extensive human exposure to ultrafine particles. The UFP number concentrations are elevated on the main roads, as well as in some areas related to industrial activities in the southern, northern and western city regions, mostly during the warm season. A more uniform distribution of UFP mean number concentration is observed during the cold season, when the house heating emissions add to those from traffic. Traffic sources have less impact during the cold season, as chemical processes diminish due to limited sunlight. House heating sources, more evenly spatially distributed than the roads, generate a more homogeneous distribution but also larger absolute values of UFP. The average seasonal concentration for Bucharest during the cold season (29132 ± 4362 particles cm−3) is 1.4 times higher than during the warm season (21469 ± 3528 particles cm−3).
Since measurements of PM2.5 were only available at a few fixed stations, we included the modeled ratio of PM2.5 to PM10 to show the fine particle fraction expected from the model. The model shows the fine particle fraction (PM2.5 / PM10) to be larger during the cold periods than during the warm periods, with fine particles accounting for up to 95 % of the PM10 concentration (Fig. 6). This is explained by the fact that household activities generate predominantly small particles, and higher percentages are seen in the peri-urban regions (outside of Bucharest), where the house-heating sources contribute more to PM2.5 concentrations (lighter shades of purple; Fig. 6, right panel). During warm periods, the fine particle fraction is approximately 50 % within the city and less than 40 % in the villages close to Bucharest, where agricultural activities increase the coarse share of PM10 (Fig. 6, left panel). The main rivers and lakes within Bucharest’s perimeter are clearly sinks for small particles, producing lower fine mode fractions in both seasons.
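The fine particle fraction described above is a cell-by-cell ratio of two modeled maps. The sketch below shows one way such a ratio grid could be derived, assuming the PM2.5 and PM10 maps are available as equally sized nested lists; the grid values and the function name are hypothetical and are not taken from the study. Cells with missing or zero PM10 are left undefined rather than divided.

```python
def fine_fraction(pm25_grid, pm10_grid):
    """Cell-by-cell PM2.5 / PM10 ratio; None where the ratio is undefined."""
    ratio = []
    for row25, row10 in zip(pm25_grid, pm10_grid):
        ratio.append([
            None if (a is None or b is None or b <= 0) else a / b
            for a, b in zip(row25, row10)
        ])
    return ratio


# Hypothetical 2 x 3 grids in µg m-3; NOT values from the modeled maps.
pm10 = [[30.0, 24.0, 0.0], [20.0, None, 40.0]]
pm25 = [[27.0, 12.0, 0.0], [19.0, 10.0, 16.0]]

ratio = fine_fraction(pm25, pm10)
for row in ratio:
    print(["n/a" if r is None else f"{100 * r:.0f} %" for r in row])
```

Keeping undefined cells explicit avoids reporting a spurious fraction where one of the two maps has no valid value.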
4 Conclusions
Regression-based methods fed by mobile data can predict NO2, PM10 and UFP concentrations for regions which are not properly covered by observations. To do this, the right combination of data sampling frequency, duration and route, as well as the correct number and type of predictor factors (corresponding to the surrounding environment), must be considered. Mobile monitoring together with modeling tools can therefore compensate for spatial and temporal gaps in the data collected by the monitoring stations and can assist individuals and policymakers in identifying regions and causes of poor air quality. In this study, we demonstrate the effectiveness of combining mobile measurements with mixed-effects LUR models to derive seasonal maps of near-surface PM10, NO2 and UFP. The study shows first-time validated results for Bucharest, the capital city of Romania, which is characterized by a large, densely populated surface with a very dense and heavily used street network.
Despite the limited number of fixed stations available for this work (8 MARS), the tuned mixed-effects LUR model proved to be robust and accurate in producing high-resolution maps of NO2, PM10 and UFP for the warm and cold seasons. Overall, good model performance was observed for both seasons and all concentrations, similar to other studies. The slightly higher mean squared error values coupled with smaller cross-validation values obtained for the warm season suggest that the mobile campaign data collected for this study did not capture all the important NO2 and PM10 concentration variations. Even though the route selected for the two mobile measurement campaigns included all urban structures, the limitation of car access remains a source of error, which can lead to an underestimation of pollutant concentrations. The performance of this model can be greatly improved by involving citizens (on foot or by bicycle) to collect data from areas where cars are not allowed. Datasets systematically collected by citizens during daily (repeated) activities or walks could provide improved estimates of spatial variability for these areas. Citizen involvement could increase pollutant data collection in areas restricted for cars or bicycles and could make it possible to study the sinks in green or water areas, but these methods would be based only on low-cost sensors. Further improvements could also be made by the inclusion of spatiotemporal emission data with a high spatial resolution, such as traffic volumes or emission inventories.
The results provided by the model show that high concentrations of particulate matter during the cold season are characteristic of Bucharest, due to the added effect of house heating (either dispersed in residential areas or localized at the city's power plants). Fine particles dominate during the cold season, although they remain at high levels during the warm season as well. NO2 is less challenging but still an important factor, especially during the warm season and along the main roads. The seasonal high-resolution air quality maps for Bucharest based on mixed-effects modeling pinpointed pollutant variability mostly during the warm season, and higher concentrations and fine particle ratios during the cold season. Water and vegetation areas are identified as effective sinks for NO2 and fine particles, while traffic and residential heating are identified as effective sources in Bucharest. Based on these findings, a more extended certified air quality station network that also includes fine particle measurements would be beneficial for human-health-related pollutant monitoring.
The approach presented in this paper can be adjusted for high-resolution mapping of NO2, PM and UFP in other cities as well, using the series of predictor variables identified in this study as necessary. This is feasible as long as the urban structures are characterized well and there is a fairly dense and diverse network of in situ monitoring stations, whose observational data can be used for model calibration and validation.
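To make the calibration and validation step concrete, the sketch below fits a deliberately simplified land-use regression and scores it with leave-one-out cross-validation. It is an ordinary least-squares fit with a single predictor, not the mixed-effects LUR model used in this study, and the predictor (road length within a buffer), the site values and the function names are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor is constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_out_rmse(x, y):
    """Refit without each site in turn and score the held-out prediction."""
    squared_errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        squared_errors.append((intercept + slope * x[i] - y[i]) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical sites: road length within a buffer (km) and mean NO2 (ppb).
road_km = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
no2_ppb = [11.0, 13.5, 15.0, 18.5, 20.0, 23.0]

intercept, slope = fit_line(road_km, no2_ppb)
print(f"NO2 = {intercept:.2f} + {slope:.2f} * road_km")
print(f"leave-one-out RMSE = {leave_one_out_rmse(road_km, no2_ppb):.2f} ppb")
```

With only a handful of monitoring sites, each held-out prediction depends strongly on the remaining sites, which is one reason a fairly dense and diverse station network matters when transferring the approach to another city.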
Code availability
Codes developed for this study are available from the main author (Camelia Talianu) upon request.
Data availability
Data used in this study are available from the main author (Camelia Talianu) upon request.
The supplement related to this article is available online at
Author contributions
CT: Conceptualization, Methodology, Formal analysis, Software, Writing (original draft), Writing (review and editing). JV: Conceptualization, Formal Analysis, Methodology, Resources, Supervision, Writing (original draft), Writing (review and editing). DN: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Writing (original draft), Writing (review and editing). AI: Data curation, Investigation, Visualization, Writing (original draft). AD: Formal analysis, Investigation, Writing (original draft). AN: Writing (review and editing). LB: Data curation, Investigation, Formal analysis.
Competing interests
The contact author has declared that none of the authors has any competing interests.
Disclaimer
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
Special issue statement
This article is part of the special issue “Air quality research at street level – Part II (ACP/GMD inter-journal SI)”. It is not associated with a conference.
Acknowledgements
We thank INCAS and INOESY for making the mobile instruments available. Also, we thank BOKU-Met, Vienna, Austria, for facilitating access to the supercomputer from the IT infrastructure used for the LUR models. This publication has been prepared using the European Union’s Copernicus Land Monitoring Service. We gratefully acknowledge the National Air Quality Monitoring Network, part of the Romanian Ministry of Environment, and the Romanian Institute of Statistics for providing invaluable data essential to this research. We appreciate the anonymous reviewers and editors for their time and effort to improve the paper.
Financial support
This work was carried out through the RI-URBANS project (Research Infrastructures Services Reinforcing Air Quality Monitoring Capacities in European Urban & Industrial Areas, European Union's Horizon 2020 research and innovation program, grant agreement number 101036245), the ATMO-ACCESS project (European Commission under the EU Horizon 2020 – Research and Innovation Framework Programme, H2020-INFRAIA-2020-1, grant agreement number 10100800) and the Core Program within the National Research Development and Innovation Plan 2022–2027, carried out with the support of MCID, project no. PN 23 05.
Review statement
This paper was edited by Stelios Kazadzis and reviewed by five anonymous referees.
© 2025. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
High-resolution mapping of pollutants based on mobile observation facilitates deep understanding of air pollutant distributions within a city. This approach fosters science-based decisions to improve air quality, by adding to the existing but not optimally distributed permanent monitoring stations. In this study, we developed high-resolution concentration maps of nitrogen dioxide (NO2), particulate matter (PM10) and ultrafine particles (UFP) for Bucharest, Romania, to evaluate the spatial variation of pollutants across the city during the warm and cold seasons. Maps were generated using a mixed-effects method applied to a land-use regression (LUR) model. The approach relies on multiple land-use and traffic predictor variables and assimilation of data collected by mobile measurements over 30 d in the periods May–July 2022 and January–February 2023. Cross-validation was done against in situ data extracted from the same collection, while validation was organized by comparison with standard measurements at fixed reference sites. Our study shows that this combined method has a good performance for all pollutants (
1 National Institute of Research and Development for Optoelectronics (INOE 2000), Str. Atomistilor 409, Măgurele, 077125, Ilfov, Romania; Institute of Meteorology and Climatology, Department of Water, Atmosphere and Environment, University of Natural Resources and Life Sciences, Gregor-Mendel Street 33, Vienna, 1180, Vienna, Austria
2 National Institute of Research and Development for Optoelectronics (INOE 2000), Str. Atomistilor 409, Măgurele, 077125, Ilfov, Romania
3 National Institute of Research and Development for Optoelectronics (INOE 2000), Str. Atomistilor 409, Măgurele, 077125, Ilfov, Romania; Faculty of Geography, University of Bucharest, Bulevardul Nicolae Bălcescu 1, Bucharest, 010041, Bucharest, Romania
4 National Institute of Research and Development for Optoelectronics (INOE 2000), Str. Atomistilor 409, Măgurele, 077125, Ilfov, Romania; National University of Science and Technology POLITEHNICA Bucharest, Splaiul Independentei 313, Bucharest, 060042, Romania