1. Introduction
Aerosol Optical Depth (AOD) is a key parameter that describes the content and distribution of aerosols in the atmosphere, which originate primarily from natural sources and human activities [1,2,3,4,5]. It is of great significance for climate change research, atmospheric environmental quality assessment, and the analysis of human health impacts [6,7,8,9,10,11,12,13]. It also supports a deeper understanding of the distribution characteristics and transport patterns of aerosols in the different layers of the atmosphere, and of the mechanisms by which they affect climate and the environment [14,15,16]. Traditional ground-based point observations cannot readily provide AOD distributions with high spatiotemporal resolution on a regional scale; satellite remote sensing is the only practical technology for full coverage over a large area. However, due to factors such as satellite orbital patterns, cloud cover, and surface reflectance, single satellite products often have missing data or cannot fully cover large regions [17,18,19]. As a result, missing data reconstruction has attracted widespread attention and has made progress in various remote sensing application scenarios [20].
AOD product data obtained from satellite remote sensing can be used for large-scale air quality monitoring, especially in areas with insufficient coverage by ground monitoring stations [21]. Satellite AOD products provide more comprehensive information on air quality and help identify and track the sources and transport pathways of atmospheric pollutants. For long-term trend analysis of particle size variation, AOD is often analyzed separately for different particle types, such as dust, coarse, and fine pollution particles, based on emission sources and size [22,23,24]. AOD data also provide valuable insights into material corrosion and soiling caused by aerosol air pollution [25,26,27]. Therefore, obtaining AOD values in an uninterrupted, quasi-real-time, and full-coverage manner and grasping their spatial and temporal dynamics is of great significance for research on the atmospheric environment, ecology, material corrosion and soiling, human health, and climate change. However, aerosol data have a high missing rate, which adversely affects air pollution predictions, may lead to unavoidable biases, and introduces significant uncertainty in assessing the impact of aerosols on weather and climate [17,18,28,29]. The reconstruction of missing satellite AOD data has been studied extensively by scholars both domestically and internationally, and various methods have been proposed. Multiple statistical methods have been applied to the interpolation and reconstruction of AOD [30,31,32]. However, methods such as inverse distance weighting, nearest neighbor, kriging, generalized linear models, or multiple imputation can yield significant interpolation errors when the original data are unevenly distributed or contain many missing values, making it difficult to accurately reflect the true AOD distribution [33,34].
In order to address the insufficient and uneven distribution of ground monitoring stations for aerosols, remote sensing inversion methods have been widely adopted. The multi-source data fusion method combines AOD data from different sources to obtain more comprehensive AOD information [35,36]; it can compensate for the shortcomings of a single data source to some extent and improve the spatial coverage of the data. Machine learning models, with their strong nonlinear fitting capabilities and ability to handle complex data, have been widely used in the reconstruction of missing AOD data in recent years [37,38,39]. However, as the spatial resolution of aerosol products continues to improve, the range and efficiency of interpolation also need to be enhanced. Additionally, neglecting external influencing factors, such as meteorological conditions and the cloud cover ratio, inevitably reduces the efficiency and accuracy of the inferences.
The Fengyun meteorological satellite, as a meteorological observation satellite independently developed by China, is equipped with sensors that can obtain large-scale AOD data. It has high spatial and temporal resolution, providing abundant data resources for studying AOD distribution on regional and even global scales [40,41]. However, due to various factors such as cloud cover, atmospheric scattering, solar flares, and the performance of the sensors during satellite observations, there are significant gaps in the AOD product data from the Fengyun meteorological satellites. These missing data severely affect the temporal and spatial continuity and integrity of the AOD data, limiting the application of China’s Fengyun satellites in certain areas. Therefore, conducting research on the reconstruction of missing data in the Fengyun meteorological satellite AOD product is of great practical significance, as it can improve the quality and usability of AOD data, providing more reliable data support for research and applications in related fields.
In summary, the existing methods for reconstructing missing AOD data each have their own advantages and disadvantages, and it is necessary to choose the appropriate method based on specific circumstances in practical applications. Given the characteristics of the AOD product data from the Fengyun meteorological satellites, it is necessary to develop a high-precision and efficient method for generating daily, 1 km resolution, full-coverage AOD products for the Beijing–Tianjin–Hebei region. This will be achieved by leveraging multi-source data fusion techniques and machine learning methods, particularly the random forest model. The approach involves integrating meteorological model data, various satellite remote sensing AOD products, surface condition parameters (such as Normalized Difference Vegetation Index (NDVI), Land Use/Cover Change (LUCC), and Digital Elevation Model (DEM)), and human activity influencing factors like nighttime light data. This research aims to effectively overcome the limitations of single-data sources and traditional methods by capitalizing on the synergistic advantages of multi-source data and the powerful modeling capabilities of machine learning, thereby providing a reliable data foundation for refined and continuous atmospheric environment monitoring.
2. Materials and Methods
2.1. Study Area
The Beijing–Tianjin–Hebei region (113.45°~119.85° E, 36.04°~42.62° N) is one of China’s major urban agglomerations and is known as China’s “capital economic circle”. It includes Beijing, Tianjin, and 11 cities in Hebei Province, as shown in Figure 1. The region is located in the northern part of the North China Plain, bordered by the Yanshan Mountains to the north, the North China Plain to the south, the Taihang Mountains to the west, and the Bohai Bay to the east. The Yanshan-Taihang mountain range gradually transitions to plains from northwest to southeast, so the terrain is higher in the northwest and north and relatively flat in the south and east. The region covers an area of 218,000 square kilometers. Emissions from industrial and residential sources are the main contributors to air pollution in the area, with approximately 30% coming from surrounding areas under the influence of meteorological conditions. Additionally, high coal emissions, low wind speeds, and high atmospheric stability during winter raise the concentration of particulate matter, while the surrounding Taihang Mountains may inhibit the dispersion of air pollutants.
2.2. Data and Processing
2.2.1. Remote Sensing Data
FY 3D/MERSI-II: FY 3D/MERSI-II was launched in November 2017 and can achieve global multi-frequency coverage observations every day [19,42,43]. FY3D is an afternoon orbit satellite, with its equatorial ascending node crossing time around 13:30. The average orbital altitude is 836 km. The data were sourced from the National Satellite Meteorological Center’s satellite data server website
FY-4A/AGRI: FY-4A/AGRI is China’s new generation of geostationary meteorological satellites [46]. AGRI has 14 bands, with wavelengths ranging from 0.45 to 13.8 μm, covering visible, near-infrared, and infrared bands [47]. The temporal resolution of the full disk TOAR data is 15 min, and the spatial resolution is approximately 0.5–4 km. TOAR and CLM data are available from the National Satellite Meteorological Center’s satellite data server website
MODIS data: The MODIS MCD19A2 Version 6 data product is a Level 2 gridded product of land AOD released by the National Aeronautics and Space Administration (NASA) of the United States. It is generated from data acquired by the MODIS instruments on the Terra and Aqua satellites, using the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm. The product is generated daily at a resolution of 1 km. This article uses MCD19A2-MODIS AOD data from the LAADS DAAC (
Himawari AHI: Himawari-8 is a new-generation geostationary meteorological satellite launched by the Japan Meteorological Agency (JMA), equipped with the advanced optical sensor AHI (Advanced Himawari Imager, L3Harris Technologies, Melbourne, FL, USA). The Japan Aerospace Exploration Agency (JAXA) provides the recently updated AHI aerosol products, which include two main datasets (AODpure and AODmerged). AHI provides observational data with high temporal resolution (every 10 min) and high spatial resolution (0.5 to 2 km), covering the Pacific Ocean and its adjacent seas extensively [48]. The AOD products from Himawari-8 AHI are divided into Level 2 and Level 3 and are primarily used for monitoring the aerosol content and distribution in the atmosphere. Among them, L2 data were used for span calibration in the eastern region of China [49]. The data were sourced from the Meteorological Satellite Center (MSC) | Product and Library (accessed on 7 July 2015).
The satellite remote sensing data used are listed in Table 1:
2.2.2. Meteorological Model Data
To further characterize the relationship between AOD and the meteorological environment, this study uses meteorological model data. Atmospheric reanalysis products were obtained from HRCLDAS (High Resolution China Meteorological Administration Land Data Assimilation System), and a total of 8 meteorological variables were collected, including surface pressure (SP), boundary layer height (BLH), evaporation (ET), relative humidity (RH), wind direction (WD), and wind speed (WS), with a spatial resolution of 0.01° and an hourly temporal resolution.
2.2.3. Auxiliary Data
Normalized Difference Vegetation Index (NDVI): MODIS products were obtained from the NASA website and processed to generate 16-day, monthly, and annual NDVI products within the study area, with resolutions of 250 m, 500 m, and 1 km. FY data products were preprocessed to generate 10-day, monthly, and annual NDVI products, with resolutions of 250 m and 1 km.
Land Cover Types (LUCC): The land cover type data were obtained from the NASA website’s MCD12Q1 product and preprocessed to generate land cover products for the study area from 2000 to 2020, with a spatial resolution of 500 m.
Digital Elevation Model (DEM): SRTM surface elevation data with a spatial resolution of 90 m were obtained and resampled to match the spatial resolution of the training set.
2.2.4. Training Dataset Processing
The input data used by the model are organized as shown in Table 2. The original data were extracted, organized, and unified into the same projection and coordinate system (WGS_1984). The spatial resolution follows that of the meteorological model products (e.g., 2 m temperature, 1 km), and the temporal resolution was set to daily. Data matching and resampling were performed, and a spatial join was used to integrate the various data into the same grid units, forming a set of independent variables based on regular grid units, as shown in Figure 2.
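As a rough illustration of the grid-matching step, the following sketch snaps point samples onto a regular grid and averages the values that fall into the same cell on the same day. It is not the authors' implementation: the function names `to_grid_cell` and `daily_grid_mean`, the 0.01° cell size, the grid origin at the south-west corner of the study area, and the sample records are all assumptions made for the example.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01              # assumed grid spacing in degrees (roughly 1 km)
LON0, LAT0 = 113.45, 36.04   # assumed origin: south-west corner of the study area


def to_grid_cell(lon, lat):
    """Return the (column, row) index of the grid cell that contains a point."""
    # The small epsilon guards against floating-point error on cell borders.
    col = math.floor((lon - LON0) / CELL_DEG + 1e-9)
    row = math.floor((lat - LAT0) / CELL_DEG + 1e-9)
    return col, row


def daily_grid_mean(records):
    """Average all values that share the same day and grid cell.

    records: iterable of (day, lon, lat, value) tuples.
    Returns a dict mapping (day, col, row) to the mean value.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for day, lon, lat, value in records:
        key = (day, *to_grid_cell(lon, lat))
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


# Hypothetical samples: two fall into cell (0, 0), one into cell (2, 0).
records = [
    ("2021-01-01", 113.455, 36.045, 0.2),
    ("2021-01-01", 113.456, 36.046, 0.4),
    ("2021-01-01", 113.475, 36.045, 1.0),
]
grid = daily_grid_mean(records)
```

Keying every variable by the same (day, column, row) triple is what allows the predictors and the AOD target to be joined into one sample table.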
2.3. Model
This article collects HRCLDAS meteorological model data, land use type parameters, and the nighttime light index, which are used for machine learning model training and tuning after preprocessing. The workflow comprises multi-source data preprocessing, model training and tuning (90% of the samples), accuracy verification (the remaining 10% of the samples), and multi-source data fusion; the prediction data are then input to obtain the model-estimated aerosol products. At the same time, multi-source aerosol product data (such as MCD19A2 AOD, FY4A AOD, etc.) were collected and preprocessed. A fusion product was first obtained through a multi-source aerosol fusion algorithm and then combined with the machine learning model estimates through a correction relationship equation. This was used to interpolate and reconstruct missing pixels, ultimately achieving high-precision reconstruction of missing or incomplete AOD data in the aerosol satellite products. Figure 3 shows the details.
The main steps are as follows: first, fuse and interpolate the multi-source aerosol remote sensing product data; second, select 13 variables as model inputs and analyze the importance of each parameter in the model verification stage; third, integrate meteorological data, remote sensing data, and surface parameters to perform data fusion correction and spatial matching, and conduct model training and accuracy verification.
The multi-source satellite aerosol product data include the MCD19A2, FY4A, FY3D, and Himawari-8 AOD products, each with different spatial and temporal resolutions and varying missing rates in the Beijing–Tianjin–Hebei region. To maximize the spatial continuity of the data, the MCD19A2 AOD product was used as a reference to calibrate the AOD products of FY4A, FY3D, and Himawari-8 and to interpolate the missing aerosol observation values. The interpolation process is as follows:
(1) Preprocess the four aerosol product data sources to extract high-quality AOD pixel values, using the MCD19A2 product as a benchmark to unify spatial and temporal resolutions.
(2) Retrieve the aerosol products pixel by pixel. If the MCD19A2 AOD value exists, stop the retrieval and use it as the value for that pixel.
(3) If it does not exist, retrieve the FY4A aerosol product at the same spatial location. If the FY4A value exists, correct it using linear regression and use the corrected result as the value for that pixel.
(4) If the FY4A value does not exist, retrieve the Himawari-8 aerosol product. If the Himawari-8 value exists, correct it using linear regression and use the corrected result as the value for that pixel.
(5) If the Himawari-8 value does not exist, retrieve the FY3D aerosol product. If the FY3D value exists, correct it using linear regression and use the corrected result as the value for that pixel. If it does not exist, invoke the machine learning algorithm to estimate the aerosol value.
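The priority order described above can be sketched as a per-pixel cascade. This is a minimal illustration under stated assumptions, not the authors' code: the function names `fit_linear` and `fuse_pixel`, the use of `None` to mark a missing retrieval, the dictionary layout, and the regression coefficients are hypothetical, and the sketch assumes each linear correction is fitted on pixels where both the secondary product and MCD19A2 are available.

```python
def fit_linear(x, y):
    """Ordinary least squares fit of y = a * x + b; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


# Fallback order after the MCD19A2 reference product, as listed in the text.
FALLBACK_ORDER = ("FY4A", "Himawari-8", "FY3D")


def fuse_pixel(mcd19a2, others, coeffs, ml_estimate):
    """Return (value, source) for one pixel.

    mcd19a2:     reference AOD value or None if missing.
    others:      dict mapping product name to AOD value or None.
    coeffs:      dict mapping product name to (a, b) from fit_linear.
    ml_estimate: zero-argument callable used when every product is missing.
    """
    if mcd19a2 is not None:
        return mcd19a2, "MCD19A2"
    for name in FALLBACK_ORDER:
        value = others.get(name)
        if value is not None:
            a, b = coeffs[name]
            return a * value + b, name   # linearly corrected toward MCD19A2
    return ml_estimate(), "ML"


# Hypothetical coefficients and pixel values, for illustration only.
coeffs = {"FY4A": (1.1, 0.0), "Himawari-8": (2.0, 0.1), "FY3D": (0.9, 0.05)}
pixel = {"FY4A": None, "Himawari-8": 0.4, "FY3D": 0.3}
value, source = fuse_pixel(None, pixel, coeffs, lambda: 0.25)
```

Returning the source label alongside the value makes it possible to map where each product contributed to the fused result.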
2.4. Model Building
Due to the influence of multiple factors, such as the complex geographical environment, meteorological conditions, and human activities, AOD varies significantly over time and space. Five typical machine learning models were used to capture the relationship between AOD and its complex, spatiotemporally varying influencing factors: the Bagging algorithm, random forest (RF), the AdaBoost algorithm, gradient boosting decision tree (GBDT), and a feedforward neural network (FNN). Because these models have different parameters or algorithm combinations, each was tuned using 10-fold cross-validation until the lowest error was obtained. The final model for interpolating and reconstructing AOD products was determined by comparing the performance of the models.
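To make the model selection procedure concrete, the sketch below runs 10-fold cross-validation over two toy candidates and keeps the one with the lowest mean RMSE. It uses only the Python standard library; the helper names `kfold_indices` and `cross_val_rmse`, the two toy "models" (a mean predictor and a one-parameter slope fit), and the synthetic data are assumptions standing in for the five models and the real samples.

```python
import math
import random


def kfold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_rmse(fit, X, y, k=10, seed=0):
    """Mean RMSE over k folds; fit(X_train, y_train) must return a predictor."""
    scores = []
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        sq_err = [(predict(X[i]) - y[i]) ** 2 for i in held_out]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / len(scores)


def fit_mean(X_train, y_train):
    """Toy candidate 1: always predict the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda row: mean


def fit_slope(X_train, y_train):
    """Toy candidate 2: least squares line through the origin on feature 0."""
    num = sum(row[0] * t for row, t in zip(X_train, y_train))
    den = sum(row[0] ** 2 for row in X_train)
    slope = num / den
    return lambda row: slope * row[0]


# Synthetic data where the target is exactly twice the single feature.
X = [[float(i)] for i in range(1, 51)]
y = [2.0 * row[0] for row in X]

candidates = {"mean": fit_mean, "slope": fit_slope}
scores = {name: cross_val_rmse(fit, X, y) for name, fit in candidates.items()}
best = min(scores, key=scores.get)   # candidate with the lowest CV error
```

In practice the same loop would be run once per hyperparameter setting of each model, and the setting with the lowest cross-validated error would be retained.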
2.5. Parameter Importance Analysis
The increase in mean squared error (IncMSE) was used to calculate the importance of different variables in the model. With all other conditions unchanged, one variable was randomly perturbed by a ±25% deviation, the model output was recalculated, and the resulting increase in MSE was recorded. This was repeated multiple times and the average was taken. A larger MSE increase indicates that the parameter is more important in the model, and vice versa. In this study, the Importance index was used to evaluate the importance of parameters; its principle is similar to IncMSE.
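The perturbation procedure described here can be sketched as follows. This is an illustrative reading of the text rather than the authors' implementation: the function name `perturbation_importance`, the choice of a uniform random factor in the range ±25%, the number of repeats, and the toy model are assumptions.

```python
import random


def mse(model, X, y):
    """Mean squared error of a callable model over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def perturbation_importance(model, X, y, col, max_dev=0.25, repeats=20, seed=0):
    """Average increase in MSE when column `col` is randomly perturbed.

    Each value in the column is multiplied by (1 + d), with d drawn
    uniformly from [-max_dev, +max_dev]; all other columns are unchanged.
    """
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    increases = []
    for _ in range(repeats):
        perturbed = []
        for row in X:
            new_row = list(row)
            new_row[col] *= 1.0 + rng.uniform(-max_dev, max_dev)
            perturbed.append(new_row)
        increases.append(mse(model, perturbed, y) - baseline)
    return sum(increases) / repeats


# Toy model that depends only on the first feature.
def toy_model(row):
    return 2.0 * row[0]


X = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0]]
y = [2.0, 4.0, 6.0]
imp_used = perturbation_importance(toy_model, X, y, col=0)
imp_unused = perturbation_importance(toy_model, X, y, col=1)
```

A feature the model ignores yields no MSE increase, while a feature the model relies on yields a positive one, which is the ordering the importance ranking depends on.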
2.6. Model Training Set Validation
The project carried out data fusion correction and spatial matching based on meteorological data, remote sensing data, and surface condition parameters from January to December 2021, and it finally formed 414,980 sample data points. In total, 90% of the data were randomly selected for model training and verification, and the remaining 10% of the data were used for product accuracy verification.
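MAE (mean absolute error):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{1}$$
RMSE (root mean square error):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2} \tag{2}$$
R2 (coefficient of determination):
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{3}$$
In the above formulas, $n$ represents the sample size, $\hat{y}_i$ represents the simulated values, $y_i$ represents the observed values, and $\bar{y}$ represents the mean of the observed values.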
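The three evaluation metrics can be computed directly from paired simulated and observed values. The sketch below uses the standard definitions, with the coefficient of determination taken as one minus the ratio of the residual sum of squares to the total sum of squares about the observed mean; the function names and the sample values are illustrative only.

```python
import math


def mae(sim, obs):
    """Mean absolute error between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def rmse(sim, obs):
    """Root mean square error between simulated and observed values."""
    return math.sqrt(sum((s - o) ** 2 for s, o in zip(sim, obs)) / len(obs))


def r2(sim, obs):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for s, o in zip(sim, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not data from the study.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [1.0, 2.0, 3.0, 6.0]
print(mae(sim, obs), rmse(sim, obs), r2(sim, obs))
```

Because RMSE squares the residuals before averaging, a single large error raises it more than it raises MAE, so RMSE is never smaller than MAE for the same data.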
3. Results
3.1. Missing Rate Analysis
The missing rates of the MCD19A2 and FY3D aerosol (AOD) products are shown in Figure 4a,b. The missing rate of the MCD19A2 product was greater than 40% in most areas and was highest, at more than 80%, in the northern mountainous areas and coastal areas. The missing rate of the FY3D product was relatively low, less than 40% overall, with more serious gaps in the northern mountainous areas and the southern part of Beijing–Tianjin–Hebei. Therefore, a reasonable interpolation and reconstruction process based on multi-source remote sensing satellites was formulated according to the missing-data patterns and overall quality of the satellite data, in order to produce spatially continuous aerosol (AOD) products.
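A per-pixel missing rate of this kind is the fraction of days on which a pixel has no valid retrieval. The sketch below shows one way to compute it from a stack of daily grids; the function name `missing_rate_map`, the nested-list grid layout, and the use of `None` for a missing retrieval are assumptions made for the example.

```python
def missing_rate_map(stack):
    """Per-pixel fraction of days without a valid AOD retrieval.

    stack: list of daily grids, each a list of rows, where None marks
    a missing pixel. All grids must have the same shape.
    """
    n_days = len(stack)
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    rates = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            missing = sum(1 for day in stack if day[r][c] is None)
            rates[r][c] = missing / n_days
    return rates


# Four hypothetical days on a 1 x 2 grid.
stack = [
    [[0.3, None]],
    [[None, None]],
    [[0.2, None]],
    [[0.1, 0.5]],
]
rates = missing_rate_map(stack)
```

Here the first pixel is missing on one of four days and the second on three of four, giving rates of 0.25 and 0.75.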
3.2. Machine Model Estimation
3.2.1. Comparison of Model Results
Table 3 compares the estimation results of the different models. In the internal verification stage of the model, the random forest method performs better than the other methods, and the RMSE of the simulation results is 0.10. In the external verification, the random forest algorithm is clearly better than the other algorithms, with an RMSE of 0.19. Therefore, the random forest model can be considered the optimal model for the interpolation and reconstruction of AOD in the Beijing–Tianjin–Hebei region. RF is well suited to this type of AOD reconstruction because of its ability to handle complex nonlinear relationships between variables, its robustness to overfitting with high-dimensional data, and its capability to assess variable importance.
3.2.2. Model Variable Importance Analysis
This article uses the Importance index to calculate the importance of different parameters in the model. The larger the Importance value, the more important the parameter is in the model. Figure 5 shows the Importance values of each parameter. From the results, it can be seen that, when simulating aerosol products, the top three variables in terms of importance are near-infrared band NIR, relative humidity RH, and boundary layer height BLH. Overall, the importance of the near-infrared band NIR variable is much higher than other variables, with an Importance value of 0.34, followed by relative humidity RH, with an Importance value of 0.15.
3.2.3. AOD Product Result Evaluation
Figure 6 is a histogram of the residual distribution of the random forest AOD simulation results. The simulation error generally follows a skewed normal distribution. Errors within ±0.1 account for 62.38% of the samples, and errors within ±0.5 account for 95.9%. Errors greater than 0 are much more frequent than errors less than 0, indicating that the simulation results generally overestimate the actual values.
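Summary statistics of this kind can be derived from the residuals (simulated minus observed) with a few lines of code. The sketch below is illustrative: the function names and the residual values are hypothetical and do not reproduce the study's data.

```python
def share_within(residuals, tol):
    """Percentage of residuals whose absolute value is below tol."""
    return 100.0 * sum(1 for r in residuals if abs(r) < tol) / len(residuals)


def share_positive(residuals):
    """Percentage of residuals above zero (simulation exceeds observation)."""
    return 100.0 * sum(1 for r in residuals if r > 0) / len(residuals)


# Hypothetical residuals, simulated minus observed AOD.
residuals = [0.05, -0.02, 0.3, 0.7, -0.08]
within_01 = share_within(residuals, 0.1)
within_05 = share_within(residuals, 0.5)
positive = share_positive(residuals)
```

A positive share well above 50% is what signals a general tendency to overestimate.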
Figure 7 is the spatial distribution of the error of the random forest simulation of aerosol AOD results. It can be seen from the figure that, when simulating aerosol AOD, the errors of most stations are between 0 and 0.1. From the perspective of spatial distribution characteristics, stations with errors greater than 0.1 are mainly distributed in the southwest region.
Table 4 is a comparison of the evaluation results of the random forest model simulating aerosol AOD products in different seasons, and Figure 8 is a scatter plot of the simulated values and measured values in different seasons. The simulation results in winter are the best, followed by autumn, spring, and summer, and the errors in the four seasons are not much different, with the maximum RMSE difference of 0.1 and the maximum MAE difference of 0.04.
3.2.4. Spatial Distribution of Reconstruction Results
Figures 9 and 10 show the original aerosol products, and Figure 11 shows the fully covered aerosol product after interpolation and reconstruction. From the perspective of spatial distribution, the original aerosol products have a high spatial missing rate, which is most serious in the northern mountainous area. Affected by weather factors, some areas show regional gaps. When polluted weather occurs, aerosol values are much higher in the southern plains than in the northwestern mountainous areas, with values greater than 0.35. This project uses a two-step interpolation and reconstruction method, which fully considers influencing factors such as weather and surface conditions. The resulting fully covered aerosol products are relatively smooth and conform to the spatial distribution trend of high values in the south and low values in the north. However, the linear regression method is currently used for consistency correction between multi-source products, which leaves some areas significantly higher than their surroundings after interpolation. Further analysis of the correction model between multi-source aerosol products is needed.
4. Discussion
This study used multi-source satellite remote sensing datasets, environmental site monitoring data, and meteorological model data from 2021 to establish random forest models for the daily interpolation of Fengyun satellite AOD products. The results show that the constructed random forest model has the lowest RMSE and the highest R2 when interpolating AOD time series. According to the model importance ranking, the meteorological variable with the greatest impact on the model results is relative humidity (RH). Analysis of the seasonal simulation results found that the simulation performs best in winter, followed by autumn, which may be due to the low wind speed and high humidity in the Beijing–Tianjin–Hebei region in autumn and winter; however, the errors in the four seasons do not differ greatly. After interpolation and reconstruction, the multi-source fusion results show much higher values over the southern plains than over the mountains, which is consistent with the known distribution of aerosols in the Beijing–Tianjin–Hebei region, and the full-coverage product is smooth and accurate overall. However, as more satellite data are incorporated, the algorithm differences and radiometric calibration offsets among multi-source AOD products will also introduce cross-product errors. Areas with long-term high missing rates may rely too heavily on simulated data, and systematic deviations may be introduced by insufficient parameterization of the physical model.
The factors that affect aerosol concentrations are different in different regions. Therefore, when there are large differences in environmental background, human background, and climate background, the model needs to be trained again to obtain the best filling results.
In general, by using multi-source aerosol products, selecting appropriate variables to establish a random forest model, and combining these with meteorological model data, it is possible to retrieve AOD products over large areas of the Beijing–Tianjin–Hebei region based on Fengyun satellites. On this basis, a long-term series of daily, 1 km resolution, spatially continuous AOD product data can be established for the Beijing–Tianjin–Hebei region to support subsequent atmospheric environment monitoring and climate change research.
5. Conclusions
In this study, to address the problem of missing data in the AOD products of the Fengyun meteorological satellites, meteorological model data were incorporated into an AOD missing-data reconstruction algorithm for the first time. Multi-source auxiliary data were collected and preprocessed, and the reconstruction model was constructed using the random forest algorithm. The experimental results show that the model can effectively reconstruct the missing AOD data and that the reconstruction accuracy is significantly improved compared with traditional methods. The specific conclusions are as follows:
(1) After the multi-source data were collected and preprocessed, including projection transformation, spatial matching, null value removal, data fusion, and other operations, a high-quality dataset was obtained, which provided a reliable data foundation for the construction of the subsequent models.
(2) The reconstruction model based on the random forest algorithm can accurately learn the complex relationship between AOD and the auxiliary factors and can identify, through feature selection, the auxiliary factors that have a greater impact on AOD, which effectively improves the reconstruction accuracy of the model.
(3) During verification, the model performed well on evaluation indicators such as the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2), and it has obvious advantages over the traditional spatial interpolation method and the multi-source data fusion method.
The method proposed in this study provides an effective solution for the reconstruction of the missing AOD product data of Fengyun meteorological satellites. It can provide more accurate and complete AOD data support for atmospheric environment monitoring, climate change research, and other fields, and it has important application value.
Data curation, H.W. and M.W.; investigation and simulations, M.W.; methodology, Q.L., F.M., Y.G. and X.G.; formal analysis, H.W. and M.W.; writing—original draft preparation, H.W.; writing—review and editing, H.W., M.W., P.J. and Q.L. All authors have read and agreed to the published version of the manuscript.
The data presented in this study are openly available in FY 3D/MERSI-II at
We are grateful to the China Meteorological Administration—National Satellite Meteorological Center for data support. We would also like to acknowledge the Japan Aerospace Exploration Agency for providing the datasets (available from
The authors declare no conflicts of interest.
Figure 1 The location of Beijing–Tianjin–Hebei area.
Figure 2 Data preprocessing workflow.
Figure 3 Flowchart of the interpolation and reconstruction technology for aerosol.
Figure 4 Spatial distribution map of MCD19A2 AOD missing rate (a). Spatial distribution map of FY3D AOD missing rate (b).
Figure 5 Importance ratio of each indicator in the aerosol concentration AOD inversion result.
Figure 6 Product residual distribution histogram.
Figure 7 Spatial distribution of site errors.
Figure 8 Scatter plots of simulated and measured values in different seasons.
Figure 9 MCD19A2 AOD product (1–19 December 2021).
Figure 10 FYD AOD product (1–19 December 2021).
Figure 11 Interpolation and reconstruction of aerosol products (1–19 December 2021).
Table 1 Multi-source satellite aerosol products.
On-Board Instruments | Products | Physical Parameters | Time Resolution | Spatial Resolution |
---|---|---|---|---|
FY-3D/MERSI-II | Land-based aerosol products | AOD | 1 time/day | 1 km |
MODIS | MCD19A2 (L2) | AOD | 1 time/day | 1 km |
FY-4A/AGRI | Aerosol parameter product (L2) | AOD | 1 time/hour | 4 km |
Himawari-8/AHI | Aerosol products (L2/L3) | AOD | 1 time/hour | 4 km |
Table 2 Aerosol product (AOD) monitoring input data.
No. | Type | Variable Name | Unit | Spatial Resolution | Temporal Resolution | Data Type | Note |
---|---|---|---|---|---|---|---|
1 | Measured data | Aerosol products (AOD) | -- | -- | | | |
2 | Meteorological variables | 2 m temperature (TEM) | °C | 1 km | 1 h | Real-time data integration | HRCLDAS |
3 | | Relative humidity (RH) | % | 1 km | 1 h | Real-time data integration | HRCLDAS |
4 | | Precipitation (PRE) | mm | 1 km | 1 h | Real-time data integration | HRCLDAS |
5 | | Total evaporation (ET) | | 1 km | 3 h | Real-time data integration | HRCLDAS |
6 | | Surface pressure (SP) | hPa | 1 km | 3 h | Real-time data integration | HRCLDAS |
7 | | 10 m wind speed (WS) | m/s | 1 km | 1 h | Real-time data integration | HRCLDAS |
8 | | 10 m wind direction (WD) | ° | 1 km | 1 h | Real-time data integration | HRCLDAS |
9 | | Boundary layer height (BLH) | m | 3 km | 1 h | Forecast data (forecast at 8 PM) | Chem 3 km-pblh |
10 | Surface condition parameters | Normalized vegetation index (NDVI) | -- | 1 km | monthly | | MOD13A3 |
11 | | Land cover data (LUCC) | -- | 500 m | yearly | | MCD12Q1 |
12 | | Surface elevation data (DEM) | -- | 90 m | yearly | | SRTM |
13 | Population data | Nighttime light data (NTL) | -- | 500 m | daily | | VNP46A1 |
Table 3 Comparison of test results of each model: validation of AOD simulation results (unit: DU).
Verification Method | Model | Sample | MAE | RMSE | R2 |
---|---|---|---|---|---|
Model validation | DT | 373,482 | 0.00 | 0.00 | 0.99 |
Model validation | RF | 373,482 | 0.06 | 0.11 | 0.93 |
Model validation | AdaBoost | 373,482 | 0.31 | 0.30 | 0.40 |
Model validation | GBDT | 373,482 | 0.21 | 0.32 | 0.53 |
Model validation | MLP | 373,482 | 0.21 | 0.32 | 0.53 |
Product verification | DT | 41,498 | 0.20 | 0.41 | 0.33 |
Product verification | RF | 41,498 | 0.14 | 0.24 | 0.74 |
Product verification | AdaBoost | 41,498 | 0.30 | 0.38 | 0.29 |
Product verification | GBDT | 41,498 | 0.22 | 0.35 | 0.47 |
Product verification | MLP | 41,498 | 0.24 | 0.36 | 0.46 |
Table 4 Comparison of AOD product accuracy in different seasons.
Season | Sample Num | MAE | RMSE | R2 |
---|---|---|---|---|
Spring | 10,713 | 0.07 | 0.12 | 0.94 |
Summer | 6064 | 0.08 | 0.16 | 0.91 |
Autumn | 6616 | 0.06 | 0.11 | 0.93 |
Winter | 14,504 | 0.04 | 0.06 | 0.96 |
1. Cochrane, M.A. Fire science for rainforests. Nature; 2003; 421, pp. 913-919. [DOI: https://dx.doi.org/10.1038/nature01437] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/12606992]
2. Guo, B.; Wu, H.J.; Pei, L.; Zhu, X.W.; Zhang, D.M.; Wang, Y.; Luo, P.P. Study on the spatiotemporal dynamic of ground-level ozone concentrations on multiple scales across China during the blue-sky protection campaign. Environ. Int.; 2022; 170, 107606. [DOI: https://dx.doi.org/10.1016/j.envint.2022.107606] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/36335896]
3. Kaufman, Y.J.; Tanré, D.; Boucher, O. A satellite view of aerosols in the climate system. Nature; 2002; 419, pp. 215-223. [DOI: https://dx.doi.org/10.1038/nature01091]
4. Wei, X.; Chang, N.B.; Bai, K.; Gao, W. Satellite remote sensing of aerosol optical depth: Advances, challenges, and perspectives. Crit. Rev. Environ. Sci. Technol.; 2020; 50, pp. 1640-1725. [DOI: https://dx.doi.org/10.1080/10643389.2019.1665944]
5. Kumar, V.; Patil, R.; Bhawar, R.L.; Rahul, P.R.C.; Yelisetti, S. Increasing wind speeds fuel the wider spreading of pollution caused by fires over the IGP region during the Indian post-monsoon season. Atmosphere; 2022; 13, 1525. [DOI: https://dx.doi.org/10.3390/atmos13091525]
6. Tianchen, L.; Lin, S.; Haoxin, L. MODIS aerosol optical depth retrieval based on random forest approach. Remote Sens. Lett.; 2021; 12, pp. 179-189.
7. Von Schneidemesser, E.; Monks, P.S.; Allan, J.D.; Bruhwiler, L.; Forster, P.; Fowler, D.; Lauer, A.; Morgan, W.T.; Paasonen, P.; Righi, M.
8. Fuzzi, S.; Baltensperger, U.; Carslaw, K.; Decesari, S.; Denier van der Gon, H.; Facchini, M.C.; Fowler, D.; Koren, I.; Langford, B.; Lohmann, U.
9. Gao, M.; Beig, G.; Song, S.; Zhang, H.; Hu, J.; Ying, Q.; Liang, F.; Liu, Y.; Wang, H.; Lu, X.
10. Shen, F.; Zhang, L.; Jiang, L.; Tang, M.; Gai, X.; Chen, M.; Ge, X. Temporal variations of six ambient criteria air pollutants from 2015 to 2018, their spatial distributions, health risks and relationships with socioeconomic factors during 2018 in China. Environ. Int.; 2020; 137, 105556. [DOI: https://dx.doi.org/10.1016/j.envint.2020.105556]
11. Sun, J.-L.; Jing, X.; Chang, W.-J.; Chen, Z.-X.; Zeng, H. Cumulative health risk assessment of halogenated and parent polycyclic aromatic hydrocarbons associated with particulate matters in urban air. Ecotox. Environ. Saf.; 2015; 113, pp. 31-37. [DOI: https://dx.doi.org/10.1016/j.ecoenv.2014.11.024] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/25483369]
12. Yang, Y.; Zheng, Z.; Yim, S.H.L.; Roth, M.; Ren, G.; Gao, Z.; Wang, T.; Li, Q.; Shi, C.; Ning, G.
13. Guo, B.; Zhang, D.M.; Pei, L.; Su, Y.; Wang, X.X.; Bian, Y.; Zhang, D.H.; Yao, W.Q.; Zhou, Z.X.; Guo, L.Y. Estimating PM2.5 concentrations via random forest method using satellite, auxiliary, and ground-level station dataset at multiple temporal scales across China in 2017. Sci. Total Environ.; 2021; 778, 146288. [DOI: https://dx.doi.org/10.1016/j.scitotenv.2021.146288]
14. Hansen, J.; Sato, M.; Ruedy, R. Radiative Forcing and Climate Response. J. Geophys. Res. Biogeosci.; 1997; 102, pp. 6831-6864. [DOI: https://dx.doi.org/10.1029/96JD03436]
15. Li, X.; Liu, K.; Tian, J. Variability, predictability, and uncertainty in global aerosols inferred from gap-filled satellite observations and an econometric modeling approach. Remote Sens. Environ.; 2021; 261, 112501. [DOI: https://dx.doi.org/10.1016/j.rse.2021.112501]
16. Jose, S.; Lakshmi, N.B.; Babu, S.S. Long Term Trend in Aerosol Direct Radiative Effects over Indian Ocean Region from Multi-Satellite Observations. Remote Sens. Lett.; 2021; 12, pp. 994-1003. [DOI: https://dx.doi.org/10.1080/2150704X.2021.1957509]
17. Zhao, C.; Yang, Y.; Fan, H.; Huang, J.; Fu, Y.; Zhang, X.; Kang, S.; Cong, Z.; Letu, H.; Menenti, M. Aerosol characteristics and impacts on weather and climate over the Tibetan Plateau. Natl. Sci. Rev.; 2020; 7, pp. 492-495. [DOI: https://dx.doi.org/10.1093/nsr/nwz184]
18. Zheng, Z.; Ren, G.; Wang, H.; Dou, J.; Gao, Z.; Duan, C.; Li, Y.; Ngarukiyimana, J.; Zhao, C.; Cao, C.
19. Chen, H.; Li, Q.; Wang, Z.; Ma, P.; Li, Y.; Zhao, A. Retrieval of Aerosol Optical Depth Using FY3D MERSI2 Data. J. Geo-Inf. Sci.; 2020; 22, pp. 1887-1896.
20. Chen, Z.Y.; Jin, J.Q.; Zhang, R.; Zhang, T.H.; Chen, J.J.; Yang, J.; Ou, C.Q.; Guo, Y.M. Comparison of different missing-imputation methods for MAIAC (multiangle implementation of atmospheric correction) AOD in estimating daily PM2.5 levels. Remote Sens.; 2020; 12, 3008. [DOI: https://dx.doi.org/10.3390/rs12183008]
21. Zhang, Y.; Chen, H.; Wang, Z. Terrestrial Aerosol Retrieval over Beijing from Chinese GF-1 Data Based on the Blue/Red Correlation. Remote Sens. Lett.; 2020; 12, pp. 219-228. [DOI: https://dx.doi.org/10.1080/2150704X.2020.1856959]
22. Varotsos, C.; Ondov, J.; Tzanis, C.; Öztürk, F.; Nelson, M.; Ke, H.; Christodoulakis, J. An observational study of the atmospheric ultra-fine particle dynamics. Atmos. Environ.; 2012; 59, pp. 312-319. [DOI: https://dx.doi.org/10.1016/j.atmosenv.2012.05.015]
23. Shin, J.; Sim, J.; Dehkhoda, N.; Joo, S.; Kim, T.; Kim, G.; Müller, D.; Tesche, M.; Shin, S.-K.; Shin, D.
24. Pandey, P.C. Highlighting the Role of Earth Observation Sentinel5P TROPOMI in Monitoring Volcanic Eruptions: A Report on Hunga Tonga, a Submarine Volcano. Remote Sens. Lett.; 2022; 13, pp. 912-923. [DOI: https://dx.doi.org/10.1080/2150704X.2022.2106799]
25. Tidblad, J.; Kreislová, K.; Faller, M.; De la Fuente, D.; Yates, T.; Verney-Carron, A.; Grøntoft, T.; Gordon, A.; Hans, U. ICP Materials Trends in Corrosion, Soiling and Air Pollution (1987–2014). Materials; 2017; 10, 969. [DOI: https://dx.doi.org/10.3390/ma10080969]
26. Tzanis, C.; Varotsos, C.; Ferm, M.; Christodoulakis, J.; Assimakopoulos, M.N.; Efthymiou, C. Nitric acid and particulate matter measurements at Athens, Greece, in connection with corrosion studies. Atmos. Chem. Phys.; 2009; 9, pp. 8309-8316. [DOI: https://dx.doi.org/10.5194/acp-9-8309-2009]
27. Tzanis, C.; Varotsos, C.; Christodoulakis, J.; Tidblad, J.; Ferm, M.; Ionescu, A.; Lefevre, R.-A.; Theodorakopoulou, K.; Kreislova, K. On the corrosion and soiling effects on materials by air pollution in Athens, Greece. Atmos. Chem. Phys.; 2011; 11, pp. 12039-12048.
28. Li, Z.; Wang, Y.; Guo, J.; Zhao, C.; Cribb, M.C.; Dong, X.; Fan, J.; Gong, D.; Huang, J.; Jiang, M.
29. Bai, K.; Chang, N.-B.; Chen, C.-F. Spectral Information Adaptation and Synthesis Scheme for Merging Cross-Mission Ocean Color Reflectance Observations from MODIS and VIIRS. IEEE Trans. Geosci. Remote Sens.; 2016; 54, pp. 311-329. [DOI: https://dx.doi.org/10.1109/TGRS.2015.2456906]
30. Bai, K.; Li, K.; Guo, J.; Yang, Y.; Chang, N.-B. Filling the gaps of in situ hourly PM2.5 concentration data with the aid of empirical orthogonal function analysis constrained by diurnal cycles. Atmos. Meas. Technol.; 2020; 13, pp. 1213-1226. [DOI: https://dx.doi.org/10.5194/amt-13-1213-2020]
31. Chang, N.-B.; Bai, K.; Chen, C.-F. Smart Information Reconstruction via Time-Space-Spectrum Continuum for Cloud Removal in Satellite Images. IEEE J. Sel. Top. Appl.; 2015; 8, pp. 1898-1912. [DOI: https://dx.doi.org/10.1109/JSTARS.2015.2400636]
32. Lv, B.L.; Hu, Y.T.; Chang, H.H.; Russell, A.G.; Bai, Y.Q. Improving the accuracy of daily PM2.5 distributions derived from the fusion of ground-level measurements with aerosol optical depth observations: A case study in North China. Environ. Sci. Technol.; 2016; 50, pp. 4752-4759. [DOI: https://dx.doi.org/10.1021/acs.est.5b05940] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/27043852]
33. Xiao, Q.Y.; Wang, Y.J.; Chang, H.H.; Meng, X.; Geng, G.N.; Lyapustin, A.; Liu, Y. Full-coverage high-resolution daily PM2.5 estimation using MAIAC AOD in the Yangtze River Delta of China. Remote Sens. Environ.; 2017; 199, pp. 437-446. [DOI: https://dx.doi.org/10.1016/j.rse.2017.07.023]
34. Di, Q.; Kloog, I.; Koutrakis, P.; Lyapustin, A.; Wang, Y.J.; Schwartz, J. Assessing PM2.5 exposures with high spatiotemporal resolution across the continental United States. Environ. Sci. Technol.; 2016; 50, pp. 4712-4721. [DOI: https://dx.doi.org/10.1021/acs.est.5b06121]
35. Jiang, M.J.; Chen, Z.H.; Yang, Y.S.; Ni, C.J.; Yang, Q. Establishment of aerosol optical depth dataset in the Sichuan Basin by the random forest approach. Atmos. Pollut. Res.; 2022; 13, 101394. [DOI: https://dx.doi.org/10.1016/j.apr.2022.101394]
36. Bai, K.X.; Li, K.; Ma, M.L.; Li, K.T.; Li, Z.Q.; Guo, J.P.; Chang, N.B.; Tan, Z.; Han, D. LGHAP: The Long-term Gap-free High-resolution Air Pollutant concentration dataset, derived via tensor-flow-based multimodal data fusion. Earth Syst. Sci. Data; 2022; 14, pp. 907-927. [DOI: https://dx.doi.org/10.5194/essd-14-907-2022]
37. Chen, X.; Shi, G. Machine learning-based inversion of aerosol optical depth inversion from FY-4A data. Remote Sens. Nat. Resour.; 2025; 37, pp. 213-220.
38. Si, L. Remote Sensing Retrieval of Aerosol Optical Depth Based on Deep Learning and Its Spatiotemporal Pattern and Driving Force Analysis. Ph.D. Thesis; Shandong University: Shandong, China, 2023.
39. Ali, M.A.; Bilal, M.; Wang, Y.; Qiu, Z.F.; Nichol, J.E.; de Leeuw, G.; Ke, S.; Mhawish, A.; Almazroui, M.; Mazhar, U.
40. Fan, R.A.; Ma, Y.Y.; Jin, S.K.; Gong, W.; Liu, B.M.; Wang, W.Y.; Li, H.; Zhang, Y.Q. Validation, analysis, and comparison of MISR V23 aerosol optical depth products with MODIS and AERONET observations. Sci. Total Environ.; 2023; 856, 159117. [DOI: https://dx.doi.org/10.1016/j.scitotenv.2022.159117]
41. Zhang, H. Technical Specification for Space-Ground Interface of Fengyun-3D Satellite (1); Shanghai Engineering Institute of Satellite: Shanghai, China, 2014; pp. 5-20.
42. Tang, S.H.; Qiu, H.; Ma, G. Review on progress of the Fengyun meteorological satellite. J. Remote Sens.; 2016; 20, pp. 842-849.
43. Zhu, A.J.; Hu, X.Q.; Lin, M.Y.; Jia, S.Z.; Ma, Y. Global data acquisition methods and data distribution for FY-3D meteorological satellite. J. Mar. Meteorol.; 2018; 38, pp. 1-10.
44. Yang, L.K.; Hu, X.Q.; Wang, H.; He, X.W.; Liu, P.; Xu, N.; Yang, Z.D.; Zhang, P. Preliminary test of quantitative capability in aerosol retrieval over land from MERSI-II onboard FY-3D. Natl. Remote Sens. Bull.; 2022; 26, pp. 923-940. [DOI: https://dx.doi.org/10.11834/jrs.20210286]
45. Xu, M.J. Study on inversion method of AOD in Beijing-Tianjin-Hebei Region by FY-4A Meteorological Satellite. Master’s Thesis; Nanjing University of Information Science & Technology: Nanjing, China, 2021.
46. Zhang, Y.; Li, Z.; Li, J. A preliminary layer perceptible water vapor retrieval algorithm for Fengyun-4 advanced geosynchronous radiation imager. Proceedings of the IGARSS 2019–2019 IEEE International Geoscience and Remote Sensing Symposium; Yokohama, Japan, 28 July–2 August 2019; pp. 7564-7566.
47. Wang, Y.; Liu, H.; Zhang, Y.; Duan, M.; Tang, S.; Deng, X. Validation of FY-4A AGRI layer precipitable water products using radiosonde data. Atmos. Res.; 2021; 253, 105502. [DOI: https://dx.doi.org/10.1016/j.atmosres.2021.105502]
48. Xu, W.; Wang, W.; Chen, B. Comparison of hourly aerosol retrievals from JAXA Himawari/AHI in version 3.0 and a simple customized method. Sci. Rep.; 2020; 10, 20884. [DOI: https://dx.doi.org/10.1038/s41598-020-77948-5]
49. He, X.; Zhao, L.; Zhai, W.; Qiao, M.; Zhang, W.; Xin, J. Calibration of the Span of Himawari-8 AOD Products in Eastern China. Remote Sens. Lett.; 2021; 12, pp. 1136-1146. [DOI: https://dx.doi.org/10.1080/2150704X.2021.1937371]
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Satellite remote sensing products of Aerosol Optical Depth (AOD) are crucial for environmental monitoring and atmospheric pollution research. However, data gaps in AOD products from satellites such as Fengyun significantly hinder continuous, seamless environmental monitoring, posing challenges for the long-term analysis of atmospheric pollution trends, responses to sudden ecological events, and disaster management. This study develops a high-precision method to fill spatial gaps in AOD and to generate daily full-coverage AOD products for the Beijing–Tianjin–Hebei region in 2021. The method integrates multi-dimensional data, including meteorological model output, multi-source remote sensing, surface conditions, and nighttime light parameters, and applies machine learning. A comparison of five machine learning models showed that the random forest model performed best in AOD inversion, with a root mean square error (RMSE) of 0.11 and a coefficient of determination (R2) of 0.93. Seasonal evaluation further indicated that the model performed best in winter. Variable importance analysis identified relative humidity (RH) as the most influential predictor. The reconstructed full-coverage AOD product showed significantly higher values in the southern plain areas than in mountainous regions, consistent with the actual aerosol distribution in the Beijing–Tianjin–Hebei area. The product was also spatially smooth and highly accurate. This research lays the foundation for establishing a long-term, 1 km resolution, daily, spatially continuous AOD product for the Beijing–Tianjin–Hebei region and beyond, providing more robust data support for addressing regional and larger-scale environmental challenges.
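The gap-filling idea summarized in the abstract (train on pixels where AOD was retrieved, then predict the pixels where it is missing from auxiliary variables) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `fill_aod_gaps` is a hypothetical helper, and `NearestNeighbourRegressor` is a deliberately simple stand-in for the random forest so that the example needs only the standard library.

```python
import math


class NearestNeighbourRegressor:
    """Stand-in model with a fit/predict interface; NOT the study's random forest."""

    def fit(self, features, targets):
        self._features = [tuple(f) for f in features]
        self._targets = list(targets)
        return self

    def predict(self, features):
        predictions = []
        for query in features:
            # Euclidean distance in feature space; features should be on
            # comparable scales for this to be meaningful.
            distances = [math.dist(query, f) for f in self._features]
            predictions.append(self._targets[distances.index(min(distances))])
        return predictions


def fill_aod_gaps(aod, features, model):
    """Return a copy of `aod` with missing values (None) replaced by predictions.

    aod      -- per-pixel AOD values, None where the retrieval is missing
    features -- per-pixel feature vectors (for example RH, BLH, NDVI)
    model    -- any object exposing fit(X, y) and predict(X)
    """
    observed = [i for i, value in enumerate(aod) if value is not None]
    missing = [i for i, value in enumerate(aod) if value is None]
    if not observed:
        raise ValueError("no observed pixels to train on")

    # Train only on pixels with a valid retrieval.
    model.fit([features[i] for i in observed], [aod[i] for i in observed])

    filled = list(aod)  # leave the input untouched
    if missing:
        predicted = model.predict([features[i] for i in missing])
        for i, value in zip(missing, predicted):
            filled[i] = value
    return filled


# Illustrative values only, not data from the study.
aod = [0.2, None, 0.8, None]
features = [(0.10, 0.10), (0.15, 0.10), (0.90, 0.90), (0.80, 0.95)]
filled = fill_aod_gaps(aod, features, NearestNeighbourRegressor())
```

Because `fill_aod_gaps` relies only on `fit` and `predict`, a random forest regressor that follows the same convention, such as scikit-learn's `RandomForestRegressor`, could be passed in place of the stand-in.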
1 Beijing Municipal Climate Center, Beijing 100089, China (H.W.; F.M.; Y.G.)
2 State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
3 School of Economics and Management, Southwest University of Science and Technology, Mianyang 621010, China
4 School of Architectural Engineering, Tianjin University, Tianjin 300072, China; Institute of Water Resources and Hydropower Research, Beijing 100044, China