1 Introduction

Photosynthesis represents the single largest CO $_{2}$ flux between the atmosphere and the biosphere. At the canopy level, the sum of all leaf photosynthesis is termed gross primary productivity (GPP), and accurate characterization of GPP represents a major uncertainty in the carbon cycle . Directly measuring GPP from remote sensing systems (e.g., satellites) is not presently possible. Instead, previous work has utilized stationary measurements of net ecosystem exchange (NEE) from flux towers that can be decomposed into GPP and respiration e.g.,. Observable quantities from satellites (e.g., vegetation indices computed from reflectance data) are then related to GPP inferred from flux towers e.g., in light use efficiency (LUE) or machine learning models to derive global estimates of GPP.

Vegetation indices such as the normalized difference vegetation index (NDVI) and near-infrared reflectance of terrestrial vegetation (NIR $_{v}$ ) combine two (or more) spectral bands with different absorption characteristics to infer quantities related to plant physiology and canopy structure. The MODIS instrument was launched on the Terra and Aqua satellites in 1999 and 2002, respectively. This instrument has proved particularly useful due, in part, to its long operational lifetime, and vegetation indices can be derived from its individual reflectance bands. More recently launched satellites, like Sentinel-5P, carry instruments with the necessary signal-to-noise ratio and spectral resolution to retrieve solar-induced chlorophyll fluorescence (SIF). SIF is an electromagnetic signal emitted by chlorophyll during photosynthesis in the red to far-red wavelength range of 650–850 nm . Alongside photochemistry and nonphotochemical quenching, fluorescence is one of the pathways by which excited chlorophyll returns to its ground state . Although the link between chlorophyll fluorescence and photosynthesis is nonlinear at the leaf and canopy scale, a linear relationship between SIF and GPP is frequently reported at satellite scales e.g.,.

Vegetation indices (also termed greenness) can be regarded as a measure of photosynthetic capacity , whereas SIF indicates photosynthetic activity. SIF has been shown to be a powerful proxy for estimating GPP , to capture the impact of drought on photosynthetic activities across different vegetation types , and to assess the regional source of carbon emissions .

described the first retrievals of SIF from the TROPOspheric Monitoring Instrument (TROPOMI), the sole instrument on the Sentinel-5P satellite. The TROPOMI instrument has an equatorial crossing time of 13:30 local solar time and a 16 d orbit cycle. TROPOMI has a wide swath (2600 km across track) that allows for near-daily temporal resolution and a spatial resolution of 5.5 $\times$ 3.5 km. This was a substantial improvement to previous satellite instruments measuring SIF that were limited to 40 $\times$ 40 km spatial resolution . Despite the higher spatial resolution of TROPOMI, there have been efforts to estimate SIF at finer spatial scales e.g.,. This is motivated by the importance of fine-scale phenomena in the carbon cycle such as ecosystem fragmentation e.g.,.

Globally, 20 % and 70 % of the remaining forests are within a distance of 100 m and 1 km, respectively, from the forest edges, meaning that most of the forests are fragmented . show that the carbon uptake and storage of trees near the forest edge increase up to 13 $\pm$ 3 % and 10 $\pm$ 1 %, respectively. On the other hand most of our understanding about forest carbon fluxes comes from intact ecosystems, resulting in a mismatch between the ecosystems we are trying to quantify and the data we are using to do so . Higher-resolution estimates of photosynthetic activity might enable us to include fragmentation effects of ecosystems to global carbon cycle estimates or biosphere models like , , and . Additionally, recent work has shown the importance of fine-scale variations in the urban biosphere on the overall carbon flux for a city e.g.,.

There has been some recent work with the goal of increasing the resolution of existing global SIF estimates through downscaling methods (i.e., physics-based methods). For example, used NIR $_{v}$ to partition SIF within a particular TROPOMI scene and oversampled it using a 16 d window afterwards, resulting in a daily 500 m SIF estimate over the conterminous United States (CONUS). downscaled GOME-2 satellite SIF from 0.5 to 0.05 $^{\circ}$ using a parameterization with a term for the fraction of absorbed photosynthetically active radiation (fAPAR), one for water stress, and one for heat stress based on MODIS data. used airborne data to downscale far-red SIF from canopy to leaf level.

Machine learning has also been used to create global, high-resolution SIF data sets. , , and used spectral bands from MODIS as input to neural networks that were trained with Orbiting Carbon Observatory 2 (OCO-2) SIF data to build global continuous SIF products at 0.05 $^{\circ}$ resolution. Because OCO-2 has a narrow swath, the networks are trained only in the regions where OCO-2 SIF is available, with MODIS data as input. After training, the global MODIS data are used as input to estimate SIF on a global scale. use MODIS reflectance data as input and predict GOME-2 SIF normalized by clear-sky irradiance. Multiplying that prediction by a MODIS-derived photosynthetically active radiation product results in a MODIS-only SIF estimate, termed RSIF. trained a convolutional neural network (CNN) with MODIS data on the artificial GOSIF data set at a resolution of 0.05 $^{\circ}$ and used the trained network and MODIS data at a resolution of 0.008 $^{\circ}$ to estimate SIF at 0.008 $^{\circ}$ . The physics-based downscaling approach from can only consider one variable for weighting the SIF signal, whereas the machine-learning-based approaches in the literature can consider more than one variable. Many of them, however, do not use SIF data as a model input, meaning that they estimate SIF from reflectance data alone.

Here we build a convolutional neural network to obtain high-resolution SIF, named SIFnet. SIFnet increases the spatial resolution of TROPOMI SIF by considering coarse-resolution SIF with high-resolution auxiliary data as input. These auxiliary data consist of either proxies of SIF or photosynthetic drivers. SIFnet is trained using data with near-global coverage. Different model parameters (structure, input features, and scaling factors) are compared and evaluated. After training the model, the resolution of TROPOMI SIF is refined by a factor of 10 to a spatial resolution of 0.005 $^{\circ}$ . This product is then compared against a recent downscaling method from the literature . Both high-resolution estimates are validated over CONUS against the independent SIF measurements of the OCO-2 and OCO-3 instruments (together OCO-2/3) .

2 Data sets

2.1 Data sources

The input data to the neural network are listed in Table . These diverse global data products are expected to capture a broad range of photosynthetic drivers. The table divides the data into time-varying or time-invariant and training or validation data. The native spatial resolution is shown in the last column of Table . In a first step, all data sets are aggregated to 0.05 $^{\circ}$ spatial resolution and 16 d time steps. Data with a finer native spatial resolution are regridded by computing the mean of all values that fall into each coarse-resolution grid cell. Data with a native resolution coarser than 0.05 $^{\circ}$ are resampled to the common grid. Quality control flags and cloud filtering are applied when necessary.
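The aggregation of finer-resolution data to a coarser grid can be illustrated with a minimal sketch. It assumes that the fine grid dimensions are exact multiples of the coarsening factor and that missing values (NaN) are ignored when averaging; the function name `block_mean` and this NaN handling are illustrative assumptions and are not taken from the processing chain used here.

```python
import math


def block_mean(grid, factor):
    """Average each factor x factor block of a 2-D grid (list of rows).

    Missing values are encoded as NaN and skipped (an assumption of this
    sketch). A block without any valid value yields NaN.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r0 in range(0, n_rows, factor):
        row = []
        for c0 in range(0, n_cols, factor):
            # Collect the valid fine-resolution values inside this coarse cell.
            vals = [grid[r][c]
                    for r in range(r0, r0 + factor)
                    for c in range(c0, c0 + factor)
                    if not math.isnan(grid[r][c])]
            row.append(sum(vals) / len(vals) if vals else math.nan)
        coarse.append(row)
    return coarse
```

With a factor of 10, the same operation would map a 0.005 $^{\circ}$ grid to 0.05 $^{\circ}$ or a 0.05 $^{\circ}$ grid to 0.5 $^{\circ}$ .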

Table 1

Data sets used in this work. ENF: evergreen needleleaf forest; EBF: evergreen broadleaf forest; DNF: deciduous needleleaf forest; DBF: deciduous broadleaf forest; MF: mixed forest; UF: unknown forest.

Data	Category	Variables	Time invariant	Training	Validation	Spatial resolution
Sentinel-5P TROPOMI $^{1}$		SIF at 740 nm		$\times$		0.05 $^{\circ}$
MODIS MCD43A4.v006 (v06) $^{2}$	MODIS bands	NIR, red, blue, green, SWIR1, SWIR2, SWIR3		$\times$		500 m
	Vegetation indices	NIRv, kNDVI, NDVI, EVI		$\times$		500 m
ERA5-Land Hourly – ECMWF Climate Reanalysis $^{3}$	Temperature	Mean air temperature, mean air temperature with 16 d delay		$\times$		0.1 $^{\circ}$
	Precipitation	Total precipitation, total precipitation with 16 d delay		$\times$		0.1 $^{\circ}$
NASA USDA Enhanced SMAP Soil Moisture $^{4}$		Surface soil moisture, subsurface soil moisture		$\times$		10 km
Solar zenith angle $^{5}$		Cosine of the solar zenith angle		$\times$		Computed
USDA GMTED2010: Global Multi-resolution Terrain Elevation Data 2010 $^{6}$		Elevation	$\times$	$\times$		7.5 arcsec
Copernicus Corine global land cover classification (CLC2018) $^{7}$		Non-vegetated, ENF, EBF, DNF, DBF, MF, UF, shrubs, grassland, crops, wetland	$\times$	$\times$		100 m
Forest fragmentation $^{8}$		Forest share	$\times$	$\times$		30 m
		Edge share	$\times$	$\times$		30 m
OCO-2 $^{9}$		SIF at 740 nm			$\times$	2.25 $\times$ 1.29 km
OCO-3 $^{10}$		SIF at 740 nm			$\times$	2.25 $\times$ 1.29 km

References: $^{1}$ ; $^{2}$ ; $^{3}$ ; $^{4}$ ; $^{5}$ ; $^{6}$ ; $^{7}$ ; $^{8}$ ; $^{9}$ ; $^{10}$ .

MODIS measures the reflected radiance from the earth surface in seven different spectral bands covering the visible and infrared spectral region. Vegetation indices are computed by combining the near-infrared (where chlorophyll is non-absorbing) and the red band (where chlorophyll is highly absorbing) . Specifically, the normalized difference vegetation index (NDVI) , near-infrared vegetation index (NIR $_{v}$ ) , the kernel NDVI (kNDVI) , and the enhanced vegetation index (EVI) are computed as follows:

$NDVI = \frac{ρ_{NIR} - ρ_{RED}}{ρ_{NIR} + ρ_{RED}},$ (1)

${NIR}_{v} = ρ_{NIR} \cdot NDVI,$ (2)

$kNDVI = \tanh ({NDVI}^{2}),$ (3)

$EVI = G \cdot \frac{ρ_{NIR} - ρ_{RED}}{ρ_{NIR} + C_{1} \cdot ρ_{RED} - C_{2} \cdot ρ_{BLUE} + L} .$ (4)

The EVI coefficients for MODIS are as follows: $L = 1$ , $C_{1} = 6$ , $C_{2} = 7.5$ , and $G = 2.5$ . $ρ_{NIR}$ is the near-infrared band, $ρ_{RED}$ is the red band, and $ρ_{BLUE}$ is the blue band of the MODIS instrument.
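As an illustration, the four vegetation indices can be computed for a single pixel as in the following sketch. The function name `vegetation_indices` and the example reflectances are hypothetical; the default coefficients are the standard MODIS EVI values ( $L = 1$ , $C_{1} = 6$ , $C_{2} = 7.5$ , $G = 2.5$ ) and can be overridden.

```python
import math


def vegetation_indices(nir, red, blue, G=2.5, C1=6.0, C2=7.5, L=1.0):
    """Return NDVI, NIRv, kNDVI, and EVI for one pixel.

    nir, red, blue: surface reflectances (unitless) of the MODIS bands.
    G, C1, C2, L:   EVI coefficients (defaults: standard MODIS values).
    """
    ndvi = (nir - red) / (nir + red)
    nirv = nir * ndvi
    kndvi = math.tanh(ndvi ** 2)
    evi = G * (nir - red) / (nir + C1 * red - C2 * blue + L)
    return {"NDVI": ndvi, "NIRv": nirv, "kNDVI": kndvi, "EVI": evi}


# Hypothetical reflectances of a moderately vegetated pixel.
vi = vegetation_indices(nir=0.4, red=0.1, blue=0.05)
```

For these example reflectances, NDVI is 0.6 and NIR $_{v}$ is 0.24, which shows how NIR $_{v}$ scales NDVI by the near-infrared reflectance.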

Temperature and precipitation are taken from ERA5-Land data at the time step of interest and with a delay of one time step. Soil moisture has been shown to be a strong driver of global photosynthesis due, in part, to its impact on vapor-pressure deficit . Here we use the coarse-resolution NASA USDA Soil Moisture Active Passive (SMAP) soil moisture as a model input and explore its correlation with TROPOMI SIF. The cosine of the solar zenith angle (SZA) is a proxy for photosynthetically active radiation (PAR) under cloud-free conditions .

Time-invariant data sets consist of elevation data, fractional land cover classification, and forest fragmentation data. The land cover classification is resampled to 11 fractional classes. The forest fragmentation data consist of two bands and have a native resolution of 30 m. One band describes the share of forest within the grid cell (forest share) and the other how much of that forest is edge forest (defined as a maximum distance to an edge or other land cover type of 30 m). OCO-2 and OCO-3 have high spatial resolution (2.25 $\times$ 1.29 km) but small swaths (10 km) and a 16 d revisit time.

2.2 Covariation of input data sets with SIF

We are interested in understanding what these different data sets tell us about SIF and how they co-vary with each other. We compare all collected time-varying data against TROPOMI SIF in the spatial and temporal domain. As a quantitative measure, we compute the Pearson correlation coefficient ( $r$ ) . Figure shows a scatter comparison of SIF against the auxiliary data at the coarser resolution of the two corresponding data sets. Negative SIF values (on the $x$ axis in Fig. ) are due to relatively high retrieval errors which scale with radiance levels .
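A minimal sketch of the Pearson correlation coefficient for two series with gaps is given below. Dropping every pair in which either value is missing (NaN) is an assumption of this sketch, and the function name `pearson_r` is illustrative. Applied to the time series of a single grid cell, it would give the temporal correlation; applied to all grid cells of one time step, the spatial correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equally long sequences, skipping NaN pairs."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < 2:
        return math.nan
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    s_xx = sum((a - mean_x) ** 2 for a, _ in pairs)
    s_yy = sum((b - mean_y) ** 2 for _, b in pairs)
    if s_xx == 0.0 or s_yy == 0.0:
        return math.nan  # correlation is undefined for a constant series
    return s_xy / math.sqrt(s_xx * s_yy)
```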

Figure 1

Scatter comparison of SIF to time-varying auxiliary data. The time span of measurements is from April 2018 to March 2021 at 16 d resolution. Longitude and latitude borders are from $-$ 180 to 180 $^{\circ}$ and $-$ 60 to 70 $^{\circ}$ , respectively. The comparison resolution corresponds to the coarser resolution of the two corresponding products. The resolution is 0.05 $^{\circ}$ for all MODIS data and 0.1 $^{\circ}$ for precipitation, air temperature, surface soil moisture (ssm), and subsurface soil moisture (susm). To quantify the goodness of fit we compute the Pearson correlation coefficient ( $r$ ) for each subplot .

[Figure omitted. See PDF]

Figure 2

Pearson correlation coefficient of NIR $_{v}$ and kNDVI to TROPOMI SIF. Data are compared at 0.05 $^{\circ}$ spatial resolution and in 16 d time steps starting in April 2018 until March 2021. The value per grid cell in (a) and (b) represents the Pearson correlation coefficient of the vegetation index to SIF in time. Panel (c) represents the difference in correlation of the vegetation indices to SIF.

[Figure omitted. See PDF]

Figure shows spatial patterns of the Pearson correlation coefficients of both NIR $_{v}$ and kNDVI with SIF. In both our spatial (Fig. ) and temporal (Fig. ) analyses, we find that NIR $_{v}$ is a better predictor of SIF than kNDVI, which contradicts the recent findings from . However, used GOME-2 SIF instead of TROPOMI SIF.

The vegetation index NIR $_{v}$ outperforms kNDVI in nearly all vegetated regions. Only central Asia, the Sahara, and very high latitudes show a better correlation of kNDVI with SIF. At the same time, these regions generally show a weaker correlation of vegetation indices with SIF.

3 Development and optimization of SIFnet

3.1 Training and optimization of the neural network

Convolutional neural networks (CNNs) are supervised machine learning methods that need matching feature and ground truth data pairs to compute the loss that is backpropagated . As such, we begin by coarsening SIF data to 0.5 $^{\circ}$ and use it with auxiliary data at 0.05 $^{\circ}$ as input to SIFnet, allowing us to estimate SIF at 0.05 $^{\circ}$ . The model output is compared against the measured TROPOMI SIF at 0.05 $^{\circ}$ . After optimization, the model can resolve a scaling factor of 10 between coarse-resolution input SIF and model output SIF. Figure visualizes this method. In the subsequent step of estimating high-resolution SIF, the input SIF data have a resolution of 0.05 $^{\circ}$ and the auxiliary data a resolution of 0.005 $^{\circ}$ , resulting in a model output of SIF at 0.005 $^{\circ}$ .

Figure 3

CNN model structure and training and estimation method. Yellow and red blocks are convolutional and ReLU layers, respectively. Notation of convolutional layers: $k (X 1, X 2)$ : kernel sizes are $X 1, X 2$ ; ch $Y$ : number of channels is $Y$ . For training, the data are upscaled. We input auxiliary data at the target resolution and SIF data at a factor of 10 coarser.

[Figure omitted. See PDF]

Figure shows our chosen CNN model structure for SIFnet. The model consists of convolutional and rectified linear unit (ReLU) layers that are arranged in a sequence. After the first convolutional block there is a residual connection that skips one ReLU and two convolutional layers. Convolutional kernel sizes are either (3,3) or (1,1). This structure is adapted from findings in the literature, for example, . Further, several model structures (Sect. S4.3 in the Supplement) with different numbers of layers, channels, or residual blocks are compared. The chosen model structure represents the best trade-off between complexity and performance. The input feature collinearity and principal component analysis (PCA) presented in Sect. S3 show that some input features have high correlations with each other. A total of 9 out of the 19 PCs in the time-varying data and 13 out of the 15 PCs in the time-invariant data carry more than 99 % of the variance. This suggests that fewer channels should be used in the CNN layers than the feature dimension (because some variables are similar). Therefore, the first layer of SIFnet reduces the 34 input features to 16 channels (Fig. ). More complex model structures did not result in notably improved loss metrics (Sect. S4.3).

For training SIFnet we use 3 years of data (April 2018–March 2021) in 16 d time steps. The study regions are shown in Sect. S2. There are five folds used as training data: two folds over Asia, one over Europe, one over the southern part of Africa, and one over South America. Our validation region is North America (Fig. S4). The hyperparameter tuning is done by training the model on the five folds and computing the loss of the validation data. The parameters are optimized to minimize the loss of the validation data set. Due to computational reasons and the size of the data set, we do not apply a cross validation in the optimization process. The final product consists of high-resolution SIF at 0.005 $^{\circ}$ and is validated against independent SIF measurements of the instruments OCO-2 and OCO-3.

We center and scale each feature individually by subtracting the mean and dividing by the standard deviation. For data augmentation of the training data, we use random crops and random flips. Each time step of one fold is a matrix of 1200 $\times$ 900 pixels, and the 3 years of data comprise 69 time steps of 16 d each. For each input during the training process we randomly crop a matrix with a size of 100 $\times$ 100 pixels. As some areas have a large fraction of missing values (e.g., due to water or clouds), we only use cropped matrices that consist of $>$ 80 % valid pixels in the SIF product. Further, we randomly flip the crops vertically and horizontally, each with a probability of 0.5. These data augmentation methods provide a very large pool of training samples, which should help to avoid overfitting the network parameters.

During training, all missing values in the data are set to zero. This mainly affects water regions, as water causes 91.2 % of the missing values in the SIF data used. If a pixel is missing in the SIF training sample, all feature values of this pixel are also set to zero to ensure the network does not learn false relationships between the predictors and the target variable (this also applies to vegetated regions). For the MODIS bands we kept only pixels with a quality index value of 0 (best quality). This filtering also removes pixels that include clouds. To ensure high coverage, we interpolated the MODIS data in time. Further, training and test folds are selected based on coverage; i.e., the regions near the Equator (between $\pm$ 22.5 $^{\circ}$ ) are not included in the data cubes, as MODIS reflectance is sensitive to clouds, which appear frequently in these regions (compare Figs. S2 and S4). All static variables have full coverage on land. ERA5 data have full spatial and temporal coverage. We did not apply any further quality filtering to the SMAP soil moisture data, which are provided as a level 3 product on Google Earth Engine.
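The cropping, masking, and flipping steps described above can be sketched as follows. This is a minimal illustration, not the training code used here: it assumes that all inputs are already standardized, that missing values are encoded as NaN, and that grids are plain lists of rows. The function name `sample_crop` and the `max_tries` safeguard are hypothetical.

```python
import math
import random


def sample_crop(sif, features, size=100, min_valid=0.8, rng=random,
                max_tries=1000):
    """Draw one augmented training sample from a target grid and feature grids.

    sif:      2-D list of target SIF values (NaN where missing).
    features: list of 2-D lists with the same shape as sif.
    Returns (target_crop, feature_crops) with missing values set to zero.
    """
    n_rows, n_cols = len(sif), len(sif[0])
    for _ in range(max_tries):
        r0 = rng.randrange(n_rows - size + 1)
        c0 = rng.randrange(n_cols - size + 1)
        target = [row[c0:c0 + size] for row in sif[r0:r0 + size]]
        n_valid = sum(not math.isnan(v) for row in target for v in row)
        if n_valid > min_valid * size * size:
            break  # keep only crops with more than 80 % valid SIF pixels
    else:
        raise RuntimeError("no crop with enough valid pixels found")

    crops = [[row[c0:c0 + size] for row in f[r0:r0 + size]] for f in features]

    # Zero all features where the target is missing, and zero any remaining
    # missing feature values.
    for i in range(size):
        for j in range(size):
            target_missing = math.isnan(target[i][j])
            for crop in crops:
                if target_missing or math.isnan(crop[i][j]):
                    crop[i][j] = 0.0
            if target_missing:
                target[i][j] = 0.0

    if rng.random() < 0.5:  # vertical flip (reverse the row order)
        target = target[::-1]
        crops = [crop[::-1] for crop in crops]
    if rng.random() < 0.5:  # horizontal flip (reverse each row)
        target = [row[::-1] for row in target]
        crops = [[row[::-1] for row in crop] for crop in crops]
    return target, crops
```

Because the inputs are standardized, setting a missing value to zero is equivalent to replacing it with the feature mean.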

Our loss function comprises two terms. We combine the mean squared error (MSE) with the structural dissimilarity index (DSSIM). The DSSIM is the complement of the structural similarity index (SSIM): DSSIM $=$ 1 $-$ SSIM . Therefore, we optimize not only the overall deviation of the estimated SIF from the measured SIF but also the agreement of their structural patterns. Section S4.4 shows the benefit of including both the MSE and the DSSIM term in the loss function. Equation () shows our loss function:

$\begin{aligned} L & = a \cdot MSE + b \cdot DSSIM \\ & = a \cdot \frac{1}{n} \cdot \sum_{i = 1}^{n} {(y_{i} - \tilde{y_{i}})}^{2} + b \cdot \left( 1 - \frac{(2 \cdot μ_{Y} \cdot μ_{\tilde{Y}} + c_{1}) \cdot (2 \cdot σ_{Y \tilde{Y}} + c_{2})}{(μ_{Y}^{2} + μ_{\tilde{Y}}^{2} + c_{1}) \cdot (σ_{Y}^{2} + σ_{\tilde{Y}}^{2} + c_{2})} \right), \end{aligned}$ (5)

where $n$ is the number of data points, $y_{i}$ is the data point $i$ in measured (target variable) SIF, $\tilde{y_{i}}$ is the data point $i$ in estimated SIF, $Y$ values are all data points of measured (target variable) SIF, $\tilde{Y}$ values are all data points of estimated SIF, $μ_{Y}$ is the mean of $Y$ , $μ_{\tilde{Y}}$ is the mean of $\tilde{Y}$ , $σ_{Y}^{2}$ is the variance of $Y$ , $σ_{\tilde{Y}}^{2}$ is the variance of $\tilde{Y}$ , $σ_{Y \tilde{Y}}$ is the covariance of $Y$ and $\tilde{Y}$ , and $c_{1} = (k_{1} L)^{2}$ and $c_{2} = (k_{2} L)^{2}$ are constants for stabilization with $k_{1} = 0.01$ and $k_{2} = 0.03$ . In $c_{1}$ and $c_{2}$ , $L = 2^{\text{bits per pixel}} - 1$ denotes the dynamic range of the pixel values (not to be confused with the loss $L$ ). The parameters $a$ and $b$ weight the two individual losses in the overall loss. The overall model performance did not show a notable sensitivity to different $a$ and $b$ values. To approximately keep the individual losses in the same order of magnitude, we set $a = 1$ and $b = 0.3$ . DSSIM is in the range of 0 to 1, with 0 meaning structurally similar and 1 structurally dissimilar.
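A minimal sketch of such a combined loss is given below. It uses the standard SSIM definition and, as a simplifying assumption, evaluates the means, variances, and covariance once over all pixels rather than in sliding windows. The function name `mse_dssim_loss` and the default `dynamic_range` of 1.0 are illustrative choices, not values taken from this work.

```python
def mse_dssim_loss(y, y_hat, a=1.0, b=0.3, dynamic_range=1.0,
                   k1=0.01, k2=0.03):
    """Weighted sum of MSE and DSSIM = 1 - SSIM for two flat sequences.

    y:     measured (target) SIF values.
    y_hat: estimated SIF values.
    """
    n = len(y)
    mse = sum((t - p) ** 2 for t, p in zip(y, y_hat)) / n

    mu_y = sum(y) / n
    mu_p = sum(y_hat) / n
    var_y = sum((t - mu_y) ** 2 for t in y) / n
    var_p = sum((p - mu_p) ** 2 for p in y_hat) / n
    cov = sum((t - mu_y) * (p - mu_p) for t, p in zip(y, y_hat)) / n

    # Stabilization constants of the SSIM.
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    ssim = ((2 * mu_y * mu_p + c1) * (2 * cov + c2)) / (
        (mu_y ** 2 + mu_p ** 2 + c1) * (var_y + var_p + c2))
    return a * mse + b * (1.0 - ssim)
```

For identical inputs both terms vanish, so the loss is zero; any deviation in magnitude increases the MSE term, while a deviation in spatial pattern increases the DSSIM term.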

We use the optuna library to tune the learning rate, weight decay, and number of epochs of the CNN . Here, a tree-structured Parzen estimator sampler, which is based on a Gaussian mixture model, suggests the parameters of the next trial. Section S4.1 provides more details on this hyperparameter tuning.

3.2 Results of model optimization

Figure summarizes the results of the optimized model. We observe an overall $r^{2}$ of 0.92, SSIM of 0.87, and RMSE of 0.17 mW m $^{- 2}$ sr $^{- 1}$ nm $^{- 1}$ between the estimated SIF from SIFnet and retrieved SIF from TROPOMI at 0.05 $^{\circ}$ (Fig. e). SSIM is calculated by comparing the average SIF signal of the 3 years under investigation. Figure e shows the three metrics for each month of the year. We observe the lowest $r^{2}$ values in January, February, and March. These are associated with low SIF values and, consequently, lower signal-to-noise ratios which drive the decreased performance. SSIM also indicates reduced performance during this time period. RMSE values are correlated with overall productivity with the lowest RMSE in winter; this is expected as this metric depends on the magnitude of the signal.

Figure 4

Test set results of CNN training at 0.05 $^{\circ}$ . Panel (a) shows low-resolution SIF that is used as model input, (b) shows the estimated SIF at 0.05 $^{\circ}$ by SIFnet, (c) shows the measured TROPOMI SIF at 0.05 $^{\circ}$ from , (d) shows the scatter comparison between TROPOMI SIF and the SIFnet estimate at 0.05 $^{\circ}$ , and (e) shows for each investigated month the metrics $r^{2}$ , SSIM, and RMSE. Metrics are calculated at 16 d resolution and averaged to monthly values afterwards.

[Figure omitted. See PDF]

3.3 Which features drive SIFnet?

We are particularly interested in understanding which features drive our neural net. Here we evaluate the feature importance using the permutation feature importance method with our North American validation data at a target resolution of 0.05 $^{\circ}$ . The method first computes the RMSE including all input features (RMSE $_{orig .}$ ). We then apply the following three steps.

(1) Shuffle all pixels of one input feature randomly in time and space.
(2) Compute the new RMSE of the estimation (RMSE $_{F, shuf .}$ ).
(3) Compare the shuffled RMSE to the original: $d_{F, shuf .} = {RMSE}_{F, shuf .} / {RMSE}_{orig .}$ .
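The three steps above can be sketched as follows for a generic model. The names `permutation_importance`, `predict`, and the dictionary layout of the features are hypothetical: each feature is assumed to be flattened into one list over time and space, and `predict` stands in for the trained network.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error of two equally long sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def permutation_importance(predict, X, y, rng=None):
    """Ratio RMSE_shuffled / RMSE_original for every feature in X.

    predict: callable mapping a feature dict to a list of predictions.
    X:       dict of feature name -> list of values (flattened in time/space).
    y:       list of target values.
    """
    rng = rng or random.Random(0)
    rmse_orig = rmse(y, predict(X))
    ratios = {}
    for name in X:
        shuffled = dict(X)            # shallow copy; other features untouched
        column = list(X[name])
        rng.shuffle(column)           # step 1: shuffle one feature
        shuffled[name] = column
        rmse_shuf = rmse(y, predict(shuffled))   # step 2: new RMSE
        ratios[name] = rmse_shuf / rmse_orig     # step 3: compare to original
    return ratios
```

A ratio close to 1 indicates that the model output barely depends on the feature, whereas larger ratios indicate more important features.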

Figure 5

Feature importance. Panel (a) shows the total RMSE of the permuted feature divided by the RMSE without feature permutation, and (b) shows the RMSE of each pixel with permuted features divided by the RMSE without feature permutation. Some input variables are clustered, and all variables of that class are permuted at the same time. $ρ_{MODIS}$ : all seven MODIS bands; LULC: all 11 land cover classes; other VIs: kNDVI, NDVI, and EVI; Meteor.: temperature, precipitation, temperature with 16 d delay, and precipitation with 16 d delay; SM: surface soil moisture and subsurface soil moisture.

[Figure omitted. See PDF]

Figure shows the feature importance of clustered input classes and individual features to the overall estimation. Multiple applications of the feature permutation yielded negligible differences in feature importance. Figure a shows the ratio of the RMSE with shuffled data to the RMSE with unshuffled data. SIFnet finds low-resolution SIF (SIF $_{LR}$ ) to be the most important input variable, followed by the vegetation index NIR $_{v}$ and the cosine of the solar zenith angle ( $\cos (SZA)$ ). All other variables do not contribute notably to the model output. This result strengthens our findings from Figs. and that NIR $_{v}$ is better correlated with SIF from TROPOMI than kNDVI. Further, our feature importance is in line with , which reports a high correlation of SIF with NIR $_{v}$ multiplied by photosynthetically active radiation (PAR), for which $\cos (SZA)$ can be used as a proxy. The CNN is a data-driven method and is not restricted by LUE terms. Although SM and meteorology (air temperature and precipitation) play a key role for photosynthesis, we find that they are not important to our model output. This does not necessarily imply that SIF is not linked to these parameters, and it can be explained as follows. (1) The SM and ERA5 variables are at a coarser resolution (10 000 and 11 132 m, respectively) than the model output of the training phase, which is at 0.05 $^{\circ}$ . Therefore, a pixel at 0.05 $^{\circ}$ does not have its own unique SM or ERA5 value; instead, multiple pixels fall within one SM or ERA5 pixel. (2) The model does not estimate higher-resolution SIF from the auxiliary data alone but combines them with coarse-resolution SIF. Therefore, events like heat stress that affect an area larger than a model output pixel might already be represented in the coarse-resolution SIF. (3) We have aggregated the data used to 16 d time steps. LUE parameters influencing SIF might have a bigger impact on the estimation at higher temporal resolutions.

Figure b shows the spatial feature importance over the validation set in North America for the four most important features. We observe that SIF $_{LR}$ has the biggest impact in the eastern US, which corresponds strongly to the high vs. low productivity regions in the US. NIR $_{v}$ is a strong predictor in the southeastern US and in shrub regions in the western US. The contribution of $cos⁡ (SZA)$ is highest at high latitudes and weakens at lower latitudes. NIR $_{v}$ is found to be less predictive of SIF at high latitudes. The land mask is the fourth most important input feature and contributes most in shrub regions. These four features consistently stand out as the strongest predictors. Other inputs such as fragmentation and soil moisture were not found to be strong predictors here. In Sect. S4.6 we test higher scaling factors between low- and high-resolution SIF. Even with scaling factors of 20 and 50 low-resolution SIF stays the most and second most important input feature, respectively.

We also examined different combinations of inputs such as directly including the MODIS bands as opposed to vegetation indices derived from MODIS bands (see Fig. S11). Low-resolution SIF remains the most important feature, followed by the NIR band $ρ_{NIR}$ . The land cover products increase in relevance. Interestingly, when low-resolution SIF is omitted as input for the model, we observe contrasting results to Fig. in which NIR $_{v}$ is no longer a leading predictor. We find that $cos⁡ (SZA)$ , $ρ_{NIR}$ , kNDVI, and NDVI are the four most important features in this case (see Fig. S12). This may result from the collinearity between input features or suggests that another combination of $ρ_{NIR}$ and $ρ_{RED}$ is better correlated to SIF than, for example, NIR $_{v}$ or kNDVI. This finding was robust to multiple optimizations and permutations.

3.4 Comparison of SIFnet to downscaled SIF

Figure shows the 0.005 $^{\circ}$ SIF estimated by SIFnet and downscaled SIF from . The difference between the two SIF estimates can be seen in Fig. c. SIFnet predicts lower SIF in the western US drylands and higher SIF over forested regions in the eastern US. This prediction of lower SIF in drylands is interesting because resorted to an ad hoc bias correction in these regions due to a low signal-to-noise ratio. Recent work from concluded that SIF and NIR $_{v}$ capture complementary events in western US drylands as a proxy for GPP and that the linear correlation of SIF to GPP was substantially lower in these regions compared to other vegetation types. We also observe systematic differences in the predicted SIF in urban areas. These regions are further evaluated in Fig. S17. Notably, the SIFnet estimate is systematically lower than the downscaling estimate in most urban regions examined here, Seattle being a notable exception. Both SIFnet and the downscaling approach allocate SIF to large urban parks and green spaces, but SIFnet predicts little-to-no SIF over the rest of the urban area. In particular, SIFnet estimates nearly zero SIF in the urban core of Los Angeles and San Francisco. SIFnet and the downscaling method predict comparable SIF as we move away from the urban core. Fine-scale features in the urban region are visible in both SIF estimates such as the Schiller Woods in Chicago (42.0 $^{\circ}$ N, 87.8 $^{\circ}$ W).

Figure 6

SIFnet estimated SIF at 0.005 $^{\circ}$ for CONUS and its comparison to downscaled SIF. Panel (a) shows the SIFnet estimated SIF at 0.005 $^{\circ}$ , (b) shows the downscaled SIF from , and (c) shows the difference between SIFnet and downscaled SIF. Negative values imply a higher SIFnet SIF and positive values a higher downscaled SIF value.

[Figure omitted. See PDF]

4 Validation against OCO-2/3 SIF

The differences between the SIF predicted by SIFnet and the downscaled SIF raise the question: which is correct? Here we evaluate both SIF products against independent SIF observations from OCO-2 and OCO-3 . These instruments have higher spatial resolution than TROPOMI and, as such, can be used to evaluate the high-resolution patterns predicted by both SIFnet and the downscaling approach. Specifically, OCO-2 and OCO-3 have nadir footprint sizes of 2.25 $\times$ 1.29 km. However, OCO-2 and OCO-3 do not provide full spatial coverage. They observe narrow swaths that are $\sim$ 10 km across-track. OCO-3 also provides a scanning mode to observe urban areas. Here, we use quality-checked OCO-2 data from April 2018 until March 2021 and OCO-3 data from July 2019 until March 2021. To compare the ungridded OCO-2 and OCO-3 data against the SIF estimated from TROPOMI, we compute the weighted average of all 0.005 $^{\circ}$ grid cells that fall within the bounds of an OCO footprint. Here, the TROPOMI estimates are subsampled to 0.0005 $^{\circ}$ (approx. 50 m at the Equator), and the mean value is computed for all values which fall into the OCO footprint. For a quantitative comparison between OCO-2/3 and the SIFnet and downscaled estimate, the metrics $r$ , $r^{2}$ , and RMSE are computed.

The high-resolution SIF estimates from SIFnet, the downscaling, and OCO are instantaneous SIF measurements taken at a specific time of the day, while the time of TROPOMI observations can differ substantially. Here, we compute the daily average SIF by scaling with the cosine of the SZA :

$Daily SIF (x, y) = SIF (τ_{s}, x, y) \cdot \frac{\int_{τ_{o}}^{τ_{f}} \cos [SZA (τ, x, y)] \, d τ}{\cos [SZA (τ_{s}, x, y)]},$ (6)

where $Daily SIF (x, y)$ is the daily integrated SIF estimate, $SIF (τ_{s}, x, y)$ is the instantaneous SIF at the individual measurement time, SZA is the solar zenith angle, $τ_{s}$ is the time of the satellite measurement, $τ_{o}$ is the time of sunrise, and $τ_{f}$ is the time of sunset. This implicitly assumes that both PAR and SIF scale with $\cos (SZA)$ under cloud-free conditions, and we neglect Rayleigh scattering, as well as gas absorption. Although this approach neglects several water or light conditions, it provides our best estimate of daily SIF and enables comparison between multiple SIF products with different measurement times . The method is equivalent to the daily correction scheme for OCO-2, OCO-3, and TROPOMI . Additionally, we performed a sensitivity study in which we trained SIFnet using daily corrected SIF and found the results to be generally insensitive to the use of instantaneous vs. daily corrected SIF (see Fig. S18). Following this, we chose to apply the daily correction after deriving the high-resolution SIF.
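The daily correction can be illustrated with a small numerical sketch. Several assumptions are made that are not stated in the text: time is measured in days (so that the integral yields a daily mean), the solar declination is supplied by the caller, $\cos (SZA)$ follows the standard spherical-astronomy relation in local solar time, and the integral is approximated with the midpoint rule. Clipping $\cos (SZA)$ at zero is equivalent to integrating from sunrise to sunset. The function names are hypothetical.

```python
import math


def cos_sza(lat_deg, decl_deg, hour_angle_rad):
    """Cosine of the solar zenith angle from latitude, declination, hour angle."""
    lat = math.radians(lat_deg)
    decl = math.radians(decl_deg)
    return (math.sin(lat) * math.sin(decl)
            + math.cos(lat) * math.cos(decl) * math.cos(hour_angle_rad))


def daily_correction_factor(lat_deg, decl_deg, solar_time_h, n_steps=1440):
    """Factor that scales instantaneous SIF to a daily estimate.

    solar_time_h: local solar time of the overpass in hours (12 = solar noon);
                  the sun must be above the horizon at that time.
    """
    dt = 1.0 / n_steps  # time step in days (unit is an assumption)
    integral = 0.0
    for i in range(n_steps):
        t_h = 24.0 * (i + 0.5) / n_steps             # midpoint of the step
        h = math.radians(15.0 * (t_h - 12.0))        # hour angle
        integral += max(cos_sza(lat_deg, decl_deg, h), 0.0) * dt
    h_s = math.radians(15.0 * (solar_time_h - 12.0))
    return integral / cos_sza(lat_deg, decl_deg, h_s)


# Example: Equator at equinox, overpass at 13:30 local solar time.
factor = daily_correction_factor(lat_deg=0.0, decl_deg=0.0, solar_time_h=13.5)
```

Multiplying an instantaneous SIF value by this factor gives the corresponding daily estimate under the stated assumptions. At the Equator at equinox with an overpass at solar noon, the factor approaches $1 / π$ .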

Figure 7

Validation of SIFnet and downscaled SIF to OCO-2 and OCO-3 SIF over CONUS. Comparison from April 2018 until March 2021 in 16 d time steps. Daily OCO-2 and OCO-3 data are assigned to the closest 16 d time step. Panel (a) shows the gridded correlation of the two products against the combined data of OCO-2 and OCO-3. We first compute the SIF data from SIFnet and downscaling estimate that falls into the OCO footprint. Then we assign every OCO footprint to the closest grid point on the 1 $^{\circ}$ grid dependent on the center location of that footprint and compute the Pearson correlation coefficient. Panel (b) shows the scatter comparison of the weighted average of all grid cells on the 0.005 $^{\circ}$ estimated SIFnet SIF (ours) and downscaled SIF that fall into the OCO-2 or OCO-3 footprint.

[Figure omitted. See PDF]

Figure 7 shows a comparison of both SIFnet and the downscaled SIF to OCO-2 and OCO-3. Specifically, Fig. 7a shows the correlation of SIFnet and the downscaled estimate with OCO-2/3 for every 1 $^{\circ}$ pixel over CONUS. Both SIFnet and the downscaled SIF generally show good agreement, with $r$ in excess of 0.7 for most of the high-productivity regions. We observe weaker correlations in the western drylands due, in part, to a lower signal-to-noise ratio. Overall, we find SIFnet to perform systematically better than the downscaled SIF, as shown in the difference plot. Figure 7b summarizes these spatial patterns in a scatterplot comparison. SIFnet again shows better performance than the downscaled SIF against OCO-2, OCO-3, and OCO-2/3. The Pearson correlation coefficient $r$ is 0.78 and 0.72 for the SIFnet and downscaled estimate, respectively, when comparing to all OCO data (right column in Fig. 7b). The generally high RMSE indicates different scales and variability in the data sets.
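The gridded correlation described above (footprints assigned to the closest point of a 1 $^{\circ}$ grid by their center location, followed by a Pearson correlation per grid point) could be sketched as follows. The function name, the minimum number of footprints per cell, and the handling of cells without variance are assumptions made for illustration.

```python
import math
from collections import defaultdict


def gridded_correlation(lats, lons, estimated, observed, cell_deg=1.0, min_count=3):
    """Pearson r per grid cell, with footprints binned by their center location."""
    cells = defaultdict(list)
    for lat, lon, est, obs in zip(lats, lons, estimated, observed):
        # Index of the closest grid point on a regular cell_deg grid
        key = (round(lat / cell_deg), round(lon / cell_deg))
        cells[key].append((est, obs))

    result = {}
    for key, pairs in cells.items():
        if len(pairs) < min_count:
            continue  # too few footprints for a meaningful correlation
        est_vals = [p[0] for p in pairs]
        obs_vals = [p[1] for p in pairs]
        mean_e = sum(est_vals) / len(pairs)
        mean_o = sum(obs_vals) / len(pairs)
        cov = sum((e - mean_e) * (o - mean_o) for e, o in pairs)
        var_e = sum((e - mean_e) ** 2 for e in est_vals)
        var_o = sum((o - mean_o) ** 2 for o in obs_vals)
        if var_e > 0 and var_o > 0:
            result[key] = cov / math.sqrt(var_e * var_o)
    return result
```

The returned dictionary maps integer grid indices (latitude index, longitude index) to the correlation for that cell; cells with fewer than `min_count` footprints are omitted.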

Deviations between TROPOMI and OCO-2/3 also appear on a 0.05 $^{\circ}$ grid (Fig. S19). The $r^{2}$ coefficient is 0.61 and 0.62 between TROPOMI and OCO-2 and OCO-3 SIF, respectively. Indeed, one might expect better correlations here as both present SIF at 740 nm. However, as pointed out in , the uncertainty of both TROPOMI and OCO-2 SIF is expected to lead to a certain spread between the data sets. In addition, we do not account for differences in acquisition times and viewing–illumination geometry, which can lead to additional uncertainties in this comparison. For reference, when comparing single footprints of TROPOMI SIF to aggregated OCO-2 SIF for June 2018 globally, found an $r^{2}$ of 0.67; only additional aggregation leads to an $r^{2}$ of 0.88. The mean deviation of TROPOMI SIF to OCO-2 SIF is close to the average standard deviation of TROPOMI SIF (0.4 mW m $^{- 2}$ sr $^{- 1}$ nm $^{- 1}$ ). In our analysis, from the 16 d product from TROPOMI SIF for April 2018 until March 2021 at 0.05 $^{\circ}$ , we observe an average error in the TROPOMI SIF of 0.43 mW m $^{- 2}$ sr $^{- 1}$ nm $^{- 1}$ for CONUS. That error is close to the RMSE between instantaneous TROPOMI SIF and instantaneous OCO-2 SIF (0.37 mW m $^{- 2}$ sr $^{- 1}$ nm $^{- 1}$ ). To compare TROPOMI and OCO-2/3 SIF we aggregate the OCO-2/3 footprints to the same grid as our TROPOMI data (0.05 $^{\circ}$ ). As we aggregate multiple OCO-2 or OCO-3 footprints to match one TROPOMI grid cell at 0.05 $^{\circ}$ , the certainty of the OCO measurements increases, and therefore the RMSE between TROPOMI and OCO SIF decreases.
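The effect of aggregation on the OCO uncertainty can be made explicit with a short worked relation, offered as an idealized sketch rather than the error model used in the study. Assuming that the $N$ OCO footprints aggregated into one 0.05 $^{\circ}$ grid cell carry independent, zero-mean random errors with a common standard deviation $\sigma$ (an assumption; retrieval errors may be correlated in practice), the random error of their mean is

$\sigma_{\overline{\mathrm{SIF}}} = \frac{\sigma}{\sqrt{N}},$

so that, for example, averaging $N = 4$ footprints halves the random error of the OCO value entering the comparison.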

Figure 8

SIFnet and downscaled SIF, the difference between these, and the difference in correlation to OCO-2 and OCO-3 for four urban regions. The first column shows the SIFnet estimate, the second the downscaled SIF from , the third the difference between SIFnet and downscaled SIF, and the last the difference in correlation of the high-resolution SIF estimates to combined OCO-2 and OCO-3 data on a 0.02 $^{\circ}$ grid multiplied with the $L^{1}$ norm between the SIFnet and downscaled estimate.

[Figure omitted. See PDF]

Figure 8 presents a detailed comparison of SIFnet and the downscaled SIF in four US cities. The first column shows the SIFnet estimate, the second the downscaled SIF from , the third the difference between SIFnet and downscaled SIF, and the last column the difference in correlation of the high-resolution SIF estimates to combined OCO-2 and OCO-3 data on a 0.02 $^{\circ}$ grid multiplied by the $L^{1}$ norm between the SIFnet and the downscaled SIF. The column on the right highlights both regions where the differences in predicted SIF are large and which product is performing better. As such, the right column will show white in areas where the difference in predicted SIF is small or the correlation with OCO is similar. While we observe large differences in predicted SIF for the urban areas (column 3), we do not find one product to perform systematically better in urban areas. This likely indicates the complexity in the SIF signal arising from urban areas. Additionally, urban areas make up a small fraction of the overall land mass and, as such, do not represent a large share of the training data in SIFnet. These factors likely contribute to the heterogeneous performance observed in the right column of Fig. 8.

Figure 9

SIFnet SIF, downscaled SIF, MODIS NIR $_{v}$ , and a Google Earth cut-out for a part of Chicago. Left panel shows the SIFnet estimate, second panel shows the downscaled estimate from , third panel shows MODIS NIR $_{v}$ for Chicago, and last panel shows the Google Earth cut-out . For panels 1–3 the average data for April 2018 until March 2021 is shown.

[Figure omitted. See PDF]

However, there are some notable successes of SIFnet in urban areas that can be mapped directly to features in the urban area. Figure 9 shows both SIFnet and the downscaled SIF along with NIR $_{v}$ from MODIS and a true color image of Chicago. A feature clearly stands out in both the downscaled SIF and NIR $_{v}$ image. This is a region with missing NIR $_{v}$ and effectively no downscaled SIF. However, SIFnet does not show a strong gradient here. This region corresponds to the Chicago airport. The MODIS NIR $_{v}$ image shows that no valid data are available for that region for the 3 investigated years. The downscaling method from relies only on NIR $_{v}$ in the weighting function. If there are no data available for the region, they are interpolated in space and time. This suggests that the method fails in urban regions where no MODIS NIR $_{v}$ signal is available. SIFnet handles this region better and seems to rely on other auxiliary data if there is no MODIS NIR $_{v}$ available. In Fig. it is also visible that the SIFnet estimate correlates better with the OCO-2/3 data than the downscaled SIF for the region of the Chicago airport.

5 Conclusions

Here, we develop a convolutional neural network (CNN) model named SIFnet to increase the resolution of TROPOMI SIF by a factor of 10. The novelty of our method consists of using coarse-resolution SIF measurements together with high-resolution auxiliary data as model input to estimate high-resolution SIF. After optimization and hyperparameter tuning of SIFnet, the estimated SIF at 500 m resolution yields an $r^{2}$ of 0.92 and an RMSE of 0.17 mW m $^{- 2}$ sr $^{- 1}$ nm $^{- 1}$ when compared against validation data (Fig. ). We further compare the output of SIFnet against a recently developed downscaling method to estimate high-resolution SIF and evaluate both methods against independent observations from the Orbiting Carbon Observatory 2 and 3 (OCO-2/3). SIFnet is found to perform systematically better than the downscaling approach when compared against independent measurements. Through interpretable machine learning methods, we identify the key features that SIFnet utilizes to accurately predict high-resolution spatial patterns of SIF. We find that SIFnet relies heavily on the low-resolution SIF feature (SIF $_{LR}$ ) and the vegetation index NIR $_{v}$ (Fig. ).

SIFnet is a multi-layer CNN that increases the spatial resolution of the TROPOMI SIF by a factor of 10. Our model uses auxiliary data sets related to gross primary productivity and SIF as inputs and yields a high-resolution SIF estimate. The model is trained using three years of data from Asia, Europe, Africa, and South America. North America is used as the validation data set. Our loss function is composed of two terms: the mean squared error and the structural dissimilarity index. The combination of these two terms improved the performance of our model.
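A two-term loss of this kind can be sketched as follows. This is a simplified illustration, not the SIFnet training code: the structural dissimilarity is computed here as $(1 - \mathrm{SSIM})/2$ from global image statistics rather than over sliding windows, the constants follow the common SSIM defaults ($K_{1} = 0.01$, $K_{2} = 0.03$), and the relative weight `dssim_weight` as well as the assumed data range are hypothetical parameters.

```python
def mse(pred, target):
    """Mean squared error between two flattened images."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dssim(pred, target, data_range=1.0):
    """Structural dissimilarity (1 - SSIM) / 2 from global image statistics."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(target) / n
    var_p = sum((p - mu_p) ** 2 for p in pred) / n
    var_t = sum((t - mu_t) ** 2 for t in target) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, target)) / n
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p * mu_p + mu_t * mu_t + c1) * (var_p + var_t + c2)
    )
    return (1.0 - ssim) / 2.0


def combined_loss(pred, target, dssim_weight=1.0):
    """MSE plus weighted structural dissimilarity (the weighting is an assumption)."""
    return mse(pred, target) + dssim_weight * dssim(pred, target)


# Hypothetical flattened 2x2 images, for illustration only
print(combined_loss([0.1, 0.4, 0.3, 0.8], [0.2, 0.4, 0.2, 0.7]))
```

The mean squared error penalizes pixel-wise deviations, while the structural term penalizes mismatches in spatial pattern, which is one plausible reading of why combining the two terms helps.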

SIFnet was further compared to the recent downscaled SIF product developed by . The two high-resolution estimates showed pronounced differences across the western US drylands. This difference is particularly interesting because these drylands tend to be low-productivity regions and traditionally have been difficult for SIF to accurately capture due to the low signal-to-noise ratio. Both high-resolution SIF estimates were compared to independent observations from OCO-2/3. SIFnet performed systematically better than the downscaled SIF ( $r = 0.78$ for SIFnet, $r = 0.72$ for downscaling). SIFnet and the downscaling method also yielded differences in urban regions. However, there was substantial heterogeneity in the performance of SIFnet and downscaling in urban areas. One product did not perform systematically better than the other within urban areas. The mixed results in urban areas likely relate to both the complexity of the photosynthetic activity in urban areas and the lack of training data, as urban areas represent a small fraction of the total landmass.

We adapted techniques from the area of interpretable machine learning to assess the key features driving SIFnet. Specifically, we randomly permuted the input data sets and assessed the impact on the resulting RMSE. From this, we found that SIFnet relies most heavily on the low-resolution SIF feature (SIF $_{LR}$ ). The second most important factor is the MODIS vegetation index NIR $_{v}$ . NIR $_{v}$ is also found to outperform the recently proposed kNDVI vegetation index, in contrast to . The interpretable machine learning approach also allowed us to identify spatial regions of importance for the different parameters. Interestingly, SIFnet relies more heavily on NIR $_{v}$ in the western drylands where the SIF signal-to-noise ratio is low. This implies that SIFnet is picking up on key physics that lead to the improved performance relative to the downscaling method. Overall, SIFnet represents a robust method to infer continuous high-spatial-resolution information about processes related to gross primary productivity.
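The permutation-based feature assessment can be illustrated with a generic sketch. It is not the analysis code of the study: the toy model stands in for the trained network, features are shuffled across samples rather than across space, and the function names and number of repeats are assumptions.

```python
import math
import random


def rmse(pred, target):
    """Root mean square error between predictions and targets."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def permutation_importance(model, rows, target, n_repeats=5, seed=0):
    """Mean RMSE increase when one feature column is shuffled across samples."""
    rng = random.Random(seed)
    baseline = rmse([model(row) for row in rows], target)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the samples with only feature j replaced by its shuffled values
            shuffled = [row[:j] + [value] + row[j + 1:]
                        for row, value in zip(rows, column)]
            increases.append(rmse([model(row) for row in shuffled], target) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy stand-in for the trained network: it depends on feature 0 only
def toy_model(row):
    return 2.0 * row[0]


rows = [[float(i), float((7 * i) % 11)] for i in range(20)]
target = [toy_model(row) for row in rows]
print(permutation_importance(toy_model, rows, target))
```

A feature whose permutation leaves the RMSE unchanged carries no information for the model, whereas a large RMSE increase marks a feature the model relies on, which is the logic behind ranking SIF $_{LR}$ and NIR $_{v}$ above.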

Data availability

The high-resolution SIF for CONUS from April 2018 until March 2021 is available here: 10.5281/zenodo.6321987 . Further data can be requested from the authors.

The supplement related to this article is available online at: https://doi.org/10.5194/bg-19-1777-2022-supplement.

Author contributions

JG, AJT, and JC conceived the study. JG compiled data sets, conducted data analysis, and generated figures. JG wrote the manuscript. All authors edited the manuscript and provided feedback. JG did the literature research. PK and CF generated the TROPOMI SIF data. JC provided project guidance. All authors contributed to the discussion and interpretation of the results. All authors have read and agreed to the published version of the manuscript.

Competing interests

The contact author has declared that neither they nor their co-authors have any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements

We thank Xiaojing Tang, Luca Lloyd, and Lucy Hutyra from Boston University, USA, for providing us with their valuable global data about forest fragmentation.

Financial support

This research has been supported by the Institute for Advanced Study, Technische Universität München (grant no. 291763), the Deutsche Forschungsgemeinschaft (grant no. 419317138), the NASA Early Career Faculty program (grant no. 80NSSC21K1808), and the NASA Carbon Cycle Science program (grant no. 80HQTR21T0101). This work was supported by the German Research Foundation (DFG) and the Technical University of Munich (TUM) in the framework of the Open Access Publishing Program.

Review statement

This paper was edited by Martin De Kauwe and reviewed by two anonymous referees.

© 2022. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Gross primary productivity (GPP) is the sum of leaf photosynthesis and represents a crucial component of the global carbon cycle. Space-borne estimates of GPP typically rely on observable quantities that co-vary with GPP such as vegetation indices using reflectance measurements (e.g., normalized difference vegetation index, NDVI, near-infrared reflectance of terrestrial vegetation, NIR $_{v}$ , and kernel normalized difference vegetation index, kNDVI). Recent work has also utilized measurements of solar-induced chlorophyll fluorescence (SIF) as a proxy for GPP. However, these SIF measurements are typically coarse resolution, while many processes influencing GPP occur at fine spatial scales. Here, we develop a convolutional neural network (CNN), named SIFnet, that increases the resolution of SIF from the TROPOspheric Monitoring Instrument (TROPOMI) on board the Sentinel-5P satellite by a factor of 10 to a spatial resolution of 500 m. SIFnet utilizes coarse SIF observations together with high-resolution auxiliary data. The auxiliary data used here may carry information related to GPP and SIF. We use training data from non-US regions from April 2018 until March 2021 and evaluate our CNN over the conterminous United States (CONUS). We show that SIFnet is able to increase the resolution of TROPOMI SIF by a factor of 10 with an $r^{2}$ of 0.92 and an RMSE of 0.17 mW m $^{- 2}$ sr $^{- 1}$ nm $^{- 1}$ . We further compare SIFnet against a recently developed downscaling approach and evaluate both methods against independent SIF measurements from Orbiting Carbon Observatory 2 and 3 (together OCO-2/3). SIFnet performs systematically better than the downscaling approach ( $r = 0.78$ for SIFnet, $r = 0.72$ for downscaling), indicating that it is picking up on key features related to SIF and GPP. Examination of the feature importance in the neural network indicates a few key parameters and the spatial regions in which these parameters matter. Namely, the CNN finds low-resolution SIF data to be the most significant parameter with the NIR $_{v}$ vegetation index as the second most important parameter. NIR $_{v}$ consistently outperforms the recently proposed kNDVI vegetation index. Advantages and limitations of SIFnet are investigated and presented through a series of case studies across the United States. SIFnet represents a robust method to infer continuous, high-spatial-resolution SIF data.

Details

Title

A convolutional neural network for spatial downscaling of satellite-based solar-induced chlorophyll fluorescence (SIFnet)

Author

Gensheimer, Johannes¹; Turner, Alexander J²; Köhler, Philipp³; Frankenberg, Christian³; Chen, Jia⁴

¹ Environmental Sensing and Modeling, Technical University of Munich (TUM), Munich, Germany; Department of Biogeochemical Integration, Max Planck Institute for Biogeochemistry, 07745 Jena, Germany
² Department of Atmospheric Sciences, University of Washington, Seattle, WA, USA
³ Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA
⁴ Environmental Sensing and Modeling, Technical University of Munich (TUM), Munich, Germany

Pages

1777-1793

Publication year

2022

Publication date

2022

Publisher

Copernicus GmbH

ISSN

17264170

e-ISSN

17264189

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.5194/bg-19-1777-2022

ProQuest document ID

2645346920
