Introduction
Land-atmosphere CO2 exchange is a pathway through which about 30% of anthropogenic CO2 emission is assimilated to terrestrial ecosystems (Friedlingstein et al., 2023). It is hence, a crucial flux to understand the future evolution of atmospheric CO2 concentration under global warming. Despite the importance, the current model projection of land-atmosphere CO2 exchange is largely uncertain. Climate models with carbon cycle coupled or Earth system models show diverging simulations of the evolution of land-atmosphere CO2 exchanges with different climate change scenarios in terms of magnitude and even the sign (Arora et al., 2020; Friedlingstein et al., 2014; Heimann & Reichstein, 2008), which in turn makes the air temperature projection more uncertain (Bodman et al., 2013; Booth et al., 2012).
Reducing the uncertainty of the projection of land-atmosphere CO2 exchange necessitates better knowledge of the regions driving the CO2 exchange and its variations (Ahlström et al., 2015). In this context, the regional contributions have received growing research interest. Nevertheless, the regional contribution to the seasonality in the global land-atmosphere CO2 exchange has been studied less. It has been found that the seasonal variability in the global gross CO2 uptake by terrestrial vegetation (i.e., gross primary productivity, GPP) is driven by the Northern mid-to-high latitudes (e.g., Chen et al., 2017), and the same conclusion is expected for the terrestrial carbon release (i.e., ecosystem respiration, RECO) due to the strong coupling between carbon uptake and release across biomes (e.g., Migliavacca et al., 2015). For the global net CO2 flux between land and atmosphere (i.e., net ecosystem exchange, NEE), three inversion models and dynamic global vegetation models (DGVMs) agreed that terrestrial ecosystems in Northern Hemisphere drive its seasonal variability (e.g., Piao et al., 2020; Seiler et al., 2022). However, the governing processes behind it remain uncertain due to complex interactions among carbon cycle, hydrology, and climate (e.g., Liu et al., 2020).
Moreover, regarding the interannual variability (IAV) in NEE, studies diverge in their conclusions on the regional contribution analysis (Piao et al., 2020). One atmospheric inversion product identified humid tropics as the dominant region, whereas an ensemble of data-driven models contradicted this, showing stronger IAV from arid regions (Marcolla et al., 2017). Poulter et al. (2014) examined CO2 fluxes estimated using an atmospheric inversion, a dynamic global vegetation model, and a global carbon budget accounting method to show that semi-arid regions in the Southern Hemisphere dominated the IAV in the global net land-atmosphere CO2 exchange. This conclusion was corroborated by other studies (e.g., Ahlström et al., 2015; Zhang et al., 2018). However, some studies emphasize tropical humid regions as significant contributors to the IAV. For instance, Piao et al. (2020) conducted spatial attribution analysis using NEE estimates from various approaches, including atmospheric inversion models, data-driven upscaled estimates, and DGVMs. All the examined methodologies in Piao et al. (2020) agreed that contributions by semi-arid tropics versus humid tropics are not qualitatively different. M. Wang et al. (2022) used a satellite-driven process-based ecosystem model and found the dominance of evergreen broadleaf forests, which prevailed over humid tropics, to the global NEE IAV. Furthermore, indirect evidence underscores the importance of humid tropical regions. Humphrey et al. (2021) analyzed Earth system model simulations and found that the global NEE IAV is primarily driven by land-atmosphere interactions, with their hotspots located in semi-arid and tropical regions, indicating that these regions are also hotspots for the global NEE IAV.
Such spread of conclusions, particularly from studies using DGVMs, stems in part from the complex underlying biogeochemical mechanisms and associated parameter uncertainties (Luo et al., 2015). As model development progresses, the structure becomes more complex with a greater number of parameters (J. B. Fisher et al., 2014; R. A. Fisher & Koven, 2020). However, many of these parameters are determined empirically without a sound theoretical basis or sufficient observational support within the simulation domain (Famiglietti et al., 2020; Feng, 2020; Prentice et al., 2015), leaving parameters and thus the simulation more uncertain. A promising alternative is to leverage observation-based products to inform the model for parameter calibration (Keenan et al., 2012; Niu et al., 2017; Williams et al., 2009). Previous studies demonstrated the efficacy of the model-data integration in simulating water and carbon cycles (e.g., Bloom et al., 2016; Lee et al., 2023; Trautmann et al., 2018, 2022). Levine et al. (2023) applied a model-data integration framework to show that the tropical humid region dominates the IAV of pan-tropical land carbon uptake, showing the potential applicability of model-data integration methods in understanding the global carbon cycle variability.
Here, we diagnose gridwise and regional contributions to seasonal and interannual variability of the global land-atmosphere CO2 exchange. To this end, we develop a process-based diagnostic biogeochemical model of carbon and water cycles, namely the Strategies to INtegrate Data and BiogeochemicAl moDels (SINDBAD) framework. SINDBAD aims at reproducing patterns in various observational data streams of the carbon and water cycles through extensive calibration of a parsimonious model structure. After calibration, we evaluate the performance of the model against observation-based products used for the calibration. Finally, we apply the model estimates of CO2 fluxes for the spatial attribution analysis. We expect that our data-driven process-based model will provide an additional perspective on the spatial contributions to global CO2 variability and improve its understanding. We specifically address the following questions:
-
How well can a parsimonious process-based diagnostic model simulate observed seasonal and interannual variability in gross and net CO2 fluxes compared to state-of-the-art prognostic models?
-
Which regions drive seasonal and interannual variability of the global CO2 flux?
Methods and Data
Figure 1 provides a visual overview of the workflow of this study. We build a simple process-based model structure to simulate CO2 exchanges between land and atmosphere (Section 2.1). The model structure is built on top of a hydrological model by Trautmann et al. (2022), wherein a new set of carbon cycle process formulations is incorporated into the current study. The newly developed model is driven by a set of meteorological variables and is constrained by available observation-based data products (Section 2.2) see Appendix A in Lee et al. (2023). During model calibration, parameters are optimized at the regional scale to find the set that produces the best fit between the simulated and observed carbon and water pools and fluxes (Section 2.3). A spatially and temporally constant value is determined for each parameter, and thus these parameters are global constants. Using the optimized parameter set, a forward run of the model produces a daily time series of pools and fluxes for each 1-degree grid. We evaluate the model performance of carbon fluxes across spatial (i.e., global and regional) and temporal (i.e., seasonal and interannual) scales (Section 2.4). Then, we attribute the variance in the global fluxes to grids and regions using a covariance matrix analysis to understand the spatial pattern of the global CO2 fluxes (Section 2.5).
[IMAGE OMITTED. SEE PDF]
Model Description
SINDBAD is a model-data integration framework (MDIF) that integrates processes of land biogeochemistry with an array of different observational data sets to estimate parameters. Note that SINDBAD is not a specific model but an MDIF which allows for multiple configurations of model structure, observational constraints, and other settings (e.g., Trautmann et al., 2018, 2022). For simplicity, we refer to the specific configuration of SINDBAD used in this study as SINDBAD. SINDBAD is designed for employing parsimonious model structures, where the main processes are represented by relatively simple equations with comparatively few effective parameters. This parsimony improves parameter identifiability, interpretability of simulation results, and computational speed. The CARDAMOM approach (Bloom et al., 2016) follows a similar philosophy but differs from SINDBAD in many ways, including the model structure, the observational constraints, and the model data integration strategy. For example, CARDAMOM performs Bayesian optimization of parameters for each grid cell, while SINDBAD calibrates a set of globally constant parameters. When comparing such MDIFs with the traditional DGVMs (i.e., process-based models without parameter calibration), MDIFs facilitate model calibration as well as a formal scrutinization of the model structure in capturing observed patterns. Hence, their strength is to provide diagnostic assessments of the past and for the modeling domain covered by observational data streams. Effectively, MDIFs facilitate a joint interpretation and explanation for patterns in diverse and heterogeneous data streams. On the other hand, MDIFs risk overfitting to a specific spatiotemporal domain such that extrapolation to for example, future climate conditions are conceptually questionable. This requires either a consideration of the dynamics of parameters in space and time or a modeling strategy rooted more strongly in the philosophy of representing detailed and complete process understanding, or both. The latter strategy is followed by traditional process-based land surface and vegetation models included in TRENDY, which involve complex structures, inherent uncertainties in model design, and a lack of calibration.
The model structure of SINDBAD framework in this study extends the hydrological model by Trautmann et al. (2022) with carbon cycle components, including gross primary productivity (GPP), ecosystem respiration (RECO), and net ecosystem exchange (NEE). Here, we describe the general flow of water (Section 2.1.1) and carbon (Section 2.1.2) processes in the model and associated formulations.
Water Cycle
Figure 2 shows the structure of the model. The general flow of water cycle components is as described by Trautmann et al. (2022). Precipitation is partitioned into solid snowfall and liquid rainfall based on the air temperature threshold. The snowfall can sublimate (Esub) or melt to replenish snow water storage (wSnow). The rainfall and snowmelt flow to the top of the soil, where vegetation controls the partitioning into the surface runoff, interception loss, and percolation, respectively. The incoming percolation (Iin) infiltrates to the upper soil layer (wSoil1). In exceeding the capacity of wSoil1, infiltration excess (Iexc) is generated, which either refills slowly varying water storage (wSlow) or goes out of the system as fast runoff (Qfast). Together with Qfast, slow runoff from wSlow (Qslow) constitutes the total runoff (Q) from the system. Deeper soil layer (wSoil2) exchanges moisture with deep water storage (wDeep) based on the gradient of potential, creating capillary rise (from wDeep to wSoil2) and drainage (from wSoil2 to wDeep). A fraction of wDeep moves to wSlow. Evaporation (E) consists of Eint, Esub, and soil evaporation (Esoil) from wSoil1; evapotranspiration (ET) is the sum of transpiration (T) from both soil layers and E. Terrestrial water storage (TWS) consists of wSnow, wSoil1, wSoil2, wDeep, and wSlow. Note that the formulations for the water cycle components are presented in detail in Trautmann et al. (2022); minor updates in the formulations are described below.
[IMAGE OMITTED. SEE PDF]
The maximum water holding capacity of the first soil layer is estimated via a calibrated parameter, . The second soil layer, is calculated as,
Table 1 Data Sets Used for Meteorological Forcing and Parameter Calibration of SINDBAD
Variable | Product | Usage (temporal form) |
Precipitation | GPCP 1dd v1.3 (Huffman et al., 2001) | Forcing (daily) |
Air temperature | CRUJRA v2.2 (Harris, 2021) | Forcing (daily) |
Net radiation | CERES SYN1degEd4A (Wielicki et al., 1996) | Forcing (daily) |
TWS | GRACE Tellus JPL RL06M v1 with CRI v1 (Wiese et al., 2016) | C&E (monthly) |
SWE | GlobSnow v3 (Luojus et al., 2021) | C&E (monthly) |
ET | FLUXCOM RS Ensemble (Jung et al., 2019) | C&E (ESV) |
Q | G-RUN Ensemble v1 (Ghiggi, Humphrey, Seneviratne, & Gudmundsson, 2021) | C&E (ESV) |
GPP | FLUXCOM RS Ensemble (Jung et al., 2020) | C&E (ESV) |
NEE | OCO-2 v10 MIP (Byrne et al., 2023) | C&E (monthly) |
EVI | MODIS EVI of FluxnetEO (Walther et al., 2022) | Prescribed properties (smoothed) |
RD1 | Maximum rooting depth (Y. Fan et al., 2017) | Prescribed properties (static) |
RD2 | Effective rooting depth (Yang et al., 2016b) | Prescribed properties (static) |
RD3 | Maximum soil water storage capacity (Wang-Erlandsson et al., 2016) | Prescribed properties (static) |
RD4 | Maximum plant-available water capacity (Tian et al., 2019) | Prescribed properties (static) |
SOC | WoSIS snapshot 2019 (Batjes et al., 2020) | Prescribed properties (static) |
Transpiration (T) at time is first calculated as the minimum of supply (Tsup) and demand (Tdem) at time :
The formulation of transpiration as the minimum of supply and demand allows users to analyze the propagation of the effect of supply- and demand-limited conditions on transpiration to other water and carbon components in the model. The inclusion of fAPAR in Equations 3 and 6 accounts for the presence of vegetation and its functions in the grid cell. It is used as a scalar for the calculation of transpiration supply and demand to consider only the part of energy or water relevant for vegetation functions, similar to Trautmann et al. (2022).
Carbon Cycle
The carbon processes start with the calculation of GPP that is coupled with transpiration (T) through water use efficiency (WUE):
To relate transpiration with vegetation productivity, water use efficiency is calculated based on the PRELES model (Peltoniemi, 2012).
The carbon loss through respiratory losses is calculated as the sum of autotrophic and heterotrophic respirations. A part of GPP is lost as autotrophic respiration (RA), remaining net primary productivity (NPP).
Autotrophic respiration depends on GPP, following Migliavacca et al. (2015), as
NPP is added to the vegetation carbon pool (cVeg) as the only influx. cVeg in SINDBAD is a lumped representation of the vegetation carbon, and there is no allocation to different components such as root. cVeg loses carbon only via Litterfall.
Litterfall[t] is quantified by scaling the decrease in smoothed daily EVI as
RHcLit is calculated as
The two respiration terms, RH from the litter carbon pool (RHcLit) and RH from the slow soil carbon pool (RHcSoil), constitute heterotrophic respiration:
Forcing and Observational Data
We used the same forcing variables used in Lee et al. (2023) (Table 1). These included precipitation from GPCP V1DD V1.3 (Huffman et al., 2001), air temperature from CRU JRA V2.2 (Harris, 2021), and net radiation from CERES SYN1deg(Ed4A) (Wielicki et al., 1996). Note that using different precipitation products or TWS products only showed a minor influence on the simulated TWS (Lee et al., 2023). VPD was precalculated using air temperature, specific humidity, and surface air pressure from the ERA5 data set (Muñoz Sabater, 2019).
The observation-based data used for constraining the model parameters include (a) TWS from the Gravity Recovery and Climate Experiment (GRACE) Mascon Ocean, Ice, and Hydrology Equivalent Water Height Release 06 Coastal Resolution Improvement (CRI) Filtered Version 1.0 (Wiese et al., 2016), (b) snow water equivalent (SWE) from the GlobSnow v3 (Luojus et al., 2021), (c) evapotranspiration from the FLUXCOM v1 RS ensemble (Jung et al., 2019), and (d) runoff from the Global RUNoff ENSEMBLE (G-RUN ENSEMBLE) v1 (Ghiggi, Humphrey, Seneviratne, & Gudmundsson, 2021). A spatial gap-filling that assigns zero values in non-snow regions was applied to GlobSnow SWE, following Kraft et al. (2022), to obtain global coverage. The observational constraints for carbon fluxes include FLUXCOM GPP (Jung et al., 2020) and OCO-2 v10 MIP NEE (Byrne et al., 2023). We assume that RECO is indirectly constrained by constraining GPP and NEE. The same constraints for the carbon fluxes are also used for the model evaluation.
External data sets were used as proxies of some variables related to land surface characteristics. Vegetation properties such as the vegetation index and maximum rooting depth were used following Trautmann et al. (2022). The vegetation index, quality-controlled and gap-filled daily MODIS EVI of FluxnetEO (Walther et al., 2022), was used to account for the vegetative control on water cycle processes, such as canopy interception, transpiration demand, and infiltration and runoff generation. It also accounts for the litterfall generation in the carbon cycle, which is used to estimate heterotrophic respiration from the litter carbon pool. The daily EVI was prescribed for the period of 2001–2019 after smoothing to remove noises. Four data products of maximum rooting depth and soil water capacity (Table 1) were used as a proxy of soil water holding capacity. Lastly, soil organic carbon by the World Soil Information Service (WoSIS) snapshot 2019 (Batjes et al., 2020) was used as a proxy of heterotrophic respiration from the soil carbon pool.
For all forcing and constraining variables, we applied a spatial filter as done in Lee et al. (2023) to exclude (a) grid cells with a significant fraction of ice, snow, water body, bare land surface, or artificial land cover or (b) grid cells with a significant anthropogenic impact on the trend in GRACE TWS by, for example, groundwater exploitation. This exclusion of grid cells was to preclude the potential biases during calibration due to processes that SINDBAD does not properly consider (Lee et al., 2023).
The model simulation was at the daily and 1 by 1 resolutions for the period of 2001–2019, while the calibration was done using monthly regional time series of SINDBAD and observational constraints for the period of 2015–2019 to obtain a set of optimal parameters each of which is globally constant (i.e., the same set of optimal parameters is shared among the whole grid cells). Thereby, forcing variables and constraints were aggregated into corresponding spatio-temporal resolutions. During calibration, GPP, ET, and Q time series were used in the form of an expected seasonal variability, as FLUXCOM RS and G-RUN ENSEMBLE report significant biases in reproducing interannual variability (Ghiggi, Humphrey, Seneviratne, & Gudmundsson, 2021; Jung et al., 2020).
Model Calibration
The purpose of the calibration is to find a set parameter value that best represents the global pattern of water and carbon cycles with respect to the constraints. This is achieved by iteratively comparing the model estimates with observational constraints to calculate the total cost (i.e., a measure of fit) and using a search algorithm to adjust parameter values to minimize the total cost. The resulting parameters in SINDBAD framework are spatio-temporally constant (i.e., one value for each parameter for a simulation). This is not only to reduce the computational cost for the optimization but also to capture the emerging patterns at a larger spatial scale as much as possible. It is assumed that the spatial dynamics of state and flux variables can be captured by data sets. We apply the multi-criteria approach, similar to Trautmann et al. (2018, 2022), to consider different aspects of earth observations simultaneously during the parameter calibration. We define a cost function and adjust the parameter values via the covariance matrix evolution strategy (CMA-ES) search algorithm (N. Hansen & Ostermeier, 2001) to minimize the total cost (i.e., the sum of cost components). CMA-ES is a stochastic global parameter search algorithm that is efficient and competitive for a range of optimization problems (N. Hansen et al., 2010). It has been deployed for various fields of study, including robotics (e.g., Hasselmann et al., 2021), hydrology (e.g., Rincón et al., 2023; Trautmann et al., 2018, 2022), and ecology (e.g., Van der Meersch & Chuine, 2023).
To reduce computational cost and reduce overfitting, parameter calibration is performed for a subset of grid cells and years: 904 1 by 1 grid cells (constituting only 8% of the total grid cells) and a 5-year subset (2015–2019) were used instead of the total grid cells (cf. 11,000 grid cells) and period (2001–2019). The grid cells were chosen via stratified sampling from the Köppen-Geiger climate regions (Kottek et al., 2006) to preserve the overall proportion among regions (Trautmann et al., 2022).
The calibration was done at the regional scale, as the native resolution of some constraints, such as OCO-2 NEE and GRACE TWS, is coarser than 1 resolution. The TRANSCOM regions were used for the calibration. Specifically, the time series of sampled 904 grid cells by SINDBAD and constraints were aggregated into TRANSCOM regions using the land-area of each grid cell as the weight. The regional time series were linearly concatenated for each constraint (e.g., a concatenated regional time series of NEE, a concatenated regional time series of TWS, and so on), and the model-data mismatch of the constraint was quantified using Nash Sutcliffe Efficiency (NSE, Nash & Sutcliffe, 1970):
Model Evaluation
Using the optimized parameter set, we ran a global model simulation including all 1 by 1 grid cells (c.a. 11,000 grid cells) in the study domain for the time period from 2001 to 2019. We first compared the simulations with corresponding observation-based products to evaluate the overall fitness of the model. This evaluation was conducted for the TRANSCOM regions, and at the global scale with the same temporal aggregation as calibration. Next, before analyzing their spatial variability, we further evaluated simulated carbon fluxes against constraints (FLUXCOM GPP and OCO-2 NEE) and TRENDY v9 dynamic global vegetation models. These models were used to estimate the global carbon budget for 2021 (Friedlingstein et al., 2022), and were introduced here as the current state-of-the-art models. The S2 experiment was used to exclude the effect of land cover change, which is not considered by SINDBAD. Two metrics were used for the evaluation of carbon fluxes: (a) the coefficient of determination (R2) as the square of the Pearson correlation coefficient to measure the temporal coherence and (b) the relative absolute error (RAE, Equation 22) to measure the bias.
We quantified the seasonality (i.e., expected seasonal variability, ESV) of a time series as follows:
Note that the calculation of ESV is different from the mean seasonal cycle commonly used in previous studies, as the linear trend information is excluded. This was done to exclude the potential effect of the linear trend in the model evaluation and attribution analysis, as SINDBAD in this study was forced with a constant atmospheric CO2 concentration; therefore it lacks the rising CO2 effect on the land-atmosphere CO2 fluxes. We compared ESV and the mean seasonal cycle, and the results presented here are almost identical (Figure A3).
To calculate the interannual variability or expected seasonal variability at the global scale, we averaged the IAV or ESV of regions using their area as the weight. For OCO-2 and TRENDY ensembles, the uncertainty across members was measured using the mean absolute deviation multiplied by 1.25, which corresponds to one standard deviation in a normal distribution (Raju & Srinivasan, 1996). This is more robust against outliers compared to standard deviation (Jung et al., 2011).
Analysis of CO2 Flux Variability
Finally, we analyzed the spatial variability of global CO2 fluxes (GPP, RECO, and NEE) arising from different grid cells and regions. Specifically, we applied the covariance matrix analysis by Lee et al. (2023) to quantify grid cell-wise and regional contributions to the variance in the globally aggregated temporal pattern, which is calculated as
The measure quantifies the total covariances related to the grid cell or region among the whole covariance combinations across grid cells or regions in a relative and normalized term. A positive contribution is assigned when the grid cell increases the global variance. This method can quantify the contribution of a group of grid cells, as is mathematically additive, and the sum of contributions of all grid cells becomes one. Note that is not a measure of model performance or the agreement between observation and simulation for the grid cell or region , as the measure also contains the covariance across grid cells or regions. Rather, the measure provides a view of spatial variability of global water and carbon cycles from each data product. Also note that the covariance matrix analysis quantifies the spatial contribution differently from the method used in Ahlström et al. (2015). Their method assesses the spatial contribution by summing the absolute values of the variable for each timestep, taking the sign into account relative to the globally aggregated time series. On the other hand, the covariance matrix analysis in this study quantifies the contribution of a spatial unit (e.g., region) to the variance of the globally aggregated time series considering the variance of the region and the covariance with other regions in accordance with mathematical laws of variance decomposition.
We calculated the contribution measure for different spatial (i.e., grid cell-wise and regional) and temporal scales (i.e., seasonal and interannual variability). SINDBAD results were used to quantify the grid cell-wise contribution. For the regional calculation, we used the Köppen-Geiger climate zones (Kottek et al., 2006), which are further simplified to fewer classes (see Figure A4), including Tropical humid region and Tropical savanna region (i.e., Tropical semi-arid region). We used the Köppen-Geiger zones and not the TRANSCOM regions because the climatic regions are more reflective of different climate and vegetation characteristics that allow for potential associations with different driving mechanisms of CO2 fluxes. Note that the regional classification used affects the regional attribution results (Zhang et al., 2018) as well as the time scale and the time period.
Results and Discussion
We first evaluate SINDBAD performance against observation-based estimates with the same temporal aggregation as used in the calibration (Section 3.1). Next, we further evaluate the seasonal and interannual variability of GPP and NEE against observations as well as state-of-the-art TRENDY models (Section 3.2). Then, we analyze grid cell-wise and regional contributions to the global CO2 fluxes (Section 3.3). Lastly, we discuss the potential limitations of our study (Section 3.4).
Overall Performance
Table 2 shows the list of parameters with the range and optimized value. These parameters are spatio-temporally constant, so are decided in terms of the global representativeness by the optimizer. Overall, the resulting parameter values are in broad agreement with other studies, albeit with differences due to the mismatch in scales.
Table 2 List of Parameters in SINDBAD
Parameter | Description | Unit | Range | Default | Optimized |
Soil water capacity | |||||
the maximum soil water holding capacity of the first soil layer | 10–100 | 50 | 99.29 | ||
the scaling parameter to obtain the maximum soil water holding capacity of the second soil layer | 1,000–10,000 | 1,000 | 7151.63 | ||
weight for the maximum rooting depth by Y. Fan et al. (2017) | - | 0–1 | 0.25 | 0.85 | |
weight for the effective rooting depth by Yang et al. (2016b) | - | 0–1 | 0.25 | 0.58 | |
weight for the maximum soil water storage capacity by Wang-Erlandsson et al. (2016) | - | 0–1 | 0.25 | 0.71 | |
weight for the plant-available water capacity by Tian et al. (2019) | - | 0–1 | 0.25 | 0.07 | |
maximum plant-available water capacity of the second soil layer for grid cells with missing estimates in Tian et al. (2019) | 0–1000 | 50 | 438.78 | ||
Transpiration | |||||
fraction of the second soil layer available for transpiration | - | 0.01–0.1 | 0.05 | 0.09 | |
vegetation-specific alpha coefficient of the Priestley-Taylor equation | - | 0.2–3 | 1 | 2.48 | |
interception of linear regression to estimate fAPAR from EVI | - | −0.09–0.09 | −0.09 | −0.09 | |
slope of linear regression to estimate fAPAR from EVI | - | 0.8–0.8 | 0.8 | 0.8 | |
Water use efficiency | |||||
WUE at 1 hPa of VPD | 0.1–20 | 9.2 | 4.70 | ||
scalar for the exponential response of WUE to VPD | 0.06–0.7 | 0.4 | 0.25 | ||
the base level CO2 concentration for the CO2 concentration effect on WUE | 300–500 | 380 | 403.64 | ||
the sensitivity scalar for the CO2 concentration effect on WUE | 10–2,000 | 500 | 1938.10 | ||
Autotrophic respiration | |||||
the degree of dependency of RA to GPP | - | 0.01–0.99 | 0.5 | 0.42 | |
the degree of dependency of RA to physiological phenology | - | 0.01–1500 | 300 | 96.87 | |
sensitivity of RA to temperature | 150–500 | 300 | 298.65 | ||
reference temperature for RA | C | 15–15 | 15 | 15 | |
lower reference temperature for RA | C | −46.02–46.02 | −46.02 | −46.02 | |
Litterfall generation | |||||
scalar to derive litterfall from EVI change | 300–3,000 | 1,500 | 575.81 | ||
Heterotrophic respiration | |||||
annual decay rate of cLit | 0.5–148 | 14.8 | 15.79 | ||
sensitivity of RH to temperature | 1–500 | 100 | 71.33 | ||
sensitivity of RH to soil moisture within the soil moisture range lower than | - | 0.01–0.1 | 0.05 | 0.05 | |
sensitivity of RH to soil moisture within the soil moisture range higher than | - | 0.01–1 | 0.5 | 0.79 | |
optimal soil water content for RH | % | 70–95 | 90 | 87.83 | |
interception of linear regression to estimate RHcSoil from SOC | 0.01–2.5 | 1 | 0.22 | ||
slope of linear regression to estimate RHcSoil from SOC | 0.375–0.625 | 0.5 | 0.41 |
The maximum water holding capacity of the second soil layer , which is estimated as a scaling factor times the harmonized spatial pattern of two rooting depths and two soil water capacity data sets (Equation 1), shows a similar spatial pattern with a higher maximum range (Figure A5), compared to the previous results (Figure B16 in Lee et al., 2023) and an independent product by Stocker et al. (2023). The optimal moisture content for heterotrophic respiration of 87.83% is within the range of values reported for heterotrophic respiration from various soil texture classes in other modeling studies (e.g., Moyano et al., 2012; Yan et al., 2018). Compared to Yan et al. (2018), our is comparable with that of sandy loam and sandy clay loam, which prevail in South American and African tropical and semi-arid regions (e.g., Hengl et al., 2017; Ross et al., 2018).
The larger value of than means a stronger sensitivity of autotrophic respiration to temperature variability compared to heterotrophic respiration. However, whether one component has a stronger sensitivity to temperature changes than the other is uncertain, especially at the global scale, as warming experiments usually do not consider above-ground autotrophic respiration and represent only a few species or biomes. X. Wang et al. (2014) analyzed the results of warming experiments across the globe. They found a stronger response of heterotrophic respiration to warming than root respiration, but the effect of drier conditions caused by warming also affected their results. Site or plot-level studies yield varying conclusions. Hartley et al. (2007) conducted a warming experiment in plots of wheat and maize. They found that soil respiration is more sensitive to warming of soil than root respiration is. Schindlbacher et al. (2009) analyzed changes in respiration with a 2-year soil warming experiment in a forest dominated by Norway Spruce. They concluded that microbial respiration increased a bit more than but comparable to root respiration, which showed a strong coupling with the photosynthesis rate of plants. Collectively, there need to be more observations with longer periods to account for multiple determinants of the temperature sensitivity of respiration, such as soil moisture, nutrient, and vegetation function (Rey et al., 2002; Melillo et al., 2011; X. Wang et al., 2014). Nevertheless, our parameter calibration results based on multiple observational products indicate a stronger temperature sensitivity of RA than RH at the global scale and provide a perspective of a model-data integration framework.
We evaluate the overall performance of SINDBAD which is run using the above set of optimized parameter values against observational products used for parameter calibration across TRANSCOM regions and at the global scale. Overall, SINDBAD GPP, NEE, TWS, ET, SWE, and Q broadly agree with corresponding observation-based products–FLUXCOM GPP, OCO-2 NEE, GRACE TWS, FLUXCOM ET, GlobSnow SWE, and G-RUN Q–across the globe and TRANSCOM regions (Figure 3). The wide range of grid cell-wise performance shows the within-region heterogeneity of water and carbon cycles that are not necessarily captured by parameters that are calibrated against regional means. The grid cell-wise errors can also be attributable to inconsistencies of constraints and meteorological forcing as they are not produced in a single framework. Note that the performance for the grid cells in the study area is very similar to that of the 904 grid cells used for the parameter calibration (Figure A6). This means that the spatial and temporal subsets used for the calibration well represent the global pattern of water and carbon fluxes. We also observe that, for regions with low variability and/or magnitude of the signal, the model performance becomes worse at the regional scale than the grid cell scale, for example, ET and GPP in the South American Tropical region (Figure 3d) and GPP in Australia (Figure 3k). See Table A1 for the evaluation of seasonal and interannual variability.
[IMAGE OMITTED. SEE PDF]
Temporal Variations of CO2 Fluxes
Figure 4 compares GPP from SINDBAD and TRENDY models against FLUXCOM observations over the TRANSCOM regions. SINDBAD captures the expected seasonal variability of GPP well, especially at the global scale and across boreal and temperate regions. It explains more than 95% of the variance in GPP ESV for most regions except those without clear seasonality, such as the South American Tropical region (Figure 4d) and Australia (Figure 4k). In such regions, SINDBAD may not perform as well as it does in regions with stronger signals, as the calibration was conducted simultaneously for all regions. SINDBAD also performs well in terms of bias, with RAE 0.35 for most regions. For the Eurasian Temperate region (Figure 4i), where SINDBAD captures the temporal pattern well, SINDBAD underestimates GPP across the year (RAE = 0.29). Compared with the TRENDY model ensemble, SINDBAD shows comparable or better performance in reproducing the observed temporal pattern, underscoring the value of parameter calibration even within a subset of spatial and temporal domains. Additionally, it shows improved seasonality compared to the ensemble median of TRENDY models, especially in the second half of the year in the temperate and boreal regions.
[IMAGE OMITTED. SEE PDF]
For NEE ESV, SINDBAD agrees with OCO-2 MIP well at the global scale and they are in broad agreement across the TRANSCOM regions (Figure 5). When compared to the ensemble median of TRENDY models, SINDBAD shows slightly better and RAE at the global scale, while the comparison results vary by regions. Specifically, SINDBAD generally shows better performance for regions with relatively stronger variability, such as North American Temperate (Figure 5c), Eurasian Boreal (Figure 5h), and Europe (Figure 5l) regions; whereas the ensemble median of TRENDY models perform better for regions with relatively weaker variability such as Eurasian Temperate (Figure 5i), Tropical Asia (Figure 5j), and Australia (Figure 5k).
[IMAGE OMITTED. SEE PDF]
While SINDBAD NEE seasonal variability shows good agreements with OCO-2 across global and regional scales, it also shows notable biases. Specifically, in the North American temperate region (Figure 5c), SINDBAD overestimates NEE ESV in the summer. One possible reason is the effect of the management in croplands. The region includes an intense agricultural area (i.e., the U.S. Corn Belt), which causes a high GPP in the peak growing season (e.g., Guanter et al., 2014). These effects in croplands are not considered by both the FLUXCOM GPP (Jung et al., 2020) and the model; therefore SINDBAD underestimates the peak GPP in the region. The ensemble median of TRENDY models similarly underestimate GPP at the growing season peak, suggesting the lack of relevant processes to account for croplands in the current state-of-the-art.
In the South American temperate (Figure 5e) and Southern Africa (Figure 5g) regions, SINDBAD does not capture the decrease in OCO-2 NEE, which starts around October, whereas TRENDY reasonably simulates the decrease in NEE ESV. The timing of the decrease in NEE ESV is 1–2 months after the start of the increase in GPP ESV, which SINDBAD successfully captures (Figures 4e and 4g). This indicates that the bias in SINDBAD NEE ESV arises from respiration processes, such as respiration pulse with the rewetting at the start of the growing season (i.e., the Birch effect), which is an influential water-carbon interaction to the CO2 dynamics, particularly over drylands (e.g., Barnard et al., 2020; Z. Fan et al., 2015; Metz et al., 2023). Like some TRENDY models that are able to capture the respiratory pulse depict (a component of) RH sensitive to upper soil moisture more than GPP is (Metz et al., 2023), SINDBAD also uses the upper soil moisture to calculate the moisture response of RHcLit. Nevertheless, SINDBAD does not capture the Birch effect. Possible reasons include (a) the less sensitive moisture response function and/or (b) the relatively thicker topsoil in SINDBAD, which has the maximum water holding capacity of 99.29 mm. The TRENDY models that capture the Birch effect have a relatively thinner topsoil. For example, the JSBACH model has the topsoil of 65 mm thickness (Reick et al., 2021) and the CLASSIC model has the topsoil of 100 mm thickness (Melton et al., 2020).
Regarding interannual variability (IAV), SINDBAD NEE IAV reasonably agrees with the constraint at global and regional scales (Figure 6). SINDBAD shows the best performance at the global scale ( = 0.69 and RAE = 0.62), while the performance metrics vary by region ( = 0.05–0.67 and RAE = 0.63–3.46). When compared with the ensemble median of TRENDY models, SINDBAD performs comparably at the global scale, but the comparison results vary by region. As in NEE ESV, SINDBAD performs better in regions with stronger IAV (e.g., Tropical Asia), except for Australia (Figure 6k), where SINDBAD does not capture the strongest land carbon sink in 2016 and 2017, which seems to be dominant for the global signal during the period, estimated by OCO-2 and TRENDY models. In 2016, Australia's higher rainfall and lower air temperature enhanced land carbon uptake by vegetation in arid areas, while the region released more CO2 in 2019 due to drier and hotter conditions (Villalobos et al., 2022). SINDBAD shows a smaller magnitude of variability compared to OCO-2 and/or TRENDY, for example, in South American Tropical (Figure 6d), South American Temperate (Figure 6e), and Southern Africa (Figure 6g) regions. On the other hand, SINDBAD shows a comparable magnitude of the signal as estimated by OCO-2 in boreal and temperate regions with relatively weak variability, such as North American Boreal (Figure 6b), North American temperate (Figure 6c), Eurasian Boreal (Figure 6h), and Eurasian Temperate (Figure 6i) regions, though is low and RAE is high due to low magnitude and variability.
[IMAGE OMITTED. SEE PDF]
SINDBAD shows weaker performance in time periods outside the parameter calibration when compared against IAV from an independent NEE estimate, while the ensemble median of TRENDY models are less affected by the inversion product to be compared with. In the evaluation against Jena CarboScope (Rödenbeck et al., 2018), the ensemble median of TRENDY models perform better than SINDBAD in simulating the global signal and in a number of TRANSCOM regions (Figure 7). The decrease in SINDBAD's performance is particularly notable in Tropical Asia and Australia, where SINDBAD underestimates the magnitude of NEE IAV compared to OCO-2 estimates. Such a degradation of SINDBAD performance compared with another NEE product seems to come from the fact that SINDBAD is a diagnostic model, and that cannot get the large fluctuation in Jena CarboScope NEE IAV. The dependency of SINDBAD on the NEE product used for parameter optimization or the difference between OCO-2 and Jena CarboScope is unlikely the reason for the performance decrease, as another optimization experiment using Jena CarboScope instead of OCO-2 causes only minor differences in NEE estimates (Figure A7). This means that the difference in raw time series of Jena CarboScope and OCO-2 is not significant for the parameter calibration of SINDBAD; the clear difference in their NEE IAV, for example, in 2015–2016 (Figure A8), is not influential for the calibration as its raw temporal form, which is used for the parameter calibration, has much stronger magnitude than its IAV form. The independence of SINDBAD from the NEE constraint during parameter calibration suggests that improving the simulation requires addressing other factors. These may include redefining the cost function to better emphasize interannual variability or enhancing the meteorological forcing data, which appear to impact the simulation of interannual variability in global carbon fluxes (e.g., Jung et al., 2020; Nelson et al., 2024).
[IMAGE OMITTED. SEE PDF]
Overall, SINDBAD reproduces the ESV of GPP and NEE and NEE IAV estimated by reference products used in the parameter calibration, and it performs comparable or better than the ensemble median of TRENDY models. On the other hand, we also find discrepancies in several regions and weaker performance of SINDBAD when evaluated over a longer time period than the one used for parameter calibration. This feature, together with the wide range of grid cell-wise performance and the tendency that SINDBAD matches toward regions with stronger flux signals, can be expected given the way how it is calibrated (i.e., for all regions simultaneously for 5 years). Nevertheless, the reasonable consistency across scales in the CO2 fluxes among SINDBAD, TRENDY models, and observation-based products shows the potential of SINDBAD and its estimates as valuable resources for diagnosing the global carbon cycle dynamics.
Spatial Attribution of the Global CO2 Fluxes
We attribute the variance of globally integrated seasonal and interannual variability of global CO2 fluxes to each grid cell to show the spatial pattern of the contributions (Figure 8). Given the varying performance of SINBAD at the grid-cell level (Figure 3), we only interpret rather clear patterns in Figure 8. We expect that these clear patterns are rather robust, as they are qualitatively consistent with results of the regional attribution analysis (see below). We provide a similar result using the method by Ahlström et al. (2015) for a comparison (Figure A13). For ESV, GPP and RECO have positive contributions over the Northern Hemisphere, and they clearly show a contrasting pattern between the Southern and Northern Hemispheres. However, the inter-hemispheric compensation of GPP and RECO respectively diminishes in the NEE ESV map, leading to a dominating role of the Northern high latitudes for the seasonal cycle of global NEE. In addition to the inter-hemispheric compensation of GPP and RECO, the weakened positive and negative contributions around the equator and in the Southern Hemisphere can also be partly attributable to the compensatory water effect on GPP and RECO (Jung et al., 2017) where the simultaneous increase or decrease of GPP and RECO in response to the moisture condition dampens the NEE variability. Regarding IAV, South America emerges as the dominant contributor for the three carbon fluxes, but within South America, NEE spatial distribution differs from that of RECO and GPP. Large contributions to RECO and GPP IAV are found within the Amazon River basin and semi-arid areas in Eastern Brazil, while contributions to NEE IAV consist of a mixture of wet and dry tropics and semi-arid ecosystems. Besides South America, other remarkable hotspots include the Eastern USA, Eurasia, and sub-Saharan regions.
[IMAGE OMITTED. SEE PDF]
The Northern mid-to-high latitudes positively contribute to both NEE ESV and NEE IAV, implying that GPP and RECO in the region are occasionally decoupled. This region stores a larger amount of soil organic carbon. As SINDBAD estimates the heterotrophic respiration from the soil carbon pool using a map of soil organic carbon, the contribution of heterotrophic respiration from the slow soil carbon pool (RHcSoil) to RECO becomes larger in the Northern high latitudes. The higher RHcSoil contribution to RECO weakens the GPP-RECO coupling strength in the model, thereby allowing the region to contribute more to the variance in the global NEE. The emergence of the Northern mid-to-high latitudes for the NEE ESV and NEE IAV contributions may be SINDBAD-specific, and indeed happens much weaker in the ensemble median of TRENDY models (Figure A9). This occurs because, in SINDBAD, as RA depends on GPP, which leads to a stronger GPP-RECO coupling, and RHcSoil depends on SOC. However, a number of empirical evidence indicates that the GPP-RECO coupling occurs across different biomes (Migliavacca et al., 2011), which supports the rationale of the dependency of RA on GPP in SINDBAD. Also, the relatively large soil heterotrophic respiration fluxes in Northern mid-to-high latitudes are shown in global maps of heterotrophic respiration reported by other studies (e.g., Nissan et al., 2023). This suggests that, in addition to different responses of RECO and GPP to transient temperature and moisture conditions, soil carbon pool dynamics and its effect on the GPP-RECO (de)coupling contribute to the globally emergent NEE IAV.
We further investigated the contributions at a regional scale using Köppen-Geiger regions to provide a better summary of the spatial pattern of contributions. We provide the model evaluation across Köppen-Geiger regions (Figures A10–A12), similar to Figures 4–6, for the direct comparison of model performance and spatial attribution results. We also provide comparable results of regional contributions quantified using the method by Ahlström et al. (2015) for a comparison (Figures A14 and A15 for ESV and IAV, respectively). The varying SINDBAD performance across regions, especially for the IAV, may affect the attribution. However, the attribution results below are qualitatively consistent between ones using observational constraints and ones using SINDBAD (and TRENDY models), which supports the robustness of the results. We observe the consistency in estimates of regional contributions to the variance in the ESV of the global CO2 fluxes among SINDBAD, constraints, and TRENDY models (Figure 9). They agree that the global ESV of the three CO2 fluxes is driven by temperate and boreal regions with a wider spread across TRENDY models, especially in the Boreal region. SINDBAD shows higher contributions by the temperate region for all fluxes compared to constraints and TRENDY models, while it relatively underestimates the contribution by the boreal region. This underestimation is probably caused by the underestimated magnitude of GPP ESV in the boreal region by SINDBAD (Figure A10). While all products consistently show marginal contributions by tropical and arid regions, SINDBAD and TRENDY models quantify slightly higher contributions by the tropical humid region to NEE ESV compared to OCO-2, with a relatively wide range across TRENDY models. We highlight that although TRENDY models, SINDBAD, and OCO-2 agree with the general pattern of regional contributions, the reasons behind are different by regions and data sets. For example, in tropical savanna regions, TRENDY models show strong inter-hemispheric compensation, while SINDBAD and OCO-2 do not (Figure A17). The pattern is different in the temperate regions, where the contribution is dominated by the northern part, identified by TRENDY models, SINDBAD, and OCO-2.
[IMAGE OMITTED. SEE PDF]
The large temperate and boreal regions' contribution to global GPP ESV agrees with other studies (e.g., Chen et al., 2017); NEE ESV contribution results are also consistent with estimates by other top-down inversion frameworks (e.g., Krishnapriya et al., 2022) and by another model-data integration framework (e.g., Quetin et al., 2020). Such a domination of temperate and boreal regions for GPP and RECO ESV is caused by spatial cancellation around the equator (e.g., Chen et al., 2017). We observe strong positive and negative grid cell-wise (Figures 8a and 8b) and regional (Figure A17a and A17b) contributions to GPP and RECO ESV, while only marginal contributions remain in tropical humid and savanna regions (Figures 9a and 9b). On the other hand, the strong contribution by temperate and boreal regions for NEE ESV is the result of the temporal cancellation between carbon uptake and release by lands, specifically the compensatory water effect on GPP and RECO (Jung et al., 2017) that seems to occur more strongly in subtropical regions. The Northern mid-to-high latitudes drive the global NEE ESV (Figure 9). The seasonality of water availability in the Northern mid-to-high latitudes is strongly controlled by snow (Trautmann et al., 2018), and snow also influences the seasonality of carbon fluxes in these regions (Arndt et al., 2020; Yi et al., 2020). Collectively, we speculate that snow may be a key regulator of the global NEE ESV and its coupling to water availability.
For interannual variability of CO2 fluxes, compared to the ESV results, we observe different regions dominating the global variability (Figure 10; see Figure A18 for the same results with the separation of Northern and Southern Hemispheres within each region). For GPP and RECO IAV, SINDBAD estimates the largest contribution from the tropical humid region, while TRENDY models show qualitatively comparable contributions for all regions without a dominant contributor. The regional contributions to the global GPP and RECO IAV are quantified differently not only between SINDBAD and TRENDY models in this study but also from other studies. For example, Chen et al. (2017) quantified regional contributions to global GPP IAV estimates by various approaches, including an ensemble of process-based models, satellite observations, and a data-driven upscaled product, and identified the most contribution consistently among those products in South Africa and Oceania, where SINDBAD identifies only mild-to-weak positive contributions (Figure 8) though it should be noted that there are differences in the methodology between two studies: (a) the target time period (2001–2015 vs. 1971–2010), (b) different ways to calculate IAV, and (c) different ways to calculate regional contributions.
[IMAGE OMITTED. SEE PDF]
For NEE IAV, SINDBAD, TRENDY models, and OCO-2 converge to the finding that the tropical humid region is dominating the global variance. Notably, both OCO-2 and TRENDY show a wide range of estimates across ensemble members. The contribution of the tropical humid region by SINDBAD is much smaller than that of GPP and RECO IAV, probably due to the strong GPP-RECO coupling within the model in this region. The tropical savanna region appears to be the second dominant region, supported by OCO-2 and TRENDY ensemble median. However, SINDBAD quantifies a slightly larger contribution of the temperate region than the tropical savanna, and TRENDY ensemble members show a wide range for the temperate region. For the arid region, OCO-2 shows a comparable contribution to the tropical savanna region, while SINDBAD and TRENDY models show smaller contributions. Both OCO-2 and TRENDY models show a marginal contribution to the boreal region. Most OCO-2 members and their ensemble median show negative contributions, while SINDBAD and most TRENDY models show positive contributions.
On the other hand, regional contributions to NEE IAV quantified using the method by Ahlström et al. (2015) show smaller contributions by tropical regions and larger contributions by arid regions (Figure A15c), compared with our method (Figure 10c). In both methods, tropical humid regions still show significant contributions to the global NEE IAV compared to tropical savanna regions. We note that results using the heuristic index by Ahlström et al. (2015) needs to be interpreted with caution as the method does not account for the effect of covariance in space, which takes a significant role in land-atmosphere CO2 fluxes (Jung et al., 2017). For example, temperate and boreal regions have negative spatial covariance of NEE IAV (Figure A16), implying a potential overestimation of contributions from these regions without considering the covariance (e.g., Figure A15c). The qualitatively distinct results between the two methods underscore the importance of the chosen methodology in spatial attribution studies, highlighting the need for caution in their deployment and further investigation into the methodology.
The emergence of the tropical humid region as a significant mediator of the interannual variability of global land carbon sink is remarkable since semi-arid regions have been regarded so far as the most relevant actors (e.g., Ahlström et al., 2015; Poulter et al., 2014; Zhang et al., 2018). There are studies suggesting that semi-arid regions may not be the only dominant contributor to the global land carbon sink variability. Piao et al. (2020) compared the net land carbon flux estimated by land carbon models, global atmospheric inversions, and FLUXCOM and found that tropical semi-arid and non-semi-arid regions show comparable contributions to the IAV of net land carbon flux. Levine et al. (2023) used a model-data fusion framework and found the dominant role of tropical humid regions on the IAV of tropical net biosphere production (NBP = −NEE–fire) for the 21st century due to water stress that is more attributable to the atmospheric demand than the supply. The importance of humid regions on the IAV of carbon fluxes is also found at the national scale. H. Li et al. (2021) analyzed simulations of an ensemble of terrestrial biosphere models and found the largest contributions (62%) of humid regions to the IAV of net primary productivity from 1982 to 2018 in China. Finally, although we find a strong agreement among SINDBAD, TRENDY models, and OCO-2 on the significant role of the tropical humid region to the global NEE IAV, we also observe a large uncertainty in the region by both ensemble products, which remains significantly large in another attribution method (Figure A15c). Such a wide range of contributions by humid tropical regions also happened in a comparison of GPP IAV among process models, satellite remote sensing observations, and data-driven upscaled product (e.g., Chen et al., 2017), due to fewer observations (e.g., Pastorello et al., 2020), low quality and amount of satellite data due to cloud contamination (Jung et al., 2020), and less understood carbon cycle processes, such as leaf phenology (e.g., Pau et al., 2011; Piao et al., 2019) and vegetation response to environmental conditions (Restrepo-Coupe et al., 2017).
Limitations
The relative importance of each region derived from our regional attribution analysis may be affected by processes that SINDBAD does not simulate, including wildland fires, deforestation, and agriculture. In our analyses, fire C emissions have been excluded, and this may cause the underestimation of contribution from regions with frequent wildfires, such as semi-arid regions. Also, the spatial filtering to get the study domain does not account for heavily deforested or cultivated grid cells. To test the robustness of the conclusions against fire CO2 emissions, deforestation, or cultivation, we conducted the regional attribution analysis of the global CO2 fluxes with (a) including fire CO2 emissions, (b) excluding deforested grid cells, or 3) removing cultivated grid cells (see Appendix B for the details). In all these three cases, our conclusions remain consistent as before (Figures B1–B6). The consistent regional attribution results by SINDBAD, regardless of heavily deforested or cultivated grid cells, can probably be due to two facts: (a) the subset of grid cells used for parameter optimization includes some of the heavily deforested or cultivated grid cells (Figure B9), and (b) the smoothed daily MODIS EVI prescribed to SINDBAD contains this information for example, Biradar and Xiao (2011) and L. Li et al. (2014).
We note that there are uncertain parts that our additional test cannot account for due to current limitations of the state of the art. For example, the effect of wildland fire is tested by using (a) SINDBAD NEE plus GFED fire C emissions, (b) raw OCO-2 NEE without excluding fire C emissions, and (c) TRENDY net biome exchange estimates from the S3 experiment with land use change and fire emissions. SINDBAD NEE plus GFED fire C emissions can consider the direct effect of fire (i.e., CO2 emission) only, leaving the indirect effect, such as post-fire tree mortality, untested. OCO-2 relies on GFED fire C emissions to identify fire effects (Byrne et al., 2023); GFED fire emissions have their own uncertainty, such as estimating emission factors and fuel consumption (van der Werf et al., 2017). The fire modules in TRENDY models have their uncertain parts, such as representing peatland fires and forest mortality processes (Piao et al., 2020).
Finally, a part of uncertainty of the regional attribution analysis of the global IAV of CO2 fluxes can be from the way of parameter calibration. SINDBAD parameters are calibrated for all the TRANSCOM land regions simultaneously using the raw or ESV of constraints. As IAV is a weak signal compared to the raw time series or ESV, the optimized parameters may be less representative for the IAV, although SINDBAD shows a comparably good performance in general in simulating the IAV of CO2 fluxes at the globe and across regions.
Conclusions
We developed a parsimonious process-based model of the land water and carbon cycles, SINDBAD. The parameters were constrained using multiple observation-based products, including GRACE TWS, OCO-2 NEE, and FLUXCOM GPP. SINDBAD, with the optimized set of parameters, reproduced the observed water and carbon patterns reasonably. The simplicity in parameter calibration (e.g., a subset of grid cells and years for parameter calibration and the use of spatiotemporally constant parameters for simulation) showed the significance of model-data integration.
SINDBAD performed comparable to or better than the ensemble of the state-of-the-art dynamic global vegetation models (i.e., TRENDY v9) in simulating seasonal and interannual variability of carbon fluxes at regional and global scales. Despite remaining limitations in the model, the reasonable performance of the model shows its potential in diagnosing patterns of water and carbon cycles.
Using the optimized model as well as TRENDY models and observation-based products, we assessed the relative contributions of land grid cells and climate regions to seasonal and interannual variability of land-atmosphere CO2 fluxes, including GPP, RECO, and NEE. SINDBAD, TRENDY models, and observation-based constraints agreed with the significant role of temperate and boreal regions in the Northern Hemisphere in the seasonal variability of carbon fluxes, which resulted from spatial (GPP and RECO) and temporal (NEE) cancellation. For interannual variability of NEE, all three products quantified a large contribution in tropical humid regions, though TRENDY models and OCO-2 showed large spreads among ensemble members, showing the need for better understandings and more observations of water and carbon cycles in the regions. For the IAV of GPP and RECO, no distinct contributor appeared in the results of either SINDBAD or TRENDY models, but we also lacked a reliable observation-based product to validate the model simulations.
Overall, our study provides an understanding of spatial contributions to the global land-atmosphere CO2 variability at seasonal and interannual scales by leveraging the synergy of process-based models and observation-based products to reduce parameter uncertainties. The significance of tropical humid regions in the global NEE IAV underscores the need to revisit the previous focus on semi-arid regions. Furthermore, our study reveals significant disparities among both TRENDY models and among OCO-2 ensemble members in estimating the covariation of land-atmosphere CO2 in tropical humid regions. Limited knowledge and modeling skill of carbon processes in tropical humid regions may not only impede the accurate diagnosis of global NEE IAV hotspots but also hinder a comprehensive understanding of the coupling between water and carbon cycles.
Appendix A - Additional Table and Figures
Additional table and figures in Appendix A provide further details on some aspects of the study, including regional classification (Figure A4), model formulation (Figure A1), calculations of IAV and ESV (Figures A2 and A3), model evaluation (Table A1, Figures A5–A8 and A10–A12), and spatial attribution analyses (Figures A9 and A13–A18).
Table A1 Coefficient of Determination and Relative Absolute Error (RAE) Between SINDBAD Estimates and Assimilated Constraints Across TRANSCOM Land Regions
Expected seasonal variability (RAE) | Interannual variability (RAE) | ||||||||
GPP | NEE | TWS | ET | SWE | Q | NEE | TWS | SWE | |
Global | 0.99 (0.04) | 0.99 (0.15) | 0.80 (0.45) | 0.99 (0.11) | 0.99 (0.17) | 0.76 (0.15) | 0.69 (0.62) | 0.55 (0.71) | 0.56 (0.85) |
1: North American Boreal | 0.99 (0.14) | 0.98 (0.35) | 0.79 (0.41) | 0.98 (0.19) | 0.99 (0.13) | 0.53 (0.45) | 0.13 (1.02) | 0.12 (0.91) | 0.69 (0.59) |
2: North American Temperate | 0.99 (0.09) | 0.92 (0.30) | 0.82 (0.44) | 0.98 (0.10) | 0.97 (0.26) | 0.83 (0.32) | 0.44 (1.82) | 0.58 (0.73) | 0.85 (0.39) |
3: South American Tropical | 0.01 (0.17) | 0.33 (1.88) | 0.99 (0.36) | 0.00 (0.07) | - (−) | 0.82 (0.13) | 0.32 (0.95) | 0.67 (0.62) | - (−) |
4: South American Temperate | 0.99 (0.13) | 0.71 (0.75) | 0.88 (0.39) | 0.96 (0.26) | - (−) | 0.97 (0.47) | 0.13 (0.91) | 0.33 (0.81) | - (−) |
5: Northern Africa | 0.95 (0.16) | 0.84 (0.41) | 0.95 (0.32) | 0.99 (0.21) | 0.37 (0.91) | 0.93 (0.35) | 0.41 (0.71) | 0.17 (0.93) | 0.04 (0.94) |
6: Southern Africa | 1.0 (0.12) | 0.68 (0.56) | 0.91 (0.29) | 1.00 (0.30) | - (−) | 0.89 (0.31) | 0.28 (0.84) | 0.03 (1.01) | - (−) |
7: Eurasian Boreal | 1.0 (0.05) | 0.93 (0.24) | 0.8 (0.44) | 0.97 (0.17) | 0.97 (0.19) | 0.75 (0.32) | 0.22 (0.9) | 0.34 (0.80) | 0.61 (0.8) |
8: Eurasian Temperate | 0.99 (0.29) | 0.69 (0.88) | 0.89 (0.33) | 0.99 (0.20) | 0.99 (0.19) | 0.79 (0.96) | 0.07 (3.46) | 0.39 (0.81) | 0.80 (0.59) |
9: Tropical Asia | 0.95 (0.05) | 0.27 (0.85) | 0.98 (0.37) | 0.97 (0.07) | - (−) | 0.89 (0.22) | 0.64 (0.63) | 0.38 (0.81) | - (−) |
10: Australia | 0.19 (0.35) | 0.14 (0.96) | 0.93 (0.58) | 0.95 (0.15) | - (−) | 0.83 (1.90) | 0.67 (0.79) | 0.46 (0.88) | - (−) |
11: Europe | 0.99 (0.11) | 0.97 (0.16) | 0.89 (0.40) | 0.98 (0.13) | 0.99 (0.23) | 0.70 (0.33) | 0.05 (2.27) | 0.00 (1.10) | 0.66 (0.79) |
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
Appendix B - Effects of Fire, Deforestation, and Agriculture
We tested the robustness of results of regional attribution analyses (Figures 9 and 10) against fire CO2 emissions, deforestation, and cultivation by additionally conducting the regional attribution analysis with (a) including fire CO2 emissions estimates, (b) excluding heavily deforested grid cells, and (c) excluding heavily cultivated grid cells. In summary, whether or not containing these emissions or grid cells does not affect our conclusions (Figure B1, Figure B2, Figure B3, Figure B4, Figure B5, Figure B6).
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
For the fire CO2 emissions, we added fire CO2 emissions from the Global Fire Emission Database (GFED) version 4 (Giglio et al., 2013) to SINDBAD NEE estimates. Note that this can account for the direct effect of fire emissions only, while it can not test the indirect effects of fire on the carbon cycle such as the post-fire tree mortality. For OCO-2, we used its raw estimates, instead of subtracting GFED fire CO2 emissions. For TRENDY, we used the net biome productivity (NBP) estimates from the S3 experiment that account for emissions from land use change and fire.
For heavily deforested or cultivated grid cells, we used corresponding gridded masks to filter out these grid cells before aggregating into regional CO2 flux time series and quantifying the regional contributions. For the mask of deforested grid cells, we removed heavily deforested grid cells using the global forest cover fraction map by M. C. Hansen et al. (2013). We calculated the forest cover fraction change between 2019 and 2001 (i.e., the study period), and defined grid cells as “heavily deforested” if the fraction change is smaller than 10 percentile (i.e., the top 10% strong negative changes) (Figure B7). Regarding the mask of cultivated grid cells, we used the MODIS Global Land Cover product MCD12Q1 version 6 (Friedl & Sulla-Menashe, 2019) to filter out heavily cultivated grid cells from the regional attribution analysis. We defined the heavily cultivated grid cells as grid cells with more than 50% fraction of cropland (Figure B8). Note that the subset of grid cells used for parameter optimization includes some of the heavily deforested or heavily cultivated grid cells (Figure B9).
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
Acknowledgments
Hoontaek Lee acknowledges support from the Max Planck Institute for Biogeochemistry (MPI-BGC) and the International Max Planck Research School for Global Biogeochemical Cycles (IMPRS-gBGC). Part of this research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. Part of the funding for this study was provided through NASA Carbon Cycle Science (Grant NNH20ZDA001N-CARBON). Javier Pacheco-Labrador was supported by the ESA Living Planet Fellowship IRS4BEF “Integrated Remote Sensing for Biodiversity-Ecosystem Function” (C.N.4000140028/22/I-DT-lr), a program of and funded by the European Space Agency. We thank Dr. Sophia Walther at MPI-BGC for the preparation of the vegetation index data used for the SINDBAD simulation. We thank the Associate Editor and three reviewers for their careful reading and constructive comments on the manuscript. Open Access funding enabled and organized by Projekt DEAL.
Data Availability Statement
SINDBAD simulation results are available at (Lee et al., 2025a). The scripts for processing data and producing figures can be accessed via a public repository on GitHub: (Lee et al., 2025b). TRENDY v9 simulation results are available upon request (Global Carbon Project, 2021). GPCP 1dd v1.3 precipitation is available at (Adler et al., 2020). CRUJRA v2.2 is available at (Harris, 2021). CERES SYN1degEd4A is available at (NASA/LARC/SD/ASDC, 2017). GRACE terrestrial water storage anomalies are available at (Wiese et al., 2023). GlobSnow v3 product is available at (Luojus et al., 2020). G-RUN Ensemble v1 is freely available at (Ghiggi, Humphrey, Gudmundsson, & Seneviratne, 2021). FLUXCOM fluxes (Jung et al., 2019, 2020) were obtained from (FluxCom, n.d.). OCO-2 v10 MIP fluxes (Byrne et al., 2023) are publicly available from (Byrne et al., 2022). Jena CarboScope fluxes (Rödenbeck et al., 2018) are available at (Rödenbeck, n.d.). MODIS EVI of FluxnetEO is freely available at (Walther et al., 2021). The maximum rooting depth by Y. Fan et al. (2017) is available upon request. The effective rooting depth by Yang et al. (2016b) is freely available from the CSIRO Data Access Portal (Yang et al., 2016a, ). The maximum soil water storage capacity by Wang-Erlandsson et al. (2016) is publicly available. The maximum plant-available water capacity by (Tian et al., 2019) is available upon request. The WoSIS snapshot 2019 is freely available at (Batjes et al., 2019). GFED v4 burned area is available at (Randerson et al., 2017). The Hansen global forest cover map (M. C. Hansen et al., 2013) was obtained from (Global Forest Change, n.d.). MODIS Global Land Cover product MCD12Q1 v6 will be decommissioned soon, but v6.1 is available at (Friedl & Sulla-Menashe, 2019).
Adler, R., Wang, J.‐J., Sapiano, M., Huffman, G., Bolvin, D., Nelkin, E., & NOAA CDR Program. (2020).
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The spatial contribution to the global land‐atmosphere carbon dioxide (CO2) exchange is crucial in understanding and projecting the global carbon cycle, yet different studies diverge on the dominant regions. Informing land models with observational data is a promising way to reduce the parameter and structural uncertainties and advance our understanding. Here, we develop a parsimonious diagnostic process‐based model of land carbon cycles, constraining parameters with observation‐based products. We compare CO2 flux estimates from our model with observational constraints and Trends in Net Land‐Atmosphere Carbon Exchange (TRENDY) model ensemble to show that our model reasonably reproduces the seasonality of net ecosystem exchange (NEE) and gross primary productivity (GPP) and interannual variability (IAV) of NEE. Finally, we use the developed model, TRENDY models, and observational constraints to attribute variability in global NEE and GPP to regional variability. The attribution analysis confirms the dominance of Northern temperate and boreal regions in the seasonality of CO2 fluxes. Regarding NEE IAV, we identify a significant contribution from tropical savanna regions as previously perceived. Furthermore, we highlight that tropical humid regions are also identified as at least equally relevant contributors as semi‐arid regions. At the same time, the largest uncertainty among ensemble members of NEE constraint and TRENDY models in the tropical humid regions underscore the necessity of better process understanding and more observations in these regions. Overall, our study identifies tropical humid regions as key regions for global land‐atmosphere CO2 exchanges and the inter‐model spread of its modeling.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details







1 Max Planck Institute for Biogeochemistry, Jena, Germany, Technische Universität Dresden, Institute of Photogrammetry and Remote Sensing, Dresden, Germany
2 Max Planck Institute for Biogeochemistry, Jena, Germany
3 Max Planck Institute for Biogeochemistry, Jena, Germany, Departamento de Ciências e Engenharia do Ambiente, Faculdade de Ciências e Tecnologia, Universidade Nova Lisboa, Costa da Caparica, Portugal, ELLIS Unit Jena, Michael Stifel Center Jena for Data‐Driven and Simulation Science, Jena, Germany
4 Max Planck Institute for Biogeochemistry, Jena, Germany, ELLIS Unit Jena, Michael Stifel Center Jena for Data‐Driven and Simulation Science, Jena, Germany
5 Technische Universität Dresden, Institute of Photogrammetry and Remote Sensing, Dresden, Germany
6 Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA
7 Environmental Remote Sensing and Spectroscopy Laboratory (SpecLab), Spanish National Research Council (CSIC), Madrid, Spain