Accurate predictions of water and carbon fluxes in croplands are essential for determining crop yield, hydrologic components, and irrigation schedules and researching climate change (He, Xu, Xia, et al., 2020; Lokupitiya et al., 2016; Williams, Gornall, et al., 2016; Xu, Guo, et al., 2018). Process-based land surface models (LSMs) have been developed to represent the complex land/hydrological interactions in earth systems (Chen & Dudhia, 2001; Dai et al., 2003; Lawrence et al., 2019; McDermid et al., 2017). In particular, LSMs have been enhanced by incorporating crop-growth and agriculture-management models to capture photosynthesis, plant growth, and crop yield processes in the Simple Biosphere Model (Lokupitiya et al., 2009); the Joint UK Land Environment Simulator (Osborne et al., 2015); the Community Land Model (Drewniak et al., 2012); and the Noah with multiparameterization model, Noah-MP-Crop model (X. Liu et al., 2016, 2020). The Noah-MP-Crop model was implemented in the High-Resolution Land Data Assimilation System (F. Chen et al., 2007) and Weather Research and Forecasting model (Skamarock et al., 2007). It improves the simulations of vegetation dynamics and surface heat fluxes for corn and soybean at both field and regional scales by using a number of agricultural management data, such as growing degree days (GDDs) and planting/harvesting dates (X. Liu et al., 2016, 2020; Xu, Chen, et al., 2019; Z. Zhang et al., 2020). However, uncertainties in input data, model structures, and model parameters inevitably induce uncertainties in the variables simulated by those process-based models (X. Li et al., 2007; T. Xu et al., 2011; Zhang, Chen, & Gan, 2016; Zhang, Guanter, et al., 2016; Zhang, Xiao, et al., 2016).
Data assimilation (DA) methods originated from estimation theory and cybernetics to merge observational information into process-based models to mitigate uncertainties in model variables and optimize model parameters (X. Li et al., 2020; Liang & Qin, 2008; Xia et al., 2019). This is done within variational-based or ensemble-based DA schemes to improve the model performances (He, Xu, et al., 2019; He, Xu, Bateni, et al., 2020; He et al., 2018; Lu et al., 2016, 2017, 2020; Margulis et al., 2002; Xu, Bateni, et al., 2018; Xu, Chen, et al., 2019; Xu, He, et al., 2019; T. Xu et al., 2011, 2015). Studies have assimilated various observational variables such as land surface temperature (LST), leaf area index (LAI), soil moisture (SM), and solar-induced chlorophyll fluorescence (SIF) into crop models and/or LSMs, which has improved the estimated crop yields (Ines et al., 2013; X. Li et al., 2018; Wang et al., 2014; Xie et al., 2017), vegetation biomass, evapotranspiration (ET), and gross primary production (GPP) within LSMs (Huang et al., 2008; Kumar et al., 2019; Xu, He, et al., 2019; T. Xu et al., 2015). The assimilation of SM observations improved the estimation of SM, runoff, ET (Han et al., 2012; D. Liu & Mishra, 2017; Lu et al., 2016, 2020; Tian et al., 2017), and drought monitoring and forecasting (Kumar et al., 2014; Yan et al., 2017). Remotely sensed SIF, as an indicator of photochemical processes, provides additional constraints for photosynthesis simulation (Camino et al., 2019; Y. Zhang et al., 2014; Zhang, Xiao, et al., 2016). Recently, a linear relationship between SIF and GPP was used to predict GPP under different environmental conditions and vegetation types (Frankenberg & Berry, 2018; He, Chen, et al., 2019; He, Magney, et al., 2020; Sun et al., 2018; Verma et al., 2017; Zhang, Xiao, et al., 2016).
However, these efforts only assimilated a single variable and/or constructed a one- or dual-pass DA framework within process-based crop models (Kumar et al., 2019; Norton et al., 2018; Pinnington et al., 2018; Tian et al., 2017; K. Yang et al., 2007). Therefore, this study proposes a multipass land data assimilation scheme (MLDAS), based on the Noah-MP-Crop model and the ensemble Kalman filter (EnKF) method, in which LAI, SM, and SIF observations are simultaneously assimilated to predict sensible heat flux (H), latent heat flux (LE), GPP, and biomass. The developed MLDAS is tested at two AmeriFlux crop sites, namely, Mead (Nebraska, USA) and Bondville (Illinois, USA). The observed H, LE, and GPP from the eddy covariance (EC) flux towers are used to evaluate the MLDAS performance. The results of EnKF are compared with those of the ensemble open loop (EnOL) approach (i.e., no assimilation of LAI, SM, and SIF measurements). Finally, the influence of assimilating LAI, SM, and SIF on the H, LE, and GPP estimates within the MLDAS is assessed.
Methodology
Noah-MP-Crop Model
The Noah-MP LSM is an enhanced Noah LSM (F. Chen & Dudhia, 2001; F. Chen et al., 1996; Ek et al., 2003) with improved physics and expanded multiparameterization options (Niu et al., 2011; Z. L. Yang et al., 2011). The improved physics options include a two-source energy balance module, a modified two-stream radiation transfer scheme, a short-term dynamic vegetation growth model, a dynamic groundwater component, and a multilayer snowpack. The multiparameterization options provide users with multiple choices of parameterizations in terms of leaf dynamics, surface-layer turbulence treatment, canopy stomatal resistance, SM limiting factors for stomatal resistance, and runoff and groundwater.
The Noah-MP LSM was recently enhanced with agricultural management models such as crop-growth models (X. Liu et al., 2016, 2020) and a dynamic irrigation model (Xu, Chen, et al., 2019; Z. Zhang et al., 2020). It simulates the production of carbohydrates and allocates them to different crop components, such as leaves, stems, grains, and roots. GDDs are utilized in Noah-MP-Crop to determine the different plant growth stages from planting to harvest. In addition, numerous agricultural management data (e.g., planting and harvesting dates, fraction of cultivated lands, and fraction of irrigation) are used in Noah-MP-Crop to constrain the agricultural models. Noah-MP-Crop simulates the dynamic evolution of various biomasses (e.g., leaf, stem, grain, and root mass) and time-varying LAI for corn and soybean, which affect photosynthesis, surface net radiation, ET, SM, and surface heat fluxes (H and LE) (Kumar et al., 2019; X. Liu et al., 2016). This model also estimates grain biomass as the “yield” when the crop reaches physiological maturity. The values of the key parameters in Noah-MP-Crop used in this study are obtained from X. Liu et al. (2016, 2020) and Z. Zhang et al. (2020).
Multipass Land Data Assimilation Scheme
Data Assimilation Strategy
The MLDAS is based on the Noah-MP-Crop model and the EnKF method and dynamically estimates model states and parameters independently by assimilating multisource remotely sensed or ground-measured observations in multiple passes. In this study, LAI, SM, and SIF are simultaneously assimilated in the MLDAS to improve the performance of Noah-MP-Crop.
Specific leaf area (SLA), defined as the ratio of leaf area to leaf dry mass (Garnier et al., 2001; Gunn et al., 1999), is used to convert leaf mass into LAI in Noah-MP-Crop and represents leaf functioning and plant productivity, ecological behavior, and growth rate (Cornelissen et al., 2003; Wilson et al., 1999). In this study, SLA is equivalent to the BIO2LAI parameter used in X. Liu et al. (2016). Plants with high SLA values rapidly produce biomass and have high growth rates, while plants with low SLA values efficiently preserve nutrients and have lower photosynthesis and growth rates (Wright & Westoby, 1999). SLA is set to a constant (0.015 for corn) in the Noah-MP-Crop model, but it may vary throughout the growing season and from year to year (Bryant, 2011; Rinaldi, 2003; Zhang, Suyker, et al., 2018; Z. Zhang et al., 2020). Therefore, it is necessary to consider a time-varying SLA within the growing season. This study optimizes SLA and updates leaf biomass (LFMASS) simultaneously by assimilating the observed LAI. In fact, three data assimilation strategies are used for LAI assimilation here: Strategy I: only update LFMASS, Strategy II: only optimize SLA, and Strategy III: update both LFMASS and SLA. Moreover, surface SM observations (0–0.1 m) are assimilated into the Noah-MP-Crop model to simultaneously update the four-layer (0–0.1, 0.1–0.4, 0.4–1, and 1–2 m) SM.
The maximum rate of carboxylation (Vcmax) is a critical parameter in the Noah-MP-Crop photosynthesis model because it controls the carbon assimilation process (Farquhar et al., 1980; Y. Zhang et al., 2014). Uncertainty in Vcmax often causes large spreads in GPP estimates across models (Bonan et al., 2011; Hu et al., 2014). In this study, the SIF observations are assimilated to optimize Vcmax and constrain the photochemical process.
Overall, the LAI, SM, and SIF data are simultaneously assimilated into Noah-MP-Crop to constrain the model state variables (i.e., LFMASS and SM) and two critical parameters (i.e., SLA and Vcmax) using MLDAS with four passes and four parallel EnKFs as follows:
Pass I: optimize SLA with the assimilation of LAI observations;
Pass II: update LFMASS with the assimilation of LAI observations;
Pass III: update four-layer SM with the assimilation of surface SM observations;
Pass IV: optimize Vcmax with the assimilation of SIF observations.
The flowchart of MLDAS is shown in Figure 1.
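The four passes listed above can be summarized as a small lookup table. The following is only a minimal sketch of that bookkeeping, not the MLDAS code itself: the names `Pass`, `PASSES`, and `active_passes` are hypothetical and are introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pass:
    name: str         # pass label used in the text
    observation: str  # observation type that is assimilated
    target: str       # model state or parameter that is updated
    kind: str         # "parameter" or "state"


# The four passes as listed in the text (hypothetical container).
PASSES = (
    Pass("I", "LAI", "SLA", "parameter"),
    Pass("II", "LAI", "LFMASS", "state"),
    Pass("III", "SM", "four-layer SM", "state"),
    Pass("IV", "SIF", "Vcmax", "parameter"),
)


def active_passes(available):
    """Return the passes whose observation type is available at an update time."""
    return [p for p in PASSES if p.observation in available]


if __name__ == "__main__":
    # Example: a day on which only LAI and SM observations exist.
    for p in active_passes({"LAI", "SM"}):
        print(f"Pass {p.name}: assimilate {p.observation} -> {p.target} ({p.kind})")
```

Keeping the passes as data makes it explicit that one observation type (LAI) drives two separate passes, one for a parameter and one for a state.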
Ensemble Kalman Filter Method
The EnKF method proposed by Anderson (2001) is integrated with Noah-MP-Crop and calculates the model states and parameter error covariance based on the Monte Carlo method, as used in many land data assimilation scheme (LDAS) studies (Koster et al., 2018; Moradkhani et al., 2005; Reichle et al., 2002; T. Xu et al., 2015). In the MLDAS framework, the model state variables (i.e., LFMASS and SM) and two critical parameters (i.e., SLA and Vcmax) are updated separately by using the EnKF method. T is considered the state vector of the DA scheme used in the four passes of EnKF, representing LFMASS, SLA, SM, and Vcmax. At the beginning of the EnKF method, the initial value ($T_0$) is used to create the state vector ensembles by adding normally distributed random errors (with zero mean and specified variance):
$$T_{j,0} = T_0 + \omega_j \tag{1}$$
where $T_{j,0}$ represents the jth ensemble member of a state variable at the initial time, and $\omega_j$ represents the background error, which is assumed to have a normal distribution.
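The ensemble initialization described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the function name `init_ensemble` and the initial leaf biomass of 50 g m−2 are hypothetical, while the spread of 10 g m−2 and the ensemble size of 30 are the values quoted in this section.

```python
import random


def init_ensemble(t0, sigma, n_members=30, seed=None):
    """Create n_members perturbed copies of the initial value t0.

    Each member is t0 plus a zero-mean Gaussian error with standard
    deviation sigma (the background error).
    """
    rng = random.Random(seed)
    return [t0 + rng.gauss(0.0, sigma) for _ in range(n_members)]


if __name__ == "__main__":
    # Hypothetical initial LFMASS of 50 g m-2 (assumption for illustration).
    members = init_ensemble(t0=50.0, sigma=10.0, n_members=30, seed=42)
    print(len(members), sum(members) / len(members))
```

An unbounded Gaussian perturbation can produce nonphysical values (e.g., negative biomass), so a practical implementation would presumably clip members to a physically reasonable range.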
In the forecast step, the state vector is predicted according to:
$$T^{f}_{j,k+1} = M\left(T^{a}_{j,k}, F_{k+1}, \alpha_{k+1}\right) + \mu_j \tag{2}$$
where $T^{f}_{j,k+1}$ is the forecasted state vector of the jth member at time k+1 and $T^{a}_{j,k}$ is the state vector of the same member at time k. M(−) represents the model operator (Noah-MP-Crop in this study), and $F_{k+1}$ and $\alpha_{k+1}$ represent the model forcing data and parameters at time k+1, respectively. $\mu_j$ is the model error with a zero mean and specified variance.
When measurements of LAI, SM, and SIF are available, the observation operator is described as:
$$\hat{Y}_{j,k+1} = H\left(T^{f}_{j,k+1}\right) + \eta_j \tag{3}$$
where $\hat{Y}_{j,k+1}$ is the model-predicted LAI, SM, or SIF of the jth member and $T^{f}_{j,k+1}$ is the forecasted state vector of that member at time k+1. H(−) represents the observation operator. For LAI assimilation, the relationship between LAI and LFMASS (LAI = SLA × LFMASS) is used as the observation operator. For SIF assimilation, the linear relationship between SIF and GPP (SIF = a × GPP + b) is used as the observation operator, where a and b are calibration parameters. $\eta_j$ is the observation error of the jth member, with a mean of zero and a covariance of R.
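The two algebraic observation operators named above are simple enough to write directly. The sketch below is illustrative only: the function names are hypothetical, SLA = 0.015 is the corn constant quoted earlier, and the leaf biomass, GPP, a, and b values are made-up inputs.

```python
def lai_from_leaf_mass(sla, lfmass):
    """LAI (m2 m-2) from specific leaf area (m2 g-1) and leaf biomass (g m-2)."""
    return sla * lfmass


def sif_from_gpp(gpp, a, b):
    """SIF predicted from GPP with the linear relation SIF = a * GPP + b."""
    return a * gpp + b


if __name__ == "__main__":
    # Leaf biomass, GPP, a, and b are hypothetical values for illustration.
    print(lai_from_leaf_mass(sla=0.015, lfmass=200.0))
    print(sif_from_gpp(gpp=20.0, a=0.05, b=0.1))
```

Because LAI is the product of SLA and LFMASS, the same LAI observation can constrain either factor, which is why the scheme assigns them to separate passes.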
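The state variable of each member is updated as follows:
$$T^{a}_{j,k+1} = T^{f}_{j,k+1} + K_{k+1}\left[Y_{k+1} - \hat{Y}_{j,k+1}\right] \tag{4}$$
$$K_{k+1} = P^{f}_{k+1} H^{T} \left[H P^{f}_{k+1} H^{T} + R\right]^{-1} \tag{5}$$
$$P^{f}_{k+1} = \frac{1}{N_e - 1} \sum_{j=1}^{N_e} \left[T^{f}_{j,k+1} - \overline{T}^{f}_{k+1}\right]\left[T^{f}_{j,k+1} - \overline{T}^{f}_{k+1}\right]^{T} \tag{6}$$
$$\overline{T}^{f}_{k+1} = \frac{1}{N_e} \sum_{j=1}^{N_e} T^{f}_{j,k+1} \tag{7}$$
where $T^{a}_{j,k+1}$ and $T^{f}_{j,k+1}$ are the analyzed and forecasted state variables of the jth member at time k+1, respectively, and $\hat{Y}_{j,k+1}$ is the model-predicted observation of the jth member (Equation 3). $K_{k+1}$ is the Kalman gain matrix, and R is the observation error covariance. $Y_{k+1}$ is the LAI, SM, and SIF observations at time k+1, which are used in the EnKF system to update the model states and parameters. $P^{f}_{k+1}$ is the error covariance matrix of the forecast model states (LFMASS and SM) and parameters (SLA and Vcmax), and $\overline{T}^{f}_{k+1}$ represents the mean of the forecasted state variables at time k+1. $[\cdot]^{T}$ represents the transposed matrix, and $N_e$ is the number of ensemble members. In this study, the ensemble size for the MLDAS run is set to 30.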
In MLDAS, the model state variables and parameters are updated separately using Equations 4–7. In Passes I, II, III, and IV, the state variable T in Equation 1 represents SLA, LFMASS, four-layer SM, and Vcmax, respectively.
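The analysis step can be illustrated for a single scalar observation, such as surface SM updating the four-layer SM profile. The sketch below assumes NumPy and uses the common ensemble approximation in which the products of the forecast covariance with the observation operator are estimated from the sample covariance between the ensemble states and the ensemble-predicted observations; this is one standard way to evaluate the gain, not necessarily the authors' code. The function name `enkf_update` and all numerical inputs in the demo are hypothetical.

```python
import numpy as np


def enkf_update(t_f, y_pred, y_obs, obs_std, rng):
    """EnKF analysis for one scalar observation.

    t_f     : (n_state, n_e) forecast ensemble (1-D input is treated as one state)
    y_pred  : (n_e,) observation predicted by each member, without noise
    y_obs   : scalar observation
    obs_std : observation error standard deviation (square root of R)
    """
    t_f = np.atleast_2d(np.asarray(t_f, dtype=float))
    y_pred = np.asarray(y_pred, dtype=float)
    n_e = t_f.shape[1]
    # Perturbed predicted observation of each member.
    y_hat = y_pred + rng.normal(0.0, obs_std, size=n_e)
    # Ensemble anomalies about the forecast mean.
    t_anom = t_f - t_f.mean(axis=1, keepdims=True)
    y_anom = y_pred - y_pred.mean()
    cov_ty = t_anom @ y_anom / (n_e - 1)  # sample estimate of P H^T
    var_y = y_anom @ y_anom / (n_e - 1)   # sample estimate of H P H^T
    gain = cov_ty / (var_y + obs_std ** 2)
    # Member-wise update with the innovation (observation minus prediction).
    return t_f + np.outer(gain, y_obs - y_hat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical four-layer SM forecast (4 layers x 30 members); a shared
    # error pattern makes the deeper layers co-vary with the surface layer.
    base = np.array([[0.20], [0.22], [0.25], [0.28]])
    std = np.array([[0.03], [0.03], [0.025], [0.02]])
    sm_f = base + std * rng.normal(0.0, 1.0, size=(1, 30))
    sm_a = enkf_update(sm_f, y_pred=sm_f[0], y_obs=0.30, obs_std=0.05, rng=rng)
    print(sm_f.mean(axis=1), sm_a.mean(axis=1))
```

Only the surface layer is observed, yet all four layers move, because the gain for each layer is proportional to its ensemble covariance with the predicted surface value.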
To apply the EnKF technique, the model ensembles should be perturbed by adding normally distributed random errors to the model states. Random errors within a physically reasonable range reflect the uncertainty in observations, background conditions, and propagated states (Bateni & Entekhabi, 2012). In this study, the model ensembles are obtained by adding normally distributed random errors to the model states (LFMASS and SM) of Passes II and III and the model parameters (SLA and Vcmax) of Passes I and IV in Equations 1 and 2. Additive perturbations with a standard deviation of 10 g m−2 are applied to the modeled LFMASS (Kumar et al., 2019). The standard deviations of the four-layer SM (0–0.1, 0.1–0.4, 0.4–1, and 1–2 m) are 0.03, 0.03, 0.025, and 0.02 m3 m−3, respectively (T. Xu et al., 2015). For SLA and Vcmax, the model parameter standard deviations are set to 10% of the range of the value (T. Xu et al., 2011, 2015). For the observation errors, standard deviations of 0.1, 0.05 m3 m−3, and 0.15 W m−2 nm−1 sr−1 are used as empirical values for the LAI, SM, and SIF observations, respectively. These standard deviation values are comparable to those used in prior LAI, SM, and SIF DA studies (Kumar et al., 2019; Norton et al., 2018; Sabater et al., 2008; T. Xu et al., 2011, 2015; K. Yang et al., 2007; Zhang, Joiner, et al., 2018). More details about the MLDAS are listed in Table 1.
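The error settings above can be collected in one place, as in the following sketch. The dictionary names and the helper `parameter_std` are hypothetical; reading "10% of the range" as 0.10 × (upper − lower) is an assumption, and the Vcmax bounds in the demo are used only as an illustrative range.

```python
# Standard deviations quoted in the text (container names are hypothetical).
STATE_STD = {
    "LFMASS": 10.0,                   # g m-2
    "SM": (0.03, 0.03, 0.025, 0.02),  # m3 m-3, layers 1 to 4
}
OBS_STD = {
    "LAI": 0.1,   # unit not stated in the text
    "SM": 0.05,   # m3 m-3
    "SIF": 0.15,  # W m-2 nm-1 sr-1
}


def parameter_std(lower, upper, fraction=0.10):
    """Perturbation standard deviation as a fraction of a parameter range.

    Assumes "range" means (upper - lower); this reading is an assumption.
    """
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    return fraction * (upper - lower)


if __name__ == "__main__":
    # Illustrative Vcmax bounds in umol m-2 s-1 (assumption for the demo).
    print(parameter_std(30.0, 150.0))
```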
Table 1 Data Assimilation Strategies Discussed in This Study
Data assimilation experiment | Assimilated data | Updated model variables/parameters |
EnOL | No data are assimilated | None |
EnKF-LAI-I | Leaf area index (LAI) | Optimize specific leaf area (SLA) |
EnKF-LAI-II | LAI | Update leaf biomass |
EnKF-LAI-III | LAI | Update both SLA and leaf biomass |
EnKF-SM | Surface soil moisture | Update four-layer soil moisture at daily time scale |
EnKF-SIF | Solar-induced chlorophyll fluorescence (SIF) | Optimize Vcmax |
EnKF-ALL | LAI, surface soil moisture, SIF | Optimize SLA and Vcmax; update leaf biomass and four-layer soil moisture |
Abbreviations: EnKF, ensemble Kalman filter; EnOL, ensemble open loop; LAI, leaf area index; SM, soil moisture; Vcmax, maximum rate of carboxylation; SLA, specific leaf area; SIF, Solar-induced chlorophyll fluorescence.
Experimental Sites and Data Set
The developed MLDAS is tested at the Bondville (US-Bo1) and Mead (US-Ne1) sites. The Mead (US-Ne1, Nebraska) site is covered by corn and irrigated with a center-pivot system. Bondville (US-Bo1, Illinois) is a rainfed site with an annual rotation between corn and soybean. The simulation period is 2003 to 2005 for the Mead site and the corn years (2001, 2003, and 2005) for the Bondville site.
Half-hourly (Bondville) and hourly (Mead) meteorological forcing data, such as incoming shortwave and longwave radiation, wind speed, precipitation, air pressure, temperature, and humidity, are used to drive the Noah-MP-Crop model (
The simulations are conducted for five months (May 1 to September 30, i.e., DOY 121–273) covering the corn growing season. The LAI observations are assimilated into the Noah-MP-Crop when they are available. The daily averaged (00:00–24:00 local time) SM observations are assimilated every day, and the SIF observations are assimilated every 4 days. All these observations are assimilated into the Noah-MP-Crop model at 00:00 (local time); in addition, the model state variables (LFMASS and four-layer SM) are updated and model parameters (SLA and Vcmax) are constrained. To evaluate the performance of MLDAS, mean absolute error (MAE), root mean square deviation (RMSD), and correlation coefficient (R) metrics are used in this study.
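The three evaluation metrics follow their standard definitions and can be computed as in the sketch below. The function names and the sample series are hypothetical; the example numbers are not data from this study.

```python
import math


def mae(est, obs):
    """Mean absolute error between estimates and observations."""
    return sum(abs(e - o) for e, o in zip(est, obs)) / len(obs)


def rmsd(est, obs):
    """Root mean square deviation between estimates and observations."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs))


def pearson_r(est, obs):
    """Pearson correlation coefficient between estimates and observations."""
    n = len(obs)
    mean_e = sum(est) / n
    mean_o = sum(obs) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(est, obs))
    var_e = sum((e - mean_e) ** 2 for e in est)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_e * var_o)


if __name__ == "__main__":
    # Hypothetical daily LAI estimates and observations (m2 m-2).
    est = [1.2, 2.5, 3.9, 4.4]
    obs = [1.0, 2.8, 4.1, 4.0]
    print(mae(est, obs), rmsd(est, obs), pearson_r(est, obs))
```

MAE and RMSD measure the magnitude of the error (RMSD weights large errors more heavily), while R measures only how well the temporal pattern is reproduced, so a biased simulation can still have a high R.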
Results and Discussion
Assimilation of LAI Observations
Figure 2 compares observations with the LAI estimates from the three assimilation strategies and the open loop simulation. EnOL has the largest RMSD, while the three LAI assimilation strategies (EnKF-LAI-I, EnKF-LAI-II, and EnKF-LAI-III) have much lower RMSDs. Among the three LAI assimilation strategies, EnKF-LAI-III performs best because it both updates LFMASS and optimizes SLA with the information contained in the LAI observations and thus retrieves the statistically optimal values for LFMASS and LAI. Moreover, the LAI estimates from EnKF-LAI-III have higher R and lower RMSD values than those of EnOL, EnKF-LAI-I, and EnKF-LAI-II at Mead and Bondville. The RMSDs of the EnOL LAI retrievals at the Mead and Bondville sites are 0.73 and 0.62 m2 m−2, respectively. EnKF-LAI-III reduces the RMSD by 46.58% and 43.55% for the Mead and Bondville sites, respectively.
Figure 2. Scatter plots of the estimated daily leaf area index (LAI) values versus the observations (first row) and the statistical metrics of the estimated daily LAI (second row) with the three LAI assimilation strategies.
As shown in Figure 3, EnOL significantly underestimated the LAI during the growing season at the Mead (2004, 2005) and Bondville (2001, 2003) sites but slightly overestimated it during the growing season at Bondville (2005); these errors are primarily due to inaccurate model parameters (e.g., GDD and planting and harvesting dates). In Noah-MP-Crop, dynamic crop growth parameters, such as the accumulated GDD, are used to determine plant growth stages. The planting and harvesting dates are used to determine the duration of the growing season of corn and soybeans (Levis et al., 2012; X. Liu et al., 2016; Z. Zhang et al., 2020). All these parameters are site-specific and empirical and are not easily obtained and applied to various sites. Furthermore, the specification of SLA requires information on the LFMASS, which is rarely measured for a large area. However, with parameter optimization, the LAI estimates from EnKF-LAI-III capture both the seasonal and daily variations in the measurements and are closer to the measurements than those from the EnOL approach.
Figure 3. Comparisons of the leaf area index (LAI) values estimated from ensemble Kalman filter-leaf area index-III (EnKF-LAI-III) and ensemble open loop (EnOL) with the observations at the Mead and Bondville sites.
Figure 4 shows that the optimized SLA from EnKF-LAI-III is close to the measured values at the two sites. The estimated SLA decreases from 0.03 m2 g−1 at the early stage of the growing season to 0.01 m2 g−1 at the end stage of the growing season, because the corn leaves grow rapidly at the beginning of the growing season (larger LAI) with a small amount of biomass; later, the leaves become thicker (larger biomass), while the LAI increases slightly (Z. Zhang et al., 2020). Overall, the key vegetation parameters of Noah-MP-Crop (i.e., SLA) can be efficiently constrained by assimilating LAI observations. The range of the retrieved SLA values in this study is comparable to those reported in the literature (e.g., Bateni et al., 2014; Y. Li et al., 2005; Williams et al., 2017).
Figure 4. Comparisons of ensemble Kalman filter-leaf area index-III (EnKF-LAI-III) optimized specific leaf area (SLA) with observations at the Mead and Bondville sites.
As shown in Figure 5, the simulated SM from the EnOL method is remarkably lower than the observations during the growing season at Bondville (2003) but slightly higher than the observations during the growing season at Mead (2003, 2005). Both the daily variations and the magnitudes of the SM estimates from EnKF-SM agree well with the observations, and EnKF-SM also captures the main soil-wetting events (as a response to precipitation) and drying spells at the two sites. EnKF-SM shows slightly worse performance in terms of SM estimates for Bondville in 2003 than in 2001 and 2005. This result occurred because the time-invariant SM observation errors and model errors used throughout the modeling period sometimes prevented the filter from taking full advantage of the information in the SM observations and model estimates.
Figure 5. Daily averaged soil moisture (0–10 cm) estimates from ensemble Kalman filter-soil moisture (EnKF-SM) and ensemble open loop (EnOL) versus ground measurements at the Mead and Bondville sites. Blue bars represent the daily total precipitation.
Table 2 shows that the two-site average MAE (RMSD) of the SM estimates from EnKF-SM is 0.03 m3 m−3 (0.04 m3 m−3), which is 40.00% (63.64%) lower than the MAE (RMSD) of 0.05 m3 m−3 (0.11 m3 m−3) from EnOL.
Table 2 Statistical Indices of the Daily Averaged Soil Moisture Retrievals From the EnOL and EnKF-SM Approaches at the Two Study Sites
Metric | Mead, EnOL | Mead, EnKF-SM | Bondville, EnOL | Bondville, EnKF-SM |
MAE (m3 m−3) | 0.03 | 0.01 | 0.07 | 0.04 |
RMSD (m3 m−3) | 0.09 | 0.03 | 0.13 | 0.05 |
R (−) | 0.72 | 0.94 | 0.71 | 0.92 |
Abbreviations: EnKF, ensemble Kalman filter; EnOL, ensemble open loop; MAE, mean absolute error; RMSD, root mean square deviation; SM, soil moisture.
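The percentage reductions quoted for SM are relative reductions of the two-site average errors. A minimal sketch of that arithmetic, using the rounded two-site averages reported in the text (the function name `relative_reduction` is hypothetical):

```python
def relative_reduction(baseline, assimilated):
    """Percentage by which an error metric drops relative to the baseline."""
    return 100.0 * (baseline - assimilated) / baseline


if __name__ == "__main__":
    # Two-site average SM errors (m3 m-3) as quoted in the text.
    print(round(relative_reduction(0.05, 0.03), 2))  # MAE: EnOL vs. EnKF-SM
    print(round(relative_reduction(0.11, 0.04), 2))  # RMSD: EnOL vs. EnKF-SM
```

Because the errors are reported to two decimals, percentages computed from the rounded averages can differ noticeably from those computed from unrounded values.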
Assimilation of SIF
In the MLDAS, the linear relationship between SIF and GPP (SIF = a × GPP + b) was used as the observation operator. The spatial resolutions of GPP measurements from the EC systems and the SIF from satellites are approximately 500–1,000 and 5,000 m (B. Chen et al., 2008, 2011), respectively. Therefore, the linear observation operator includes not only photosynthesis information determined by SIF but also spatial-scale effects. As shown in Figure 6, in general, GPP increases with increasing SIF at the two sites. The 4-day interval SIF is strongly correlated with GPP for Mead (R = 0.71) and Bondville (R = 0.80). The GPP/SIF ratio at Mead is larger than that at Bondville, most likely due to irrigation, leading to an approximately 30% higher GPP per unit SIF at Mead than at Bondville.
Figure 6. The relationships between gross primary productivity (GPP) and solar-induced chlorophyll fluorescence (SIF). The red line represents the fitted linear regression.
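The calibration parameters a and b of the linear observation operator can be obtained with an ordinary least-squares fit. The sketch below assumes NumPy and uses synthetic, noise-free values purely for illustration; the function name `fit_sif_gpp` and the sample numbers are hypothetical and are not the site data.

```python
import numpy as np


def fit_sif_gpp(gpp, sif):
    """Least-squares fit of SIF = a * GPP + b; returns (a, b)."""
    a, b = np.polyfit(np.asarray(gpp, dtype=float),
                      np.asarray(sif, dtype=float), deg=1)
    return float(a), float(b)


if __name__ == "__main__":
    # Synthetic 4-day GPP values (gC m-2 d-1) and SIF built from a known line.
    gpp = np.array([2.0, 8.0, 15.0, 22.0, 18.0, 6.0])
    sif = 0.05 * gpp + 0.1
    a, b = fit_sif_gpp(gpp, sif)
    r = float(np.corrcoef(gpp, sif)[0, 1])
    print(a, b, r)
```

Fitting a and b separately for each site would let the operator absorb site-specific effects such as the different GPP/SIF ratios at the irrigated and rainfed sites.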
Vcmax, a key parameter in modeling photosynthesis, varies greatly during the growing season and among crop species (Makela et al., 2004; Sellers, 1997) and is set to a wide range of values in different models (from 30 to 150 μmol m−2 s−1; Bonan et al., 2011; Z. Zhang et al., 2020). In this study, the dynamic value of Vcmax is updated by assimilating the SIF observations. As shown in Figure 7, Vcmax reveals seasonal variations depending on the vegetation conditions. In comparison to other times, in the middle of the growing season (DOYs 180–220), Vcmax values are higher and produce intense photosynthesis. The dynamic range of Vcmax retrievals is comparable to those of Hu et al. (2014), Y. Liu (2019), and Misson et al. (2006).
Figure 7. The ensemble Kalman filter-solar-induced chlorophyll fluorescence (EnKF-SIF) optimized maximum rate of carboxylation (Vcmax) values at the Mead and Bondville sites.
This section discusses the analysis of water-energy-carbon variables (i.e., H, LE, GPP, aboveground biomass, and grain mass) simulated by MLDAS. As shown in Figure 8, the EnOL approach overestimates (underestimates) H (LE) in May and underestimates (overestimates) H (LE) in August and September at the Mead site, mainly because the model underestimates (overestimates) SM from May to June (July to September, see Figure 5). In addition, the underestimated LAI from EnOL (July to September; Figure 3) allowed the land surface to absorb more solar radiation, resulting in higher H (Hardwick et al., 2015). Compared to the EnOL simulations, the EnKF-ALL simulated diurnal cycles of H and LE are closer to the measurements, indicating the benefit of assimilating LAI, SM, and SIF time series in improving the surface fluxes H and LE at the Mead site.
Figure 8. Monthly averaged diurnal cycle of the modeled sensible and latent heat fluxes from ensemble Kalman filter-ALL (EnKF-ALL) (solid lines), ensemble open loop (EnOL) (dashed lines) and corresponding observations (solid dots) at the Mead site.
Figure 9 shows that the simulated SM from EnOL is much lower than the 2003 observations at the Bondville site, leading to higher H and lower LE, and EnKF-ALL performs substantially better. The estimated LE from the EnKF-ALL approach matches the daily variations in precipitation (SM) and LAI. At the Mead and Bondville sites, LE shows a rising trend in the vegetation growing period, reaches its maximum in the middle of July, and decreases as the vegetation senesces. The Mead and Bondville sites have the lowest H in July and August as a result of the high LE caused by high LAI and SM availability. In contrast, H dominates over LE in May because of sparser vegetation cover.
Figure 9. Monthly averaged diurnal cycle of the modeled sensible and latent heat fluxes from ensemble Kalman filter-ALL (EnKF-ALL) (solid lines), ensemble open loop (EnOL) (dashed lines) and corresponding observations (solid dots) at the Bondville site.
As shown in Table 3, the two-site average MAE (RMSD) of the H estimates from EnKF-ALL is 50.56 W m−2 (79.42 W m−2), which is 23.00% (27.73%) lower than the MAE (RMSD) of 65.66 W m−2 (109.90 W m−2) from EnOL. For LE, the two-site average MAE and RMSD from EnKF-ALL (EnOL) are 54.60 and 89.59 W m−2 (68.55 and 116.93 W m−2), respectively.
Table 3 The MAE, RMSD, and R of Half-Hourly (Hourly) H and LE Retrievals From the EnOL and EnKF-ALL Approaches at the Two Study Sites
Study site | Metric | H, EnOL | H, EnKF-ALL | LE, EnOL | LE, EnKF-ALL |
Mead | MAE (W m−2) | 57.59 | 38.52 | 55.57 | 45.29 |
Mead | RMSD (W m−2) | 102.43 | 76.19 | 108.67 | 88.30 |
Mead | R (−) | 0.66 | 0.81 | 0.72 | 0.83 |
Bondville | MAE (W m−2) | 73.72 | 62.60 | 81.52 | 63.90 |
Bondville | RMSD (W m−2) | 117.36 | 82.65 | 125.19 | 90.87 |
Bondville | R (−) | 0.63 | 0.77 | 0.64 | 0.81 |
Two-site average | MAE (W m−2) | 65.66 | 50.56 | 68.55 | 54.60 |
Two-site average | RMSD (W m−2) | 109.90 | 79.42 | 116.93 | 89.59 |
Two-site average | R (−) | 0.65 | 0.79 | 0.68 | 0.82 |
Abbreviations: EnKF, ensemble Kalman filter; EnOL, ensemble open loop; H, sensible heat flux; LE, latent heat flux; MAE, mean absolute error; RMSD, root mean square deviation.
Figure 10 compares the daily GPP estimates from EnOL and EnKF-ALL with the measurements at the Mead and Bondville sites. The GPP estimates from the EnOL method are remarkably lower than the observations during the growing season at Mead (2004, 2005) and Bondville (2001, 2003). The GPP values are slightly overestimated during the growing season at Bondville (2005), which is consistent with the LAI results from EnOL presented in Figure 3. The discrepancies between the GPP estimates and observations are primarily due to inaccurate model parameters (e.g., GDD, planting and harvesting dates, SLA, and Vcmax). The GPP estimated from the EnKF-ALL approach captures both the seasonal and daily variations in the measurements and is closer to the measurements than that from the EnOL approach, implying that the EnKF-ALL approach can exploit implicit information in the LAI and SIF observations to generate more accurate GPP estimates. The estimated GPP values increase continuously from May to June with the growth of the crop, reach their maxima in July (DOYs 182–212), and decrease from July to September as the crop matures and is harvested. The two-site average MAEs (RMSDs) from the EnOL and EnKF-ALL approaches are 3.40 gC m−2 d−1 (4.13 gC m−2 d−1) and 1.18 gC m−2 d−1 (1.99 gC m−2 d−1), respectively. In comparison to the EnOL approach, the EnKF-ALL approach reduces the MAE (RMSD) of the two-site average GPP by 65.29% (51.82%).
Figure 10. Time series of daily gross primary productivity (GPP) retrievals from the ensemble Kalman filter-ALL (EnKF-ALL) (red solid lines) and ensemble open loop (EnOL) (green dashed lines) approaches at the two study sites. The GPP measurements are indicated by open circles.
Figure 11 shows that the EnOL approach underestimated aboveground biomass at Bondville and overestimated aboveground biomass at Mead, but the performance of the Noah-MP-Crop model is improved with the EnKF-ALL method. The assimilation of LAI, SM, and SIF data effectively captures the rapid crop growth from May to August and the decline in aboveground biomass at the end of August, when the corn started to ripen at Bondville. The two-site average MAE (RMSD) of aboveground biomass estimates from EnKF-ALL is 100.25 g m−2 (153.67 g m−2), which is 39.47% (50.47%) lower than the MAE (RMSD) of 165.63 g m−2 (310.25 g m−2) from EnOL. The Noah-MP-Crop model also provides useful crop yield information (grain biomass) for regional-scale agricultural research and management applications (X. Liu et al., 2016; Z. Zhang et al., 2020). The retrieved grain biomass values for Bondville are shown in Figure 11 (bottom panel). At Bondville, the EnOL simulation underestimated the grain biomass by approximately 22.30% compared to ground measurements. Compared to the EnOL approach, the EnKF-ALL approach improves the corn yield estimate by approximately 23.54%.
Figure 11. Comparisons of simulated aboveground biomass and grain biomass at the Mead and Bondville sites between the ensemble open loop (EnOL) and ensemble Kalman filter-ALL (EnKF-ALL) experiments.
The performance of EnKF-ALL is compared with those of EnKF-LAI-III, EnKF-SM, EnKF-SIF, and EnOL, and the results are shown in Figure 12. In comparison to EnOL, EnKF-LAI-III (EnKF-SM) significantly improves the estimation of LAI (SM) at Mead and Bondville. This is because the assimilation of LAI and SM data effectively constrains the parameter (SLA) and state variables (LFMASS and SM) within Noah-MP-Crop.
Figure 12. Comparison of the daily leaf area index (LAI), soil moisture (SM), sensible and latent heat flux (H and LE), and gross primary productivity (GPP) estimates from ensemble open loop (EnOL), ensemble Kalman filter (EnKF)-LAI-III (only assimilating LAI data), EnKF-SM (only assimilating SM data), EnKF-SIF (only assimilating SIF data), and EnKF-ALL (assimilating LAI, SM, and SIF data) at the Mead and Bondville sites.
The RMSD (R) values of the H, LE, and GPP retrievals from the EnKF-ALL approach are lower (higher) than those of EnOL, implying that EnKF-ALL can make full use of information within sequences of LAI, SM, and SIF to predict more accurate H, LE, and GPP. EnKF-SM significantly improves the H and LE retrievals, indicating that sequences of SM are of vital importance for H and LE retrievals. Similarly, EnKF-LAI-III and EnKF-SIF moderately improve the H and LE retrievals, indicating that changes in vegetation phenology (e.g., LAI dynamics and Vcmax) also affect the partitioning of the available energy between the turbulent heat fluxes (H and LE). EnKF-LAI-III and EnKF-SIF significantly improve the GPP estimates at the two study sites, indicating that changes in vegetation dynamics are vitally important for GPP retrievals. However, the GPP estimates from EnKF-SM are comparable to those from EnOL, which means that GPP is less sensitive to variations in SM in the Noah-MP-Crop model.
To examine the relative contributions of the different assimilation strategies to the overall results, the differences in the simulations between EnKF (EnKF-LAI-III, EnKF-SM, EnKF-SIF, and EnKF-ALL) and EnOL are shown in Figure 13. The statistical results are averaged from three-year simulations at the two study sites. The assimilation of LAI and SIF led to increased LAI from July to September. The largest increase in LAI is observed from June to August. As LAI changes, corresponding changes in GPP and LFMASS are also found in the model estimates. The assimilation of LAI leads to slightly increased LE because changes in vegetation dynamics affect vegetation transpiration (Kumar et al., 2019). The assimilation of SM has a large impact on SM, H, and LE. Increased (decreased) SM in May (from June to September) leads to decreased (increased) H and increased (decreased) LE. The decrease in SM also leads to slightly decreased LAI, LFMASS, and GPP (June and July) because of its effects on canopy stomatal resistance and photosynthesis (Zhuo et al., 2019). In contrast, EnKF-ALL showed the opposite trend in LAI, LFMASS, and GPP in July. This is because the effect of assimilating one variable can be offset by the assimilation of another variable, and the assimilation of multiple pairs of variables will produce indistinguishable results (Gao et al., 2015; Schneider et al., 2017).
Figure 13. The model variable (leaf area index [LAI], soil moisture [SM], sensible and latent heat flux [H and LE], gross primary productivity [GPP], and leaf biomass) differences between the ensemble Kalman filter (EnKF; EnKF-LAI-III, EnKF-SM, EnKF-SIF, and EnKF-ALL) and EnOL methods.
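The kind of difference summarized in Figure 13 can be sketched as a per-month mean of EnKF minus EnOL. This is only an illustration: the function name, the grouping by calendar month, and the sample values are assumptions, and the study's exact averaging procedure may differ.

```python
import numpy as np

def monthly_mean_difference(months, enkf, enol):
    """Mean (EnKF - EnOL) difference for each calendar month.

    months: integer month (1-12) of each daily value; series from several
    years or sites can be concatenated before the call to pool them.
    """
    months = np.asarray(months)
    diff = np.asarray(enkf, dtype=float) - np.asarray(enol, dtype=float)
    return {int(m): float(diff[months == m].mean()) for m in np.unique(months)}

# Hypothetical daily LAI values (m2 m-2), for illustration only.
months = [6, 6, 7, 7, 7]
enkf_lai = [2.0, 2.4, 4.0, 4.5, 5.0]
enol_lai = [1.8, 2.0, 3.0, 3.5, 4.0]

result = monthly_mean_difference(months, enkf_lai, enol_lai)
print(result)
```

A positive value for a month means that assimilation increased the variable relative to the open loop in that month.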
In this study, the MLDAS was developed based on the Noah-MP-Crop model and the EnKF. The LAI, SM, and SIF observations are simultaneously assimilated into the Noah-MP-Crop model within the MLDAS to improve the simulated H, LE, and GPP. The performance of the MLDAS is tested at two AmeriFlux cropland sites (Mead and Bondville), and the main results are as follows:
- The assimilation of LAI helps update LFMASS and optimize SLA and is able to retrieve statistically optimal values for the LFMASS and LAI predictions. The RMSD of the LAI estimates from EnKF-LAI-III at the Mead (Bondville) site is 0.39 (0.35), representing a reduction of 52.05% (43.55%) in the error compared to EnOL. In comparison to EnOL, EnKF-LAI-III with parameter optimization captures both the observed seasonal and daily LAI variations and is closer to the measurements, and the optimized SLA values are close to the ground measurements at the two sites.
- The SM assimilation results show a more reasonable response to land surface wetting and drying events at the two sites. Compared to EnOL, assimilating surface SM reduces the two-site average MAE (RMSD) of the SM estimates by 40.00% (63.64%).
- Assimilating SIF helps optimize the seasonal variations in the Vcmax values throughout the whole vegetation growing season.
- In comparison to those from EnOL, the simulated day-to-day fluctuations in the H, LE, and GPP estimates from EnKF-ALL (which jointly assimilates LAI, SM, and SIF) are more consistent with the observations. The two-site average RMSD of H decreases from 109.90 W m−2 with EnOL to 79.42 W m−2 with EnKF-ALL. For LE, the two-site average RMSD from EnKF-ALL (EnOL) is 89.59 W m−2 (116.93 W m−2). The two-site average RMSD of daily GPP retrievals from EnKF-ALL is 1.99 gC m−2 d−1, which is 51.82% lower than the RMSD of 4.13 gC m−2 d−1 from EnOL. Moreover, EnKF-ALL also effectively constrains aboveground biomass variations and improves corn yield estimates.
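The relative error reductions quoted in this list follow from a simple ratio. As a worked check, the sketch below applies it to the two-site average GPP RMSD values given above; the function name is hypothetical.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of an error metric, in percent of the baseline."""
    return 100.0 * (baseline - improved) / baseline

# Two-site average RMSD of daily GPP (gC m-2 d-1) from the text above.
gpp_reduction = percent_reduction(4.13, 1.99)
print(f"{gpp_reduction:.2f}%")  # prints 51.82%
```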
The performance of EnKF-ALL is compared with that of the EnKF-LAI-III (only LAI assimilation), EnKF-SM (only SM assimilation), EnKF-SIF (only SIF assimilation), and EnOL approaches. The results show that the assimilation of SM significantly improves the H and LE retrievals, while the assimilation of LAI and SIF significantly improves the GPP retrievals. Future studies should focus on applying the developed MLDAS approach over large-scale domains using remotely sensed LAI, SM, and SIF data.
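For readers unfamiliar with how a single observation can update both a state variable and a parameter, the following is a minimal sketch of one stochastic EnKF analysis step on an augmented state. It is not the multipass scheme of this study. The observation operator LAI = LFMASS × SLA, the units, the ensemble statistics, and all names are simplifying assumptions made for illustration.

```python
import numpy as np

def enkf_update(ensemble, predicted_obs, obs, obs_var, rng):
    """One stochastic EnKF analysis step for a single scalar observation.

    ensemble:      (n_members, n_state) augmented state (states and parameters)
    predicted_obs: (n_members,) observation predicted by each member
    obs, obs_var:  observed value and its error variance
    """
    n = ensemble.shape[0]
    x_anom = ensemble - ensemble.mean(axis=0)
    y_anom = predicted_obs - predicted_obs.mean()
    cov_xy = x_anom.T @ y_anom / (n - 1)   # state-observation covariance
    var_y = y_anom @ y_anom / (n - 1)      # predicted-observation variance
    gain = cov_xy / (var_y + obs_var)      # Kalman gain, one entry per state
    # Perturb the observation for each member (stochastic EnKF).
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=n)
    innovation = perturbed - predicted_obs
    return ensemble + np.outer(innovation, gain)

rng = np.random.default_rng(0)
n_members = 200
# Hypothetical prior ensembles (assumed units: g m-2 and m2 g-1).
lfmass = rng.normal(100.0, 20.0, n_members)
sla = rng.normal(0.03, 0.005, n_members)
ensemble = np.column_stack([lfmass, sla])

# Assumed observation operator: LAI = LFMASS * SLA.
predicted_lai = ensemble[:, 0] * ensemble[:, 1]
lai_obs = 4.0  # hypothetical LAI observation
updated = enkf_update(ensemble, predicted_lai, lai_obs, 0.1 ** 2, rng)
updated_lai = updated[:, 0] * updated[:, 1]

print(predicted_lai.mean(), updated_lai.mean())
```

Because the gain is built from the ensemble covariance between each augmented-state entry and the predicted observation, the LAI observation adjusts the parameter (here SLA) along with the state (here LFMASS), which is the general idea behind joint state and parameter estimation.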
Acknowledgments
This study is supported by the Water System Program at the National Center for Atmospheric Research (NCAR), USDA NIFA Grants 2015-67003-23508 and 2015-67003-23460, NOAA Grants NA18OAR4590381 and NA18OAR4590398, NSF Grant INFEWS #1739705, and a project supported by the State Key Laboratory of Earth Surface Processes and Resource Ecology (2021-ZD-04). We would like to thank the high-performance computing support from the Center for Geodata and Analysis, Faculty of Geographical Science, Beijing Normal University (
The in situ meteorological variables, turbulent heat fluxes, LAI, and biomass data are downloaded freely from the AmeriFlux Data Center (
© 2021. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The interactions between crops and the atmosphere significantly impact surface energy and hydrology budgets, climate, crop yield, and agricultural management. In this study, a multipass land data assimilation scheme (MLDAS) is proposed based on the Noah-MP-Crop model. The ensemble Kalman filter (EnKF) method is used to jointly assimilate the leaf area index (LAI), soil moisture (SM), and solar-induced chlorophyll fluorescence (SIF) observations to predict sensible (H) and latent (LE) heat fluxes, gross primary productivity (GPP), etc. Such joint assimilation is demonstrated to be effective in constraining the model state variables (i.e., leaf biomass and SM) and optimizing key crop-model parameters (i.e., specific leaf area [SLA] and maximum rate of carboxylation, Vcmax). The performance of the MLDAS is evaluated against observations at two AmeriFlux cropland sites, revealing good agreement with the observed H, LE, and GPP. When using optimized model parameters (SLA and Vcmax) and jointly assimilating LAI, SM, and SIF observations, the MLDAS produces 34.28%, 26.90%, and 51.82% lower root mean square deviations for daily H, LE, and GPP estimates compared with the Noah-MP-Crop open loop simulation. Our findings also indicate that the H and LE predictions are more sensitive to SM measurements, while the GPP simulations are more affected by LAI and SIF observations. The results indicate that the performance of physical models can be greatly improved by assimilating multi-source observations within the MLDAS.
1 State Key Laboratory of Earth Surface Processes and Resource Ecology, School of Natural Resource, Faculty of Geographical Science, Beijing Normal University, Beijing, China
2 National Center for Atmospheric Research, Boulder, CO, USA
3 School of Environment and Sustainability, University of Saskatchewan, Saskatoon, SK, Canada