ABSTRACT
The contribution of spike photosynthesis to grain yield (GY) has been overlooked in the accurate spectral prediction of yield. Thus, it is essential to construct and estimate a yield-related phenotypic trait that considers spike photosynthesis. Based on field and spectral reflectance data from 19 wheat cultivars under two nitrogen fertilization conditions in two years, our objectives were to (i) construct a yield-related phenotypic trait (spike–leaf composite indicator, SLI) accounting for the contribution of the spike to photosynthesis, (ii) develop a novel spectral index (enhanced triangle vegetation index, ETVI3) sensitive to SLI, and (iii) establish and evaluate SLI estimation models by integrating spectral indices and machine learning algorithms. The results showed that SLI was sensitive to nitrogen fertilizer and wheat cultivar variation and was a better predictor of yield than the leaf area index. ETVI3 maintained a strong correlation with SLI throughout the growth stages, whereas the correlations of other spectral indices with SLI were poor after spike emergence. Integrating spectral indices and machine learning algorithms improved the estimation accuracy of SLI, with the most accurate estimates of SLI showing coefficient of determination, root mean square error (RMSE), and relative RMSE values of 0.71, 0.047, and 26.93%, respectively. These results provide new insights into the role of fruiting organs in the accurate spectral prediction of GY. This high-throughput SLI estimation approach can be applied to wheat yield prediction across all growth stages and may assist agronomic practice and variety selection.
Keywords:
Wheat spike photosynthesis
Yield-related phenotypic trait
Spectral indices
Machine learning
Estimation
(ProQuest: ... denotes formulae omitted.)
1. Introduction
Canopy photosynthesis is a major determinant of cereal yield [1]. While green leaves are generally regarded as the primary organs responsible for cereal photosynthesis [2], non-leaf structures, such as the green spike, also exhibit photosynthetic capacity and play an irreplaceable role in grain filling [3]. The unique histological structure (epicuticular waxes) and superior physiological position of the wheat spike allow for higher light and water use efficiency and less exposure to pests and diseases compared to leaves [4]. Studies showed that the contribution of spike photosynthesis to grain yield of wheat averaged 20% [5] and reached up to 45% under source limitations such as water stress [6]. Genetic variation in the contribution of wheat spikes suggested using this trait to predict wheat yield [7]. The increase in yield of modern wheat varieties may ultimately be determined by the photosynthetic capacity of spikes under favorable agronomic conditions [8]. Compared with individual processes of leaf photosynthesis, integrated canopy photosynthesis may be more relevant for determining final yield [9]. Methods for estimating the photosynthetic contribution of non-leaf organs, such as defoliation, shading, spraying photosynthetic inhibitors, and isotope labeling, are available [10], but these procedures are labor-intensive and invasive, making it challenging to apply them on a large scale for breeding and production purposes. Compared with those available for leaves, high-throughput methods for evaluating the effects of spike photosynthesis on yield prediction, particularly under field conditions, are lacking.
Spanning the visible, near-infrared (NIR) and shortwave infrared (SWIR) regions, reflectance spectroscopy is well suited to capturing subtle structural and physiological alterations in the crop canopy [11,12]. As a simple and efficient measure of vegetation status, the spectral index (SI), in various wavelength combinations, has been researched intensively for the estimation of phenotypic traits such as leaf area index (LAI) [13], leaf chlorophyll content (LCC) [14] and above-ground biomass (AGB) [15]. Most published SIs were developed to enhance the response of spectral signals to green-leaf physiological parameters and to reduce confounding factors. For example, the Soil-Adjusted Vegetation Index (SAVI) was designed to eliminate soil background effects [16], and the Enhanced Vegetation Index (EVI) [17] and Optimized Soil Adjusted Vegetation Index (OSAVI) were developed to overcome the saturation caused by high biomass [18]. However, the spike layer influences radiative transfer in the canopy because spike structural and optical properties differ from those of leaves [19]. Thus, the reliability of spike trait estimation relying solely on SIs derived from leaves, without considering spikes, is questionable. Because reproductive organs have usually been treated as a confounding factor to be eliminated at the post-heading stage, few spike SIs have been developed. Gutierrez et al. [20] suggested that the sensitive bands of the spike layer are concentrated in the NIR region (700–1100 nm), which is related to water content. Additionally, the timing of chlorophyll loss in spikes also affects the spectral response of the crop [21]. Studies have concentrated on mitigating the impacts of the spike layer on crop canopy reflectance [22] or closing the gap between spikes and leaves [23]. Therefore, exploring a new SI that considers the spike layer is worthwhile for accurate spectral prediction of canopy photosynthesis.
Crop photosynthesis is a function of photosynthetic effective area and photosynthetic capacity and changes dynamically throughout the crop growth period [24]. The response of canopy reflectance to photosynthesis may be complex and cannot be quantified by a single type of SI. Recently, the integration of multiple SIs as a method to achieve multi-stage target trait estimation has risen to prominence [25–27]. However, an excessive number of input parameters does not necessarily improve a model's prediction performance [28]. Feature selection reduces data dimensionality and improves the generalization capability of a model, improving prediction accuracy and interpretability [29]. A popular gradient boosting ensemble learning algorithm, eXtreme Gradient Boosting (XGBoost), has been successfully applied in disease detection [30] and trait prediction [31]. Besides its superior modeling performance, XGBoost ranks features based on their importance scores in the boosting tree [32]. This property becomes particularly valuable when XGBoost is combined with a feature selection (XGB-FS) algorithm [33], as it enables effective feature filtering and analysis of the contribution of each important feature in the model. Although previous studies [34–36] have shown the potential of the XGB-FS method for optimizing feature combinations, its application in estimating canopy photosynthetic phenotypic traits in winter wheat based on SIs remains unexplored, and the relative importance of spectral features may differ across stages of crop growth.
The objectives of this study were to (1) construct a novel yield-related phenotypic trait, a spike-leaf composite indicator (SLI) that accounts for the contribution of spikes to photosynthesis, (2) develop an SI sensitive to SLI, and (3) establish and evaluate SLI estimation models based on XGB-FS at multiple growth stages.
2. Materials and methods
2.1. Experimental design
Two experiments were conducted at the National Information Agriculture Engineering Technology Center (NETCIA) in Rugao city, Jiangsu province, China (32°14′N, 120°19′E) from 2019 to 2021. The experimental area has a subtropical monsoon climate with an average annual temperature of 14.6 °C and a loamy soil with an organic matter content of 19.95 g kg⁻¹. Two field trials tested two N fertilizer rates and 19 winter wheat cultivars (Table 1).
Experiment 1 (Exp. 1) was conducted from November 2019 to June 2020. Twelve cultivars containing different leaf and spike types were sown on October 31, 2019. Two N fertilizer rates (N1 = 150 kg ha⁻¹ and N2 = 300 kg ha⁻¹) were applied with one planting density (D = 15 cm × 30 cm). N fertilizer was applied as 50% basal fertilizer and 50% tiller fertilizer, and the basal fertilizer was accompanied by 120 kg ha⁻¹ P₂O₅ for phosphorus and 135 kg ha⁻¹ K₂O for potassium.
Experiment 2 (Exp. 2) was conducted from November 2020 to June 2021. Nineteen cultivars were sown on November 05, 2020. Other experimental designs were the same as Exp. 1.
2.2. Data collection
2.2.1. Acquisition and preprocessing of canopy spectra
Canopy reflectance was measured using an ASD FieldSpec Pro spectrometer (Analytical Spectral Devices, Boulder, CO, USA) in the band range 350–2500 nm, with a spectral sampling interval of 1.4 nm for 350–1000 nm and 1.1 nm for 1001–2500 nm. Spectra were acquired with the fiber optic cable coupled to an array detector (field of view of 25°) held approximately 1 m above the wheat canopy between 10:00 and 14:00 under clear skies, and a standard whiteboard calibration was performed before data acquisition in each plot. Three sampling points were set up in each plot, with three spectra per point, and the average of the nine spectra was taken as the canopy spectrum of the plot. A comparison of wheat canopy spectral reflectance changes with growth stages under the N1 (150 kg ha⁻¹) and N2 (300 kg ha⁻¹) fertilizer rates is shown in Fig. S1, where the bands at 1351–1440 nm, 1801–1960 nm and 2401–2500 nm were removed owing to variations caused by atmospheric absorption.
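The plot-level averaging and band removal described above can be sketched as follows. This is a minimal illustration only, assuming each spectrum is a list of reflectance values on a shared wavelength grid; the function names and the list-based data layout are hypothetical and not taken from the original processing chain.

```python
from statistics import fmean

# Atmospheric-absorption windows removed in the text (nm, inclusive bounds).
ABSORPTION_WINDOWS = [(1351, 1440), (1801, 1960), (2401, 2500)]


def plot_spectrum(spectra):
    """Average equally long reflectance lists band by band.

    `spectra` is assumed to hold the nine scans of one plot
    (three sampling points x three spectra each).
    """
    return [fmean(band_values) for band_values in zip(*spectra)]


def mask_absorption(wavelengths, reflectance, windows=ABSORPTION_WINDOWS):
    """Drop every band whose wavelength lies inside an absorption window."""
    kept = [
        (w, r)
        for w, r in zip(wavelengths, reflectance)
        if not any(lo <= w <= hi for lo, hi in windows)
    ]
    return [w for w, _ in kept], [r for _, r in kept]
```

Averaging before masking keeps the two steps independent, so the same mask can be reused for every plot and date.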
2.2.2. Measurement of agronomy traits
After spectral measurement, three plants were destructively sampled in each plot. Green leaf area was measured using a leaf area instrument (LI-3000, LI-COR Inc., Lincoln, NE, USA); spike length (SL, mm) was measured with a ruler, and spike width (SW, mm) and spike thickness (ST, mm) were measured around the middle part using digital vernier calipers. Fifteen plants were separated into stems, leaves, and spikes, enzyme-inactivated at 105 °C for 30 min and oven-dried at 80 °C to constant weight. LAI, leaf dry weight (LDW, g m⁻²), spike dry weight (SDW, g m⁻²), and AGB (g m⁻²) were calculated in each plot from the planting density after spectral measurement.
One square meter of wheat plants was harvested at maturity with three repetitions per plot [37], spikes were dried under sunlight, and grain number per spike (GN), thousand-grain weight (TGW, g), spike number per unit area (SN) and GY (t ha⁻¹) were measured (Table S1).
2.3. Data processing
2.3.1. Construction of the spike-leaf composite indicator (SLI)
Similar to LAI, the green spike area index (SAI) indicates the area of green spikes per unit area, with the following equation:
... (1)
... (2)
where SSA (spike effective surface area, m²) and SN represent the spike surface area and the number of wheat spikes per unit area, respectively. A spike was treated as an elliptical cylinder for calculating its effective surface area for light interception. The surface area of the top end was calculated as ....
Green spike-leaf index (GSLI) indicates the active photosynthetic area per unit area as follows:
... (3)
Considering the contribution of spikes and leaves, the SLI was finally constructed according to the simplified photosynthetic-based yield formation equation (GY = photosynthetic area times photosynthetic capacity, where photosynthetic time and respiratory consumption are ignored in this study) as:
... (4)
where leaf ratio (LR) and spike ratio (SR) are the ratios of LDW and SDW, respectively, to AGB per unit area. LR (or SR) indicates the distribution of plant assimilation products, and its value represents the photosynthetic center (leaves or spikes) at a given stage. Photosynthesis of leaves (SLIleaf) and spikes (SLIspike) is calculated as LR × LAInor × (1 − k) and SR × SAInor × k in Eq. (4), where k and (1 − k) are the contributions of spikes and leaves, respectively. The value of k was determined in two steps: first, k was iterated over [0, 1] in steps of 0.1, and the interval [i, j] containing the top 20% of the Pearson correlation coefficients (r) between SLI and GY was selected; second, k was iterated over [i, j] in steps of 0.01, and the value corresponding to rmax was selected as the optimal k.
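The two-step search for k can be illustrated with a short sketch. Several points here are assumptions rather than details given in the text: SLI is taken to be the sum of the leaf and spike terms described above, the "top 20%" of the eleven coarse steps is read as the two best k values, and [i, j] is taken as the interval they span. All function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def sli(lr, lai_nor, sr, sai_nor, k):
    """Assumed form of SLI: leaf term plus spike term, weighted by k."""
    return lr * lai_nor * (1 - k) + sr * sai_nor * k


def r_for_k(samples, gy, k):
    """Correlation between SLI (computed with weight k) and grain yield."""
    return pearson([sli(*s, k) for s in samples], gy)


def select_k(samples, gy):
    """Coarse scan (step 0.1), then fine scan (step 0.01) inside [i, j]."""
    coarse = [step / 10 for step in range(11)]
    ranked = sorted(coarse, key=lambda k: r_for_k(samples, gy, k), reverse=True)
    top = ranked[: max(2, round(0.2 * len(coarse)))]  # assumed reading of "top 20%"
    i, j = min(top), max(top)
    n_fine = round((j - i) * 100)
    fine = [round(i + step / 100, 2) for step in range(n_fine + 1)]
    return max(fine, key=lambda k: r_for_k(samples, gy, k))
```

Each sample is a tuple `(LR, LAInor, SR, SAInor)` and `gy` is the matching list of grain yields. The coarse scan keeps the number of correlation evaluations small before the finer scan refines the estimate.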
2.3.2. Development of the enhanced triangle vegetation index 3 (ETVI3)
Broge and Leblanc [38] introduced a new spectral index method that parameterizes the general shape of the spectrum by selecting three bands in spectral space. Generally, these bands are supposed to capture subtle changes in relevant physiological parameters which are not observed in the measured canopy reflectance. First, three bands (R550, R670, and R750) in spectral space were selected as the vertices of the triangle vegetation index (TVI) to characterize the radiant energy absorbed by leaf pigments using reflectance changes (Eq. (5)). It can also be written as Eq. (6), which makes it easy to observe how the TVI value changes as the three vertices fluctuate. Haboudane et al. [39] noted that 750 nm lies in the transition interval between the red edge and the NIR, which is sensitive to canopy components and is also influenced by leaf chlorophyll content. To make the apex of the triangle depend as far as possible on only one factor, the 750 nm wavelength was replaced in this study by the 850 nm wavelength, at which reflectance is affected by canopy structure and is insensitive to pigment changes. The modified TVI was defined as the enhanced triangle vegetation index (ETVI) (Eq. (7)), and can be expressed as the area spanned by the triangle Green-Red-NIR. After mathematical deduction, ETVI can be written as ETVI1 (Eq. (8)), which makes the relationship between a single vertex and the area of the triangle easier to observe.
... (5)
... (6)
... (7)
... (8)
where A = (550 nm, R550), B = (670 nm, R670), and C = (850 nm, R850).
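The triangle construction can be made concrete with a small sketch. The TVI expression below is the published Broge and Leblanc form; the area function applies the standard shoelace formula to the vertices A, B and C defined above. Whether the elided Eqs. (7) and (8) use exactly this scaling is an assumption, and the function names are illustrative.

```python
def triangle_area(a, b, c):
    """Area of a triangle from three (wavelength, reflectance) vertices (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = a, b, c
    return 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))


def tvi(r550, r670, r750):
    """Triangle vegetation index as published by Broge and Leblanc."""
    return 0.5 * (120 * (r750 - r550) - 200 * (r670 - r550))


def green_red_nir_area(r550, r670, r850):
    """Area of the Green-Red-NIR triangle with the apex moved from 750 nm to 850 nm."""
    return triangle_area((550, r550), (670, r670), (850, r850))
```

For a typical green canopy (red reflectance below green, NIR well above both), `tvi` equals the triangle area over the 550, 670 and 750 nm vertices, which is why moving the apex to 850 nm changes only the NIR vertex.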
It can be seen that ETVI1 will increase as a result of wheat canopy tissue abundance (increase of NIR reflectance) and chlorophyll absorption (decrease of red reflectance). However, an increase in chlorophyll content also leads to a decrease in green-band reflectance, which in turn leads to a relative decrease in the triangle area. Wheat spikes have been reported to influence canopy reflectance mainly in the NIR region [23], especially around R970 and R1100 [20,40]. Consequently, the green band was substituted by an alternative NIR band after conducting sensitivity analyses on various band combinations within the 900 nm to 1200 nm range to mitigate fluctuations in the central band. R550 was replaced with R1080 in the ETVI1 formula because R1080 is positively correlated with wheat plant water content (PWC) [41]. Although bands near 900 nm behave similarly, 1080 nm was selected to maximize the difference in triangle area across growth stages. Finally, the value of ETVI2 (Eq. (9)) is proportional to R850 and R1080, and inversely proportional to R670.
... (9)
The NIR shoulder is influenced by canopy tissue abundance, to which spikes and leaves make different contributions at different times. Therefore, the spike-leaf adjustment factor (h) is calculated depending on the photosynthetic center at the given stage and is used to modify the 850 nm wavelength in the ETVI2. Then, ETVI3 was defined as:
... (10)
...
The schematic diagram of the conception of TVI and ETVI3 is shown in Fig. 1. To evaluate the performance of ETVI3 in SLI estimation, 55 SIs were summarized and classified by canopy-level biochemical parameters (e.g., LAI, LCC, biomass, PWC) in the literature and references can be found in Table S2.
2.4. Data analysis
2.4.1. Dataset partition
A set of 112 agronomic samples per stage was used to construct SLI, 160 spectral samples from Exp. 1 were used to calibrate the SLI estimation models, and 360 spectral samples from Exp. 2 were used to independently validate the models. The wheat cultivars in the different experiments were labeled according to their average manually measured grain yield. Following clustering, samples with a yield of more than 8 t ha⁻¹, between 5.5 and 8 t ha⁻¹, and less than 5.5 t ha⁻¹ were grouped into high-yielding (HY), medium-yielding (MY) and low-yielding (LY) varieties, respectively.
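The yield grouping reduces to two thresholds, sketched below. The text does not say which group a yield of exactly 8 or 5.5 t ha⁻¹ falls into, so the boundary handling here is an assumption, as is the function name.

```python
def yield_class(mean_gy_t_ha):
    """Label a cultivar by its mean grain yield (t per ha).

    Boundary values (exactly 8.0 or 5.5) are assigned to MY by assumption.
    """
    if mean_gy_t_ha > 8.0:
        return "HY"
    if mean_gy_t_ha >= 5.5:
        return "MY"
    return "LY"
```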
2.4.2. XGB-FS-ML based model building
The multiple growth stage estimation model of SLI based on the XGB-FS-ML method consists of three steps: (1) data pre-processing, (2) feature selection based on XGB-FS algorithm, and (3) regression analysis based on machine learning (ML). The flow chart is illustrated in Fig. S2 and the details are described as follows:
(1) Outliers were removed from the raw data [42] and input features were normalized as subset #1.
(2) XGBoost was used as the base model to rank the importance of the features in subset #1 because it automatically accounts for interactions between features [32]. All features within a cumulative feature importance score of 95% were then selected as subset #2. To reduce the complexity and improve the accuracy of the model, the sequential backward selection (SBS) algorithm was used to eliminate the lowest-ranked features one at a time [43], with the coefficient of determination (R²) as the evaluation criterion. Elimination continued until the evaluation function reached its optimum, and the features retained at that point were taken as the best feature subset.
(3) Four machine learning algorithms widely used in crop phenomics, namely multiple linear regression (MLR), support vector machine regression (SVR), least absolute shrinkage and selection operator regression (LASSO), and k-nearest neighbors regression (KNR), were selected as embedding functions to estimate SLI based on the best feature subset. MLR is a commonly used basic regression algorithm that uses multiple explanatory variables to predict a target variable [44]. SVR is a supervised learning method that uses a kernel function mapped to the feature space and optimizes the model by maximizing the width of the margin and minimizing the loss [45]; the kernel function of SVR was set to "linear" and the other parameters were kept at their defaults. LASSO is a biased estimator for processing data with multicollinearity and improves the generalization ability of the model by introducing a penalty term λ [46]. The calibration dataset (Exp. 1) was used in this step to train the SLI estimation models, with 10-fold cross-validation further used to optimize the hyperparameters of LASSO. KNR belongs to a class of multivariate nonparametric prediction approaches that is popular in crop and forest inventory [47]. The KNR parameters in this study were kept at their default values.
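The feature-selection logic of step (2) can be sketched independently of any particular library. In the sketch below the importance scores are supplied as a plain dictionary (in practice they would come from a fitted XGBoost model), and `score` is any user-supplied function that returns R² for a candidate feature list. The function names are hypothetical, and evaluating every nested subset before choosing the best is one reading of the stopping rule, not necessarily the authors' exact procedure.

```python
def cumulative_importance_subset(importance, threshold=0.95):
    """Keep the highest-ranked features until their cumulative share reaches the threshold."""
    total = sum(importance.values())
    ranked = sorted(importance, key=importance.get, reverse=True)
    kept, running = [], 0.0
    for name in ranked:
        kept.append(name)
        running += importance[name] / total
        if running >= threshold:
            break
    return kept


def sequential_backward_selection(ranked_features, score):
    """Drop the lowest-ranked feature one at a time and return the best-scoring subset.

    `ranked_features` must be ordered from most to least important;
    `score(features)` is assumed to return R2 for a model built on `features`.
    """
    current = list(ranked_features)
    best_subset, best_score = list(current), score(current)
    while len(current) > 1:
        current = current[:-1]  # eliminate the current lowest-ranked feature
        candidate_score = score(current)
        if candidate_score >= best_score:  # ties favor the smaller subset
            best_subset, best_score = list(current), candidate_score
    return best_subset, best_score
```

Separating the ranking from the scoring function means the same selection loop can be reused with any of the four regressors as the evaluator.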
2.4.3. Accuracy assessment
The estimation accuracy of SLI was evaluated with the validation dataset (Exp. 2) in terms of R², root mean square error (RMSE), and relative RMSE (RE) as follows:
... (11)
... (12)
... (13)
where SSE and SST are the error sum of squares and the total sum of squares, respectively; yi and ŷi are the measured and estimated sample values, respectively; ȳ is the mean of the measured sample values; and n is the total number of samples.
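Because the formulae themselves are omitted here, the sketch below uses the conventional definitions that match the symbols just described: R² = 1 − SSE/SST, RMSE as the root of the mean squared error, and RE as RMSE divided by the mean measured value, expressed as a percentage. Treat these as assumed standard forms; the function name is illustrative.

```python
from math import sqrt


def accuracy_metrics(measured, estimated):
    """Return R2, RMSE and relative RMSE (%) under the conventional definitions."""
    n = len(measured)
    mean_y = sum(measured) / n
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, estimated))
    sst = sum((y - mean_y) ** 2 for y in measured)
    rmse = sqrt(sse / n)
    return {
        "r2": 1 - sse / sst,
        "rmse": rmse,
        "re_percent": 100 * rmse / mean_y,
    }
```

Under this definition, RE is RMSE scaled by the mean of the measured values, which makes errors comparable across growth stages with different SLI magnitudes.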
3. Results
3.1. Dynamic changes of SLI
Fig. 2 illustrates the dynamic changes of SLI during critical growth stages in 2020 and 2021 for two nitrogen fertilizer rates and three wheat variety groups. The trend of SLI change was consistent across conditions, with an increase during leaf growth before spike emergence, followed by a "V"-shaped curve until spike senescence. Analysis of variance revealed significant variations in SLI among nitrogen fertilizers, varieties, and years (Table S3). Average yield in 2021 surpassed that of 2020, and SLI values were correspondingly slightly higher than in 2020. This was supported by the observation that SLI values increased with yield under the same nitrogen fertilizer treatment. In 2021, significant variations in SLI were observed among wheat varieties at all growth stages except the jointing stage (DAS 130). In 2020, SLI effectively distinguished high-yielding varieties at most growth stages, except for the flowering stage (DAS 165), where the difference between high-yielding and medium-yielding varieties was not statistically significant (Fig. 2C, D). However, when the combined effects of these factors were considered, SLI did not differ significantly (Table S3).
A fitted plot of SLI and LAI against GY at six primary growth stages shows that SLI was more closely related to GY (R² from 0.35 to 0.72) than LAI was (R² from 0.36 to 0.66) (Fig. 3), especially during the post-heading phase. Specifically, the relationship of GY with SLI was comparable to that with LAI during the pre-heading phase and strengthened during the post-heading phase, whereas the relationship with LAI gradually weakened.
3.2. Relationships between spectral indices and SLI at multiple stages
A comparison of the relationships of the published spectral indices and ETVI3 with SLI in the pre-heading and post-heading phases is presented in Fig. 4 and Table S4. For most SIs, R² values declined from the pre-heading to the post-heading stage, whereas ETVI3 remained relatively stable and had a better relationship with SLI, especially at the post-heading stage.
Further evaluation of the ETVI3-based SLI estimation model for the three wheat variety groups over the pre-heading and post-heading phases showed varied accuracies in SLI estimation (Table S5). Overall, performance on the validation datasets was only slightly degraded compared with the calibration datasets. Performance was highest for the HY variety during the pre-heading phase, with RE of 15.71% and 25.22% in the calibration and validation datasets, respectively, whereas performance for the MY and LY varieties still needs to be improved at both the pre-heading and post-heading stages.
3.3. Estimation of SLI based on XGB-FS-ML
The contribution of SIs to the SLI estimation model varied slightly at different phases (Fig. 5). The most important features during the pre-heading phase were EVI and Atmospherically Resistant Vegetation Index (ARVI) (Fig. 5A). Both SIs contain a blue band designed to correct for the effects of atmospheric scattering and soil background, while ETVI3 contributes less during this period. However, ETVI3 became the most important feature during the post-heading phase (Fig. 5B). The number of SIs developed for PWC changed from one (NWI3, Normalized Water Index-3) to two (NWI3; NWI2, Normalized Water Index-2) in the top ten important features, while feature importance scores were both increased.
When the features selected by the XGB-FS algorithm were used to predict SLI values, the four regression models produced relatively stable and reliable estimation accuracy for both the pre-heading and post-heading datasets (Fig. 6). Comparison of the XGB-FS-ML based models composed of the published SIs (Fig. 6E–H, M–P) with the ETVI3 based models (Table S5) showed that the estimation accuracy of SLI was not improved. After ETVI3 was included in the XGB-FS-ML based models, clear improvements in the estimation of SLI were observed during the post-heading phase, with a negligible effect during the pre-heading phase. ETVI3 thus had a positive effect on the XGB-FS-ML based models once wheat spikes had appeared. During the post-heading phase, LASSO showed a higher estimation accuracy of SLI (Fig. 6K, calibration RE = 26.93%) than MLR, SVR and KNR (Fig. 6I–J, L). When the XGB-FS-LASSO model was validated using the three variety groups, the HY and MY varieties showed the larger improvements, with RMSE reduced by 0.017 to 0.035 and RE by 9.58% to 13.08%, whereas only marginal changes in estimation accuracy were observed for the LY variety, with RMSE reduced by 0.007 and RE by 9.42% (Tables S5, S6).
4. Discussion
4.1. Uncertainty parameters and limitations of SLI
To better understand and interpret the construction of SLI in this study, two uncertain parameters of SLI should be clarified: the method of calculating SSA and the contribution of wheat spike photosynthesis (k) to GY. We compared the results of four calculation methods (Table S7) to determine a more appropriate geometric shape for calculating SSA. Previous studies tended to use a cylinder when calculating spike effective surface area for light interception [5,48], although the spike is closer to a fusiform shape in nature. Our study demonstrated that using an elliptical cylinder for calculating SSA gave a better correlation with GY, even though the improvement was negligible compared with the other calculation methods (Table S8). It can be concluded that the method of calculating SSA did not play a dominant role in the construction of population spike photosynthesis, especially when the spike geometric parameters were fixed. Moreover, k was eventually set to 0.28 in our study, which agrees well with previous reports that spike photosynthesis contributes 9.8% to 39% of wheat GY [5,49,50], although the measurement methods differed. One concern about this result is that it was determined by correlation analysis using limited datasets with limited diversity in varieties and growth stages. Future studies should emphasize field measurements of spike photosynthesis to validate and refine k, because it may show genotypic differences as well as temporal variations.
4.2. Advantages of SLI for wheat variety identification and yield prediction
The differences in SLI under different nitrogen fertilizers and varieties between years on a temporal scale illustrate that SLI has potential for identifying high-yielding wheat varieties and for crop management (Table S3). Previous studies [5,9,51] indicated that considering wheat spike photosynthesis may alleviate the degradation of yield prediction performance caused by leaf senescence in later growth periods. In our study, the R² value increased for SLI and GY but decreased for LAI and GY as the crop matured, which leads to a similar conclusion (Fig. 3). The dynamics of SLI can be divided into two phases. In the pre-heading phase, SLI increased in the same way as LAI, which can mainly be attributed to leaves being the primary photosynthetic organ in the canopy during this period. In the post-heading phase, senescent leaves and growing spikes often coexist, so that light interception and canopy chlorophyll content do not decline immediately [49,52], which probably accounts for the "V"-shaped pattern of SLI from DAS 150 to DAS 175.
4.3. Performance of the ETVI3 and XGB-FS-ML method on SLI estimation
ETVI3 was designed to capture variation in canopy structure and physiology for both leaves and spikes. A major aim of developing ETVI3 was therefore to account for, rather than eliminate, the influence of wheat spikes on canopy reflectance, and so to address the lack of spike-related indices. The results showed a stable relationship of ETVI3 with SLI throughout the pre-heading and post-heading phases, whereas the relationship of published SIs with SLI weakened once reproductive organs appeared (Fig. 4; Table S4), which supports the feasibility of developing ETVI3. For the HY variety, the ETVI3-based models exhibited a strong correlation with SLI. In contrast, lower accuracy was observed for the MY and LY varieties (Table S5). This could be attributed to factors such as canopy density and soil background, which may affect estimation accuracy at different stages [53]. Additionally, the influence of canopy structure, chlorophyll content, and PWC on ETVI3 in the later stages remains unclear. It is crucial to further investigate and differentiate the sensitive spectral bands associated with wheat spikes and leaves in the spectral reflectance. By elucidating the respective contributions of wheat spikes and leaves to the canopy spectrum, the performance of the SI-based models can be further improved.
An XGB-FS-ML method was established to improve the estimation accuracy of SLI by leveraging the strengths of various indices and their complementary information. A strength of the ML-based estimation methodology is that the XGB-FS procedure ensures the rationality and interpretability of the optimal feature subset selected from the SIs [29], which supports the transferability and stability of this methodology under other conditions [54]. Soil-sensitive SIs such as EVI and OSAVI were selected in the pre-heading stage, while PWC-sensitive SIs were highlighted in the post-heading stage (Fig. 5). Note that the inconsistency of the optimal features between periods supports our assumption that ETVI3 differs between growth stages. Generally, multiple features describing several agronomic traits could better reveal the complex processes associated with SLI, as demonstrated by our independent validation results for the four regression algorithms (Fig. 6). Similar strategies have been applied in crop biomass estimation [55] and plant disease detection [56].
4.4. Potential applications and future perspectives
GY is a product of cumulative photosynthesis over the entire growth cycle, especially in the grain filling stage. Indirect models based on "spectral features–physiological parameters–quality indicator" are typically used to improve the stability and interpretability of indicator estimation or prediction [31,57]. However, such studies usually focus on an early growth stage, ignoring the effects of reproductive organs on yield production and canopy spectral reflectance. This study proposed, for the first time, a yield-related phenotypic trait that couples wheat leaf and spike photosynthesis and developed a high-throughput method for its spectral estimation. Our results strengthen the relationship between spectral features and canopy photosynthesis parameters at later growth stages and thus have the potential to contribute to the accurate spectral estimation of GY.
Although this study has made progress in understanding the relationship between spectral features and GY during the late growth period, the current regression models did not always provide the best prediction performance for GY. To find the best-performing model, the integration of multiple data sources and improved regression algorithms should be tested. Field data were used to generate SLI, allowing fewer assumptions to be made, but their collection is time-consuming. The incorporation of variety and management inputs with simulation models is a promising avenue for future research. The concept of LR (or SR) is similar to that of the substance partitioning index, which can be accurately calibrated using process-based crop growth models such as CropGrow [58,59]. The growth stage used for yield prediction is also a critical factor. The adaptability of indirect models to different development stages in yield prediction is also worth exploring.
5. Conclusions
This study constructed a novel yield-related phenotypic trait based on photosynthetic-based yield formation theory and estimated its dynamics by integrating spectral indices and machine learning algorithms. The proposed trait, SLI, was strongly correlated with GY throughout critical growth stages, whereas LAI was more poorly correlated after spike emergence. SLI also differed among varieties and was sensitive to nitrogen fertilization. ETVI3 exhibited a higher sensitivity to variation in canopy structure and physiology than existing spectral indices for the retrieval of SLI. The XGB-FS-ML method improved the accuracy of SLI estimation models, among which XGB-FS-LASSO was validated with the highest accuracy (R² = 0.71, RMSE = 0.047, relative RMSE = 26.93%). Our results provide new insights into the role of spike photosynthesis in the spectral prediction of wheat yield. The method developed for estimating SLI is promising for accurate spectral yield prediction and may be extended to precision management of field crops and selection of superior wheat cultivars.
CRediT authorship contribution statement
Haiyu Tao: Conceptualization, Methodology, Investigation, Validation, Data curation, Formal analysis, Software, Visualization, Writing – original draft, Writing – review & editing, Funding acquisition. Ruiheng Zhou: Methodology, Investigation, Validation, Data curation, Formal analysis. Yining Tang: Investigation, Data curation, Writing – original draft. Wanyu Li: Investigation, Formal analysis, Writing – original draft. Xia Yao: Conceptualization, Writing – review & editing. Tao Cheng: Conceptualization, Writing – review & editing. Yan Zhu: Conceptualization, Writing – review & editing. Weixing Cao: Conceptualization, Writing – review & editing. Yongchao Tian: Conceptualization, Resources, Writing – original draft, Writing – review & editing, Supervision, Project administration, Funding acquisition.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (32371990, 31971784), the Earmarked Fund for Jiangsu Agricultural Industry Technology System (JATS (2022) 168, JATS (2022) 468), the Jiangsu Provincial Cooperative Promotion Plan of Major Agricultural Technologies (2021-ZYXT-01-1), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX23_0783).
ARTICLE INFO
Article history:
Received 5 September 2023
Revised 5 December 2023
Accepted 22 April 2024
Available online 13 May 2024
* Corresponding author.
E-mail address: [email protected] (Y. Tian).
References
[1] X.G. Zhu, S.P. Long, D.R. Ort, Improving photosynthetic efficiency for greater yield, Annu. Rev. Plant Biol. 61 (2010) 235–261.
[2] A. Krieger-Liszkay, K. Krupinska, G. Shimakawa, The impact of photosynthesis on initiation of leaf senescence, Physiol. Plant. 166 (2019) 148–164.
[3] J. Kashiwagi, Y. Yoshioka, S. Nakayama, Y. Inoue, P. An, T. Nakashima, Potential importance of the ear as a post-anthesis carbon source to improve drought tolerance in spring wheat (Triticum aestivum L.), J. Agron. Crop Sci. 207 (2021) 936–945.
[4] R. Sanchez-Bragado, M.D. Serret, J.L. Araus, The nitrogen contribution of different plant parts to wheat grains: exploring genotype, water, and nitrogen effects, Front. Plant Sci. 7 (2017) 1986.
[5] M. Zhang, Y. Gao, Y. Zhang, T. Fischer, Z. Zhao, X. Zhou, Z. Wang, E. Wang, The contribution of spike photosynthesis to wheat yield needs to be considered in process-based crop models, Field Crops Res. 257 (2020) 107931.
[6] M.L. Maydup, M. Antonietta, J.J. Guiamet, C. Graciano, J.R. López, E.A. Tambussi, The contribution of ear photosynthesis to grain filling in bread wheat (Triticum aestivum L.), Field Crops Res. 119 (2010) 48–58.
[7] G. Molero, M.P. Reynolds, Spike photosynthesis measured at high throughput indicates genetic variation independent of flag leaf photosynthesis, Field Crops Res. 255 (2020) 107866.
[8] R. Sanchez-Bragado, R. Vicente, G. Molero, M.D. Serret, M.L. Maydup, J.L. Araus, New avenues for increasing yield and stability in C3 cereals: Exploring ear photosynthesis, Curr. Opin. Plant Biol. 56 (2020) 223–234.
[9] E.H. Murchie, S. Kefauver, J.L. Araus, O. Muller, U. Rascher, P.J. Flood, T. Lawson, Measuring the dynamic photosynthome, Ann. Bot. 122 (2018) 207–220.
[10] E.A. Tambussi, M.L. Maydup, C.A. Carrión, J.J. Guiamet, J.L. Araus, Ear photosynthesis in C3 cereals and its contribution to grain yield: methodologies, controversies, and perspectives, J. Exp. Bot. 72 (2021) 3956–3970.
[11] Y. Fu, G. Yang, R. Pu, Z. Li, H. Li, X. Xu, X. Song, X. Yang, C. Zhao, An overview of crop nitrogen status assessment using hyperspectral remote sensing: current status and perspectives, Eur. J. Agron. 124 (2021) 126241.
[12] H. Tao, S. Xu, Y. Tian, Z. Li, Y. Ge, J. Zhang, Y. Wang, G. Zhou, X. Deng, Z. Zhang, Y. Ding, D. Jiang, Q. Guo, S. Jin, Proximal and remote sensing in plant phenomics: 20 years of progress, challenges, and perspectives, Plant Commun. 3 (2022) 100344.
[13] N. Xing, W. Huang, Q. Xie, Y. Shi, H. Ye, Y. Dong, M. Wu, G. Sun, Q. Jiao, A transformed triangular vegetation index for estimating winter wheat leaf area index, Remote Sens. 12 (2019) 16.
[14] B. Cui, Q. Zhao, W. Huang, X. Song, H. Ye, X. Zhou, A new integrated vegetation index for the estimation of winter wheat leaf chlorophyll content, Remote Sens. 11 (2019) 974.
[15] Z. Li, Y. Zhao, J. Taylor, R. Gaulton, X. Jin, X. Song, Z. Li, Y. Meng, P. Chen, H. Feng, C. Wang, W. Guo, X. Xu, L. Chen, G. Yang, Comparison and transferability of thermal, temporal and phenological-based in-season predictions of aboveground biomass in wheat crops from proximal crop reflectance data, Remote Sens. Environ. 273 (2022) 112967.
[16] A.R. Huete, A soil-adjusted vegetation index (SAVI), Remote Sens. Environ. 25 (1988) 295–309.
[17] A.R. Huete, H.Q. Liu, K. Batchily, W. van Leeuwen, A comparison of vegetation indices over a global set of TM images for EOS-MODIS, Remote Sens. Environ. 59 (1997) 440–451.
[18] G. Rondeaux, M. Steven, F. Baret, Optimization of soil-adjusted vegetation indices, Remote Sens. Environ. 55 (1996) 95–107.
[19] W. Li, J. Jiang, M. Weiss, S. Madec, F. Tison, B. Philippe, A. Comar, F. Baret, Impact of the reproductive organs on crop BRDF as observed from a UAV, Remote Sens. Environ. 259 (2021) 112433.
[20] M. Gutierrez, M.P. Reynolds, A.R. Klatt, Effect of leaf and spike morphological traits on the relationship between spectral reflectance indices and yield in wheat, Int. J. Remote Sens. 36 (2015) 701–718.
[21] M. Weiss, D. Troufleau, F. Baret, H. Chauki, L. Prévot, A. Olioso, N. Bruguier, N. Brisson, Coupling canopy functioning and radiative transfer models for remote sensing data assimilation, Agric. For. Meteorol. 108 (2001) 113–128.
[22] H. Li, C. Zhao, G. Yang, H. Feng, Variations in crop variables within wheat canopies and responses of canopy spectral characteristics and derived vegetation indices to different vertical leaf layers and spikes, Remote Sens. Environ. 169 (2015) 358–374.
[23] J. He, N. Zhang, X. Su, J. Lu, X. Yao, T. Cheng, Y. Zhu, W. Cao, Y. Tian, Estimating leaf area index with a new vegetation index considering the influence of rice panicles, Remote Sens. 11 (2019) 1809.
[24] M.A.J. Parry, M. Reynolds, M.E. Salvucci, C. Raines, P.J. Andralojc, X.G. Zhu, G.D. Price, A.G. Condon, R.T. Furbank, Raising yield potential of wheat. II. Increasing photosynthetic capacity and efficiency, J. Exp. Bot. 62 (2011) 453–467.
[25] X. Jing, Q. Zou, J. Yan, Y. Dong, B. Li, Remote sensing monitoring of winter wheat stripe rust based on mRMR-XGBoost algorithm, Remote Sens. 14 (2022) 756.
[26] L. Geng, T. Che, M. Ma, J. Tan, H. Wang, Corn biomass estimation by integrating remote sensing and long-term observation data based on machine learning techniques, Remote Sens. 13 (2021) 2352.
[27] S. Xi, J. Wang, L. Ding, J. Lu, J. Zhang, X. Yao, T. Cheng, Y. Zhu, W. Cao, Y. Tian, Grain yield prediction using multi-temporal UAV-based multispectral vegetation indices and endmember abundance in rice, Field Crops Res. 299 (2023) 108992.
[28] K. Bhardwaj, S. Patra, An unsupervised technique for optimal feature selection in attribute profiles for spectral-spatial classification of hyperspectral images, ISPRS J. Photogramm. Remote Sens. 138 (2018) 139–150.
[29] L. Tian, B. Xue, Z. Wang, D. Li, X. Yao, Q. Cao, Y. Zhu, W. Cao, T. Cheng, Spectroscopic detection of rice leaf blast infection from asymptomatic to mild stages with integrated machine learning and feature selection, Remote Sens. Environ. 257 (2021) 112350.
[30] Z. Xu, Q. Zhang, S. Xiang, Y. Li, X. Huang, Y. Zhang, X. Zhou, Z. Li, X. Yao, Q. Li, X. Guo, Monitoring the severity of Pantana phyllostachysae Chao infestation in Moso bamboo forests based on UAV multi-spectral remote sensing feature selection, Forests 13 (2022) 418.
[31] W. Li, W. Wu, M. Yu, H. Tao, X. Yao, T. Cheng, Y. Zhu, W. Cao, Y. Tian, Monitoring rice grain protein accumulation dynamics based on UAV multispectral data, Field Crops Res. 294 (2023) 108858.
[32] T. Chen, C. Guestrin, XGBoost: a scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Association for Computing Machinery, San Francisco, CA, USA, 2016, pp. 785–794.
[33] E. Lasso, D.C. Corrales, J. Avelino, E. de Melo Virginio Filho, J.C. Corrales, Discovering weather periods and crop properties favorable for coffee rust incidence from feature selection approaches, Comput. Electron. Agric. 176 (2020) 105640.
[34] A. Zhang, Z. Dong, X. Kang, Feature selection algorithms of airborne LiDAR combined with hyperspectral images based on XGBoost, Chin. J. Lasers 46 (2019) 150–158.
[35] G. Fang, L. Fang, L. Yang, D. Wu, Comparison of variable selection methods among dominant tree species in different regions on forest stock volume estimation, Forests 13 (2022) 787.
[36] H.R. Seireg, Y.M.K. Omar, F.E.A. El-Samie, A.S. El-Fishawy, A. Elmahalawy, Ensemble machine learning techniques using computer simulation data for wild blueberry yield prediction, IEEE Access 10 (2022) 64671–64687.
[37] R. Tanabe, T. Matsui, T.S.T. Tanaka, Winter wheat yield prediction using convolutional neural networks and UAV-based multispectral imagery, Field Crops Res. 291 (2023) 108786.
[38] N.H. Broge, E. Leblanc, Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density, Remote Sens. Environ. 76 (2001) 156–172.
[39] D. Haboudane, J.R. Miller, E. Pattey, P.J. Zarco-Tejada, I.B. Strachan, Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: modeling and validation in the context of precision agriculture, Remote Sens. Environ. 90 (2004) 337–352.
[40] W. Kong, W. Huang, L. Ma, L. Tang, C. Li, X. Zhou, R. Casa, Estimating vertical distribution of leaf water content within wheat canopies after head emergence, Remote Sens. 13 (2021) 4125.
[41] L. Liu, J. Wang, W. Huang, C. Zhao, B. Zhang, Q. Tong, Estimating winter wheat plant water content using red edge parameters, Int. J. Remote Sens. 25 (2004) 3331–3342.
[42] C. Leys, C. Ley, O. Klein, P. Bernard, L. Licata, Detecting outliers: do not use standard deviation around the mean, use absolute deviation around the median, J. Exp. Soc. Psychol. 49 (2013) 764–766.
[43] S.J. Reeves, Z. Zhao, Sequential algorithms for observation selection, IEEE Trans. Signal Process. 47 (1999) 123–132.
[44] K. Xu, Y. Su, J. Liu, T. Hu, S. Jin, Q. Ma, Q. Zhai, R. Wang, J. Zhang, Y. Li, H. Liu, Q. Guo, Estimation of degraded grassland aboveground biomass using machine learning methods from terrestrial laser scanning data, Ecol. Indic. 108 (2020) 105747.
[45] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (1995) 273–297.
[46] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. B. 58 (1996) 267–288.
[47] G. Chirici, M. Mura, D. McInerney, N. Py, E.O. Tomppo, L.T. Waser, D. Travaglini, R.E. McRoberts, A meta-analysis and review of the literature on the k-Nearest Neighbors technique for forestry applications that use remotely sensed data, Remote Sens. Environ. 176 (2016) 282–294.
[48] O. Gaju, M.P. Reynolds, D.L. Sparkes, M.J. Foulkes, Relationships between large-spike phenotype, grain number, and yield potential in spring wheat, Crop Sci. 49 (2009) 961–973.
[49] O. Merah, P. Monneveux, Contribution of different organs to grain filling in durum wheat under Mediterranean conditions I. Contribution of post-anthesis photosynthesis and remobilization, J. Agron. Crop Sci. 201 (2015) 344–352.
[50] M.L. Maydup, M. Antonietta, C. Graciano, J.J. Guiamet, E.A. Tambussi, The contribution of the awns of bread wheat (Triticum aestivum L.) to grain filling: responses to water deficit and the effects of awns on ear temperature and hydraulic conductance, Field Crops Res. 167 (2014) 102–111.
[51] J. Zhu, Y. Yin, J. Lu, T.A. Warner, X. Xu, M. Lyu, X. Wang, C. Guo, T. Cheng, Y. Zhu, W. Cao, X. Yao, Y. Zhang, L. Liu, The relationship between wheat yield and sun-induced chlorophyll fluorescence from continuous measurements over the growing season, Remote Sens. Environ. 298 (2023) 113791.
[52] C. Celestina, M.T. Bloomfield, K. Stefanova, J.R. Hunt, Use of spike moisture content to define physiological maturity and quantify progress through grain development in wheat and barley, Crop Pasture Sci. 72 (2021) 95–104.
[53] W. Li, D. Li, S. Liu, F. Baret, Z. Ma, C. He, T.A. Warner, C. Guo, T. Cheng, Y. Zhu, W. Cao, X. Yao, RSARE: A physically-based vegetation index for estimating wheat green LAI to mitigate the impact of leaf chlorophyll content and residue-soil background, ISPRS J. Photogramm. Remote Sens. 200 (2023) 138–152.
[54] P.J. Zarco-Tejada, C. Camino, P.S.A. Beck, R. Calderon, A. Hornero, R. Hernández-Clemente, T. Kattenborn, M. Montes-Borrego, L. Susca, M. Morelli, V. Gonzalez-Dugo, P.R.J. North, B.B. Landa, D. Boscia, M. Saponari, J.A. Navas-Cortes, Previsual symptoms of Xylella fastidiosa infection revealed in spectral plant-trait alterations, Nat. Plants 4 (2018) 432–439.
[55] Y. Han, R. Tang, Z. Liao, B. Zhai, J. Fan, A novel hybrid GOA-XGB model for estimating wheat aboveground biomass using UAV-based multispectral vegetation indices, Remote Sens. 14 (2022) 3506.
[56] T. Poblete, C. Camino, P.S.A. Beck, A. Hornero, T. Kattenborn, M. Saponari, D. Boscia, J.A. Navas-Cortes, P.J. Zarco-Tejada, Detection of Xylella fastidiosa infection symptoms with airborne multispectral and thermal imagery: assessing bandset reduction performance from hyperspectral analysis, ISPRS J. Photogramm. Remote Sens. 162 (2020) 27–40.
[57] Z. Wang, J. Chen, J. Zhang, Y. Fan, Y. Cheng, B. Wang, X. Wu, X. Tan, T. Tan, S. Li, M.A. Raza, X. Wang, T. Yong, W. Liu, J. Liu, J. Du, Y. Wu, W. Yang, F. Yang, Predicting grain yield and protein content using canopy reflectance in maize grown under different water and nitrogen levels, Field Crops Res. 260 (2021) 107988.
[58] Y. Zhu, L. Tang, L. Liu, B. Liu, X. Zhang, X. Qiu, Y. Tian, W. Cao, Research progress on the crop growth model CropGrow, Sci. Agric. Sin. 53 (2020) 3235–3256.
[59] T. Liu, W. Cao, W. Luo, S. Wang, W. Zou, Q. Zhou, W. Guo, Quantitative simulation on dry matter partitioning dynamic in wheat organs, J. Triticeae Crops 21 (2001) 25–31.
Appendix A. Supplementary data
Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2024.04.003.
© 2024. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.