1. Introduction
Forest canopy height is the basis for estimating forest stock and biomass, so accurate canopy height information is important for assessing forest growth and carbon balance. Currently, forest canopy height is typically estimated at large, regional scales using remote sensing methods, primarily optical remote sensing, LiDAR (light detection and ranging), and synthetic aperture radar (SAR) [1]. Optical remote sensing is less sensitive to forest vertical structure, saturates easily, and is affected by weather [2]. LiDAR is the most accurate remote sensing tool for forest canopy height measurement, but its data acquisition cost is high, which makes measurements across large areas difficult [3,4,5]. Polarimetric interferometric SAR (PolInSAR) is an active remote sensing technique that uses the penetrating and scattering characteristics of microwaves to obtain height information of ground targets, and it is therefore widely used for forest canopy height estimation. With the development of PolInSAR technology and the acquisition of a large amount of spaceborne and airborne SAR data, PolInSAR-based forest canopy height inversion has become an important topic in quantitative remote sensing of forests [6,7,8].
Current PolInSAR-based forest canopy height inversion methods can be divided into mechanism modeling and machine learning methods. The mechanism modeling methods include ground-phase differencing, two-layer random volume over ground (RVoG) scattering modeling, the derived coherence amplitude method, and the combined phase-coherence amplitude inversion method. In the ground-phase differencing method, the scattering phase centers of the ground and canopy are calculated from the PolInSAR scattering mechanism and then differenced. Because the location of the effective scattering center depends on forest structure and microwave frequency, this method may underestimate the forest canopy height [9,10]. The most classical and widely used methods are the RVoG coherence scattering model and the RVoG three-stage method, which have been applied successfully to InSAR/PolInSAR data at different frequencies, including C-, L-, P-, and even X-band [11,12,13], and to different forest types [14,15]. In the RVoG model, the ground-to-volume magnitude ratio is usually assumed to be 0. The ground phase is solved by fitting a coherence line and intersecting it with the unit circle, and a look-up table for forest canopy height inversion is constructed from a reasonable range of extinction coefficients and forest canopy heights. The coherence magnitude method and the combined phase-coherence magnitude inversion method are simplifications of the RVoG model under special assumptions [16], and the coherence magnitude method additionally assumes that the forest structure is homogeneous. However, these model assumptions do not necessarily apply to actual forest conditions.
Therefore, machine learning methods have been proposed to estimate forest canopy height. These methods combine a small amount of ground-based forest canopy height information with the polarimetric interferometric variables of PolInSAR to train inversion models that predict forest canopy height at large regional scales. Most machine learning applications use coherence, geometric parameters, backscatter features, and even coherence shape parameters as independent variables. In the studies of Zahriban Hesari and Persson, forest canopy height was estimated by constructing a linear regression between InSAR coherence and forest canopy height, which is the simplest application [17,18]. For fully polarimetric PolInSAR data, however, more information can be mined. In Brigot’s study, parameters describing the distribution of coherence points under the RVoG model were used as independent variables, and a neural network model was used to estimate forest canopy height, which gave a more satisfactory result [19].
However, these studies overlook a key issue: the variables they use do not directly capture the sensitivity of SAR to forest vertical structure. A noteworthy advantage of PolInSAR is that it can acquire vertical structure information of the forest (such as phase center height, penetration depth, and geometric parameters) [17,18,19]. There therefore remains a need to improve the mining and optimization of machine learning variables, a gap that mechanism models can potentially fill. One promising option is to use vertical structure information derived from mechanism modeling as independent variables for machine learning methods when constructing the inversion model.
In summary, although the current PolInSAR mechanism models for forest canopy height inversion (e.g., the RVoG model) are physical process models, realistic conditions for model parameterization are often difficult to establish in practical applications, resulting in large forest canopy height estimation errors. As an alternative, machine learning approaches offer model simplicity, but they provide limited capabilities for interpretation and generalization. To leverage the benefits of both approaches, we used a fusion model to estimate forest canopy height, with the RVoG three-stage method and the RVoG phase-coherence magnitude method serving as reference models for comparison. Specifically, the RVoG three-stage method and the penetration depth model were used to calculate the phase center height, coherence separation, and microwave penetration depth as vertical structure parameters representing forest canopy height. These were combined with the geometric parameters of the observation platform and the baseline selection parameters as independent variables of the machine learning methods to invert forest canopy height.
At present, TanDEM-X has been proven to be an effective tool for forest canopy height estimation. ALOS-2 and SAOCOM satellite data are also increasingly used in forest canopy height estimation studies to support the quantitative inversion of forest parameters. The purpose of this study is to develop an accurate and efficient method for forest canopy height inversion to serve global forest canopy height estimation and forest carbon measurement. GEDI and ICESat-2 have so far acquired a large amount of forest canopy height data, and our proposed method will become more practical with the realization of the TanDEM-L and BIOMASS satellites and the NISAR program.
2. Materials and Methods
2.1. Study Area and Data
The test area is located in Gabon, on the west coast of Africa. In 2016, the National Aeronautics and Space Administration (NASA) collaborated with the European Space Agency (ESA) and the Gabonese Space Agency to conduct the AfriSAR project. The NASA Unmanned Aerial Vehicle Synthetic Aperture Radar (UAVSAR) acquired L-band multi-baseline fully polarimetric PolInSAR data, and the airborne LiDAR sensor acquired full-waveform LiDAR data, for calculating forest structure parameters and topography. The UAVSAR dataset is publicly available as polarimetrically calibrated, finely coregistered SLC stacks [20], which contain data for each polarization (HH, HV, VH, and VV). In this study, two locations (Lope and Pongara) were selected as test areas (Figure 1). The Lope test area is an inland tropical forest with a forest canopy height range of 2–84 m, and the Pongara test area is a mangrove forest with a forest canopy height range of 2–65 m. Eight tracks were available in the Lope test area and five in Pongara (Table 1). The forest canopy height predicted from PolInSAR was validated using the relative height variable RH100 of LVIS LiDAR [21], with a pixel resolution of 25 m. Multi-look processing (8 × 6 in range and azimuth) was used to reduce the effect of noise, and two complex coherence variables were calculated using PD (phase diversity) coherence optimization (γhigh and γlow, dominated by the canopy and the ground surface, respectively) [22].
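The multi-look coherence estimate mentioned above can be illustrated with a short sketch. It is a minimal example rather than the processing chain used in this study: it assumes the two coregistered SLC images are NumPy arrays with azimuth along rows and range along columns, uses plain block averaging as the multi-look operator, and does not perform the PD coherence optimization. The function names are hypothetical.

```python
import numpy as np

def multilook(x, looks_az, looks_rg):
    """Average non-overlapping (looks_az x looks_rg) blocks of a 2-D array."""
    rows = (x.shape[0] // looks_az) * looks_az
    cols = (x.shape[1] // looks_rg) * looks_rg
    blocks = x[:rows, :cols].reshape(rows // looks_az, looks_az,
                                     cols // looks_rg, looks_rg)
    return blocks.mean(axis=(1, 3))

def complex_coherence(s1, s2, looks_az=6, looks_rg=8):
    """Sample complex coherence <s1 s2*> / sqrt(<|s1|^2> <|s2|^2>).

    Assumption: rows are azimuth and columns are range, so the default
    looks correspond to the 8 (range) x 6 (azimuth) window in the text.
    """
    cross = multilook(s1 * np.conj(s2), looks_az, looks_rg)
    power1 = multilook(np.abs(s1) ** 2, looks_az, looks_rg)
    power2 = multilook(np.abs(s2) ** 2, looks_az, looks_rg)
    return cross / np.sqrt(power1 * power2)
```

With this sign convention, a second image that differs from the first only by a constant phase offset gives a coherence magnitude of 1 and a coherence phase equal to minus that offset.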
2.2. Methods
The overall workflow of this study is shown in Figure 2.
2.2.1. Mechanism Model
(1). RVoG model
The RVoG scattering model is a simple and effective forest canopy height inversion model that is widely used and validated. It describes the scene as a forest volume scattering layer over a ground layer that cannot be penetrated (Figure 3). The volume scattering layer is treated as an isotropic, homogeneous medium of thickness hv, and the scattering and absorption losses of the electromagnetic waves are described by a polarization-independent average extinction coefficient σ [11,12,13].
The interferometric complex coherence of the master and slave images after registration can be expressed as
(1)
where S1 and S2 denote the master image and slave image, respectively.
The interferometric complex coherence in the different polarization channels of the RVoG model can be expressed as follows:
(2)
where m(ω) is the effective ground-to-volume amplitude ratio and φ0 is the ground phase. A value of m(ω) = ∞ indicates ground scattering, and m(ω) = 0 indicates volume scattering. “Pure” volume coherence is represented by γv, which can be expressed as in Equation (3).
(3)
where σ is the average extinction coefficient, hv is the forest height, kz is the effective vertical wave number, R is the slant range, B⊥ is the perpendicular baseline length, and n depends on the acquisition mode of the radar image [23].
This study used the RVoG three-stage function within the Kapok open-source package to invert the forest height [12,24]. The first stage fits a coherence line through the two coherence points (γ1 and γ2) obtained from PD coherence optimization. The two intersections of this line with the unit circle are the potential ground coherence points (γφ1 and γφ2) [22], as shown in Figure 4.
(4)
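The geometric part of this first stage can be sketched as follows. This is an illustrative example, not the Kapok implementation: it assumes the coherence line is defined by exactly two complex coherence values lying inside the unit circle, and the function name is hypothetical.

```python
import numpy as np

def unit_circle_intersections(gamma1, gamma2):
    """Intersect the line through gamma1 and gamma2 with the unit circle.

    Points on the line are gamma1 + t * (gamma2 - gamma1). Requiring
    |gamma1 + t * d|^2 = 1 gives the quadratic a t^2 + b t + c = 0.
    """
    d = gamma2 - gamma1
    a = np.abs(d) ** 2
    b = 2.0 * np.real(gamma1 * np.conj(d))
    c = np.abs(gamma1) ** 2 - 1.0          # <= 0 when gamma1 is inside the circle
    root = np.sqrt(b ** 2 - 4.0 * a * c)   # real because c <= 0
    t_minus = (-b - root) / (2.0 * a)
    t_plus = (-b + root) / (2.0 * a)
    return gamma1 + t_minus * d, gamma1 + t_plus * d
```

Both returned points have unit magnitude, and their phases are the two candidate ground phases between which the second stage has to choose.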
The second stage of the three-stage function is to solve the ground phase φ0 from the two intersections. In this study, we used the method proposed by Denbina and Simard [24] to determine the ground phase, which is more stable.
(5)
(6)
where γv1 and γv2 denote the volume coherence corresponding to the ground phase solution in the two cases, respectively, and Sep(1) denotes taking the first value of Sep.
The third stage outputs the forest height and extinction coefficient. According to the relationship between γv and (hv, σ) in Equation (3), a two-dimensional look-up table (LUT) is created from a set of reasonable hv and σ values. The LUT is searched for the modeled γv closest to the observed γh, and the pair (hv, σ) fulfilling Equation (7) is taken as the output.
(7)
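A minimal sketch of the look-up-table search is given below. It assumes the standard closed-form volume coherence of a uniform layer with exponential extinction, a zero ground-to-volume ratio for the observed high coherence, a single scalar kz and incidence angle, and extinction expressed in Np/m. The grids and function names are hypothetical choices for illustration and are not the settings used in this study.

```python
import numpy as np

def volume_coherence(hv, sigma, kz, theta):
    """Volume coherence of a uniform layer of height hv.

    Ratio of the integrals of exp(p z) exp(i kz z) and exp(p z) over [0, hv],
    with p = 2 sigma / cos(theta). Requires sigma > 0 and hv > 0.
    """
    p = 2.0 * sigma / np.cos(theta)
    p1 = p + 1j * kz
    return (p / p1) * (np.exp(p1 * hv) - 1.0) / (np.exp(p * hv) - 1.0)

def invert_height(gamma_high, phi0, kz, theta, hv_grid, sigma_grid):
    """Return the (hv, sigma) pair whose modeled coherence is closest to the data."""
    hv, sigma = np.meshgrid(hv_grid, sigma_grid, indexing="ij")
    lut = volume_coherence(hv, sigma, kz, theta)
    observed = gamma_high * np.exp(-1j * phi0)   # remove the ground phase
    i, j = np.unravel_index(np.argmin(np.abs(lut - observed)), lut.shape)
    return hv_grid[i], sigma_grid[j]

# Hypothetical search ranges: heights of 1-60 m, extinction of 0.001-0.2 Np/m.
hv_grid = np.arange(1.0, 60.5, 0.5)
sigma_grid = np.arange(1, 201) * 1e-3
```

The range and step size of both grids bound the attainable heights and the resolution of the result, which is why the construction of the LUT can itself influence the inversion.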
(2). Phase-coherence amplitude combined inversion method
Both the coherence amplitude and the phase, taken alone, are easily affected by the extinction coefficient and vertical structure, potentially leading to inaccurate forest canopy height estimation. To address this, Cloude proposed combining the DEM differencing method with the coherence amplitude method, using the coherence amplitude information to compensate for the deficiency of the differencing method and thereby improve the estimation of forest canopy height [16]. The model contains two parts. The first part is the forest canopy height obtained from the interferometric phase difference. Two polarization coherence values, one close to the canopy and one close to the ground surface, are usually chosen to calculate the height difference. Because the ground phase center estimated in this way lies above the ground surface, the approach yields low inversion values. The RVoG three-stage method used herein to estimate the ground phase improves the accuracy of the differential height calculation. The second part is the coherence amplitude method. As the phase center separation between the polarization channels increases, the volume scattering height decreases, and the structure function is compressed toward the top of the canopy. Because the volume scattering decorrelation also decreases, the sinc function can be used to compensate for the height missed by the phase difference; however, the sinc model is itself affected by decorrelation, so a coefficient ε is used to compensate. Typically, ε is taken as 0.4 [24], and the specific expression of the model is as follows:
(8)
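The combined estimate can be sketched as follows, assuming the commonly cited form hv = [arg(γv) − φ0]/kz + ε · 2 sinc⁻¹(|γv|)/kz with the unnormalized sinc, sin(x)/x. The inverse sinc is obtained here by bisection on [0, π], which is an illustrative choice. The function names are hypothetical, and the code handles scalar inputs only.

```python
import numpy as np

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with sinc(0) = 1 (scalar input)."""
    return 1.0 if x == 0.0 else np.sin(x) / x

def inverse_sinc(value, iterations=60):
    """Invert sinc on [0, pi], where it falls monotonically from 1 to 0."""
    lo, hi = 0.0, np.pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sinc(mid) > value:
            lo = mid   # sinc is still too large: the solution lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

def combined_height(gamma_vol, phi0, kz, eps=0.4):
    """Phase-difference height plus eps-weighted coherence-amplitude height."""
    phase_height = np.angle(gamma_vol * np.exp(-1j * phi0)) / kz
    amplitude_height = 2.0 * inverse_sinc(np.abs(gamma_vol)) / kz
    return phase_height + eps * amplitude_height
```

For a uniform volume with zero extinction, the phase term alone returns half the true height and the amplitude term returns the full height, so ε = 0.5 would recover the height exactly in that idealized case; ε = 0.4 is the value quoted in the text.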
(3). Baseline selection method
According to the RVoG model, the complex coherence points are distributed in an elliptical region within the unit circle (Figure 5). The purpose of baseline selection is to select, from multiple options, the baseline combination that best fits the assumptions of the RVoG model. Our previous study compared the effects of different baseline selection methods on the forest canopy height inversion results of the RVoG model and found that baseline selection by the PROD method, which is based on the product of average coherence magnitude and separation, gave more satisfactory results [25,26,27]; this method was used herein. The purpose of coherence optimization is to separate the different types of scattering phases effectively, so that the “pure” volume scattering complex coherence and surface complex coherence are obtained when the coherence separation corresponding to the baseline reaches its maximum. As this is more consistent with the RVoG model assumptions, the baseline combination with the maximum PROD is selected to invert the canopy height [25,26,27].
(9)
Here, γh and γl correspond to the two coherence points close to the canopy and the ground surface, respectively.
Figure 5. Baseline selection schematic. S1, S2, S3, … denote different orbits.
[Figure omitted. See PDF]
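The PROD criterion can be sketched in a few lines. The example assumes the optimized high and low coherences of all candidate baselines are stacked along the first array axis and defines PROD as the mean of the two coherence magnitudes multiplied by their separation in the complex plane, following the verbal description above. The function name is hypothetical.

```python
import numpy as np

def select_baseline_prod(gamma_high, gamma_low):
    """Pick, per pixel, the baseline maximizing mean magnitude x separation.

    gamma_high, gamma_low: complex arrays of shape (n_baselines, ...).
    Returns the index of the selected baseline and the PROD values.
    """
    mean_magnitude = 0.5 * (np.abs(gamma_high) + np.abs(gamma_low))
    separation = np.abs(gamma_high - gamma_low)
    prod = mean_magnitude * separation
    return np.argmax(prod, axis=0), prod
```

The returned index array can then be used to pick, pixel by pixel, the coherences and kz values passed on to the height inversion.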
(4). Penetration depth model
The L-band SAR signal penetrates the forest canopy to a certain extent, so the interferometric phase center lies below the top of the canopy (Figure 6). This effect was not considered in previous studies.
To correct for this, Dall [28] suggested that the phase of the normalized interferometric coherence, ∠γ, is directly related to the height bias Bh:
(10)
The coherence in an infinitely deep volume is
(11)
(12)
In this study, the refraction n in the volume is assumed to be negligible. It then follows from Equation (12) that d2/HoAVol is related to the coherence amplitude, and the coherence phase can be derived from it because the coherence amplitude and the coherence phase are uniquely related [28]. Although this is the penetration depth in an infinitely deep volume, the volume depth can be considered infinite when it exceeds the penetration depth by a factor of two to five. In this paper, the penetration depth is used only as a variable, so we do not consider whether the condition of an infinitely deep volume holds. The penetration depth is calculated as follows [28,29]:
(13)
(14)
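The relationship between coherence amplitude, penetration depth, and height bias can be sketched as below. The sketch assumes the infinite-volume coherence takes the form γ = 1/(1 + i·2π·d2/HoA), which is our reading of the uniform-volume model in [28] and may differ in sign or notation from Equations (11)–(14). Here d2 denotes the two-way penetration depth and HoA the height of ambiguity in the volume, and the function names are hypothetical.

```python
import numpy as np

def penetration_depth(coherence_magnitude, hoa):
    """Two-way penetration depth d2 from |gamma| = 1 / sqrt(1 + (2 pi d2 / hoa)^2).

    Assumption: infinitely deep uniform volume, refraction neglected.
    """
    ratio = np.sqrt(1.0 / coherence_magnitude ** 2 - 1.0)
    return hoa * ratio / (2.0 * np.pi)

def height_bias(coherence_magnitude, hoa):
    """Magnitude of the phase-center offset implied by the same model."""
    d2 = penetration_depth(coherence_magnitude, hoa)
    return hoa * np.arctan(2.0 * np.pi * d2 / hoa) / (2.0 * np.pi)
```

Under this assumed form, the bias approaches d2 when d2 is much smaller than HoA and saturates at a quarter of HoA for very deep penetration.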
2.2.2. Machine Learning Methods
(1). Independent variable extraction
(a). Vertical height parameter
A major advantage of PolInSAR is its ability to acquire forest vertical structure information. In this study, the first and second stages of the RVoG three-stage method were used to calculate the ground phase. The coherence separation and phase center height were calculated with reference to the ground phase (Figure 7a), and the volume coherence penetration depth was calculated using the penetration depth model (Figure 7b, Table 2).
(b). Baseline selection parameters
The baseline selection parameters (Table 3), for example the overall coherence magnitude and the coherence separation, describe the shape of the coherence region and therefore reflect forest structure information to some extent.
(c). Geometric parameters
The imaging geometry of InSAR/PolInSAR altimetry is shown in Figure 8 and Table 4.
(2). Regression Model Development
(a). Partial least squares regression model
Partial least squares (PLS) regression was originally proposed by Wold and Albano et al. [30]. It is typically applied to regression modeling between multiple dependent variables and multiple independent variables, and it combines features of principal component analysis and canonical correlation analysis. The method has the following advantages: (1) it avoids the problem of multicollinearity between variables; (2) it can produce satisfactory results when the sample size is small or smaller than the number of variables; and (3) it can distinguish systematic information from noise. The principle of PLS regression modeling is as follows. Given independent variables (X1, X2, …, Xa) and a single dependent variable Y with a total sample size of n, the resulting matrix of independent and dependent variables is
X = [x1, x2, …, xa]n×a (15)
The first principal component is extracted from Equation (15) and regressed on the dependent variable, and the algorithm terminates if the results satisfy the expected requirements. Otherwise, the information explained by the first principal component is removed, a second principal component is extracted from the residual, and the regression continues until the required accuracy is reached [31,32].
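The extract, regress, and deflate loop described above can be sketched with a NIPALS-style algorithm for a single dependent variable. This is a minimal illustration and not the software used in this study: it extracts a fixed number of components instead of stopping at a target accuracy, and the function name is hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS regression; returns coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:          # nothing left in X that covaries with y
            break
        w = w / norm
        t = Xk @ w                # scores of the current component
        tt = t @ t
        p = Xk.T @ t / tt         # loadings of X on the scores
        c = yk @ t / tt           # regression of y on the scores
        Xk = Xk - np.outer(t, p)  # deflate: remove the explained part of X
        yk = yk - c * t           # and of y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef
```

Predictions are obtained as `X_new @ coef + intercept`. When the number of components equals the number of linearly independent variables, the fit coincides with ordinary least squares, so the benefit of PLS comes from stopping earlier.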
(b). Random forest regression model
Random forest (RF) regression modeling is a data mining method developed by Breiman and Cutler [33]. The RF technique combines ensemble learning with regression and classification trees. RF can be used for classification and regression, as well as clustering and survival analysis. Its advantages over other algorithms are its adaptability to the data, good noise immunity, and strong fitting ability with a low tendency to overfit. The method uses bootstrap resampling to draw multiple samples from the original sample, builds a decision tree for each bootstrap sample, and combines the predictions of the trees to obtain the final prediction, by voting in classification and by averaging in regression. The internal nodes of each tree are split according to the Gini criterion [34]. With A original variables, Ai feature variables are randomly selected at each split, and each tree is grown freely. The number of trees (Ntree) and the size of the randomly selected subset (Ai) are optimized in the regression process to derive the best fit. The advantages of the RF regression method are that (1) it is suitable for large-scale data sets; (2) it is insensitive to multicollinearity; (3) it provides more reliable predictions for missing and unbalanced data than alternative methods; (4) it generates importance estimates of variables; and (5) it trains quickly [34,35].
3. Results
3.1. Mechanism Model Inversion Results
In the Lope test area, the inversion accuracy of the RVoG phase-coherence magnitude method was the lowest, with an R2 of 0.723, RMSE of 8.583 m, and bias of −2.431 m (Figure 9b). The RVoG model had better inversion accuracy, with an R2 of 0.775, RMSE of 7.748 m, and bias of 1.120 m (Figure 9a). The RMSE of the RVoG model was lower by about 0.8 m, which indicates that the RVoG model interprets the forest canopy height information better than the RVoG phase coherence amplitude method. The reason is that the RVoG model uses both forest canopy height and extinction coefficient to construct a look-up table for calculating the forest canopy height, while the RVoG phase coherence amplitude method is a simplified expression of the RVoG model. The scatter plot (Figure 9) shows that the estimates no longer reflect the true forest canopy height above 50 m, where the inversion results of both methods include underestimates and overestimates (i.e., there is no systematic direction of difference in the model error).
The results in the Pongara test area are consistent with the pattern in the Lope test area. Of the two mechanism models, the RVoG phase coherence amplitude method has the lowest inversion accuracy, with an R2 of 0.728, RMSE of 7.897 m, and bias of −4.043 m (Figure 9d). The inversion accuracy of the RVoG model is better, but the difference is small, with an R2 of 0.752, RMSE of 7.628 m, and bias of −4.188 m (Figure 9c). The bias is relatively large where the forest canopy height is greater than 50 m, and there are again both underestimates and overestimates, which is consistent with the results of the Lope test area. Errors may be related to decorrelation, observation geometry, and vegetation conditions.
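For reference, the three accuracy measures reported in this section can be computed as in the sketch below. The definitions are assumptions, since the text does not state them: R2 is taken as the squared Pearson correlation between predicted and LiDAR heights, and bias as the mean of predicted minus observed, so that negative values indicate underestimation. The function name is hypothetical.

```python
import numpy as np

def accuracy_metrics(predicted, observed):
    """Return (R2, RMSE, bias) for predicted versus observed heights."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    error = predicted - observed
    rmse = np.sqrt(np.mean(error ** 2))
    bias = np.mean(error)                                   # > 0: overestimation
    r2 = np.corrcoef(predicted, observed)[0, 1] ** 2        # squared correlation
    return r2, rmse, bias
```

With this definition of R2, a constant offset between prediction and reference does not lower R2, which is why RMSE and bias are reported alongside it.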
3.2. Machine Learning Method Inversion Results
3.2.1. Importance Analysis of Independent Variables
In machine learning, feature dimensionality can be reduced by variable selection to improve model efficiency. In this study, we used RF to screen variables: the importance measure determines the contribution of each feature in each tree of the RF, averages these contributions, and compares the contributions between features. In the Lope and Pongara regions, 4239 and 3068 samples were used to train the RF model, respectively. The importance analysis showed that the penetration depth, phase center height, coherence separation, and baseline selection index contributed the most to the model in both study areas, with a cumulative contribution of more than 90% (Figure 10). These parameters are related to forest vertical structure information, so they are more sensitive to forest canopy height. It therefore appears feasible to combine a mechanism model with machine learning methods to invert forest canopy height from PolInSAR data. To reduce the variable dimensionality, the variables that together account for a cumulative contribution of 90% were selected for the construction of the regression model.
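The cumulative-contribution rule can be sketched as follows. The example assumes importance scores are already available from a fitted RF model and keeps the highest-ranked variables until their normalized importances reach the threshold. The function name and the variable names in the usage note are hypothetical.

```python
import numpy as np

def select_by_cumulative_importance(names, importances, threshold=0.9):
    """Keep the top-ranked variables whose cumulative share reaches the threshold."""
    importances = np.asarray(importances, dtype=float)
    share = importances / importances.sum()      # normalize to a total of 1
    order = np.argsort(share)[::-1]              # most important first
    cumulative = np.cumsum(share[order])
    # Index of the first variable at which the cumulative share >= threshold.
    n_keep = int(np.searchsorted(cumulative, threshold) + 1)
    return [names[i] for i in order[:n_keep]]
```

For example, with importances of 0.50, 0.30, 0.15, and 0.05 for four variables, the first three are kept, because their cumulative share (0.95) is the first to reach 0.9.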
3.2.2. Inversion Results
In the construction of the RF-RVoG-DEP model (Table 5), the model parameters were optimized twice: first, a random search was used to obtain locally optimal parameters, and then a grid search was used to determine the globally optimal parameters. For the PLS-RVoG-DEP model, the number of principal components was determined by minimizing the error.
We validated the inversion results of the machine learning methods using independent validation samples and compared the differences between the methods; the results are shown in Table 6 and Figure 11. In the Lope test area, the inversion accuracy of the machine learning methods is substantially higher than that of the mechanism models. The PLS-RVoG-DEP model has an R2 of 0.850, RMSE of 6.320 m, and bias of 0.002 m (Figure 11b). Among all the methods, the RF-RVoG-DEP model has the highest inversion accuracy, with an R2 of 0.900, RMSE of 5.154 m, and bias of −0.061 m (Figure 11a). Compared with the RVoG model, the R2 increased from 0.775 to 0.900, and the RMSE was reduced from 7.748 m to 5.154 m. The bias was also substantially reduced, indicating that machine learning methods combined with mechanism models capture forest canopy height information better than the mechanism models alone. This may be related to whether the conditions and assumptions of the mechanism models are satisfied, although the bias is still larger where the forest canopy height is larger. The same pattern was observed in the Pongara test area, where the inversion accuracy of the machine learning methods was substantially higher than that of the mechanism models: the PLS-RVoG-DEP model had an R2 of 0.869, RMSE of 5.534 m, and bias of 0.038 m (Figure 11d), and the RF-RVoG-DEP model had an R2 of 0.903, RMSE of 4.769 m, and bias of 0.016 m (Figure 11c). Compared with the RVoG model, the R2 of the RF-RVoG-DEP model increased from 0.752 to 0.903, and the RMSE was reduced from 7.628 m to 4.769 m. Although the accuracy was substantially improved, the bias remained relatively large for forest canopy heights greater than 50 m, which is consistent with the results of the Lope test area.
Although combining machine learning methods with mechanism models effectively improves forest canopy height inversion accuracy, the geometric error and decorrelation of PolInSAR data remain unresolved.
4. Discussion
In this work, we combined machine learning with mechanism modeling to estimate forest canopy height. Mining the PolInSAR parameters that represent the vertical height of the forest, rather than relying only on coherence features, greatly improved the estimation accuracy and gives a more complete description than the studies of Zahriban Hesari and Persson. Compared with Brigot’s study, we used a broader set of independent variables, such as the microwave penetration depth, the phase center height obtained from the RVoG model, and the observation geometry parameters. However, the following questions deserve further exploration in subsequent studies.
4.1. Limitations of the Mechanism Model
The RVoG three-stage and RVoG phase coherence amplitude methods rely on polarimetric interferometric information to invert forest canopy height and do not require training samples. However, real forest conditions involve many uncertainties, and the forest canopy height calculated by a fixed model form, with its associated assumptions and parameters, differs from the real forest canopy height. In the RVoG model, the ground-to-volume magnitude ratio of the volume coherence is assumed to be 0 [13], but this assumption does not hold under realistic forest conditions when forest cover is low. In that case, the surface scattering contribution to the volume coherence also increases the estimation error of the pure volume coherence. The results for the Pongara test area were better than those for the Lope test area, which may be related to forest type. In the Pongara test area (mangrove forest), the forest structure is more homogeneous, with a large canopy cover and no other vegetation on the ground surface. The ground surface is usually covered by water and shaded by the canopy, which is more consistent with the assumption of a zero ground-to-volume magnitude ratio. In the Lope test area (inland tropical forest), by contrast, the forest structure is complex, with both tall and low forests, and the ground surface scattering contribution is larger where forest cover is lower, so the assumption of a zero ground-to-volume magnitude ratio is not fully valid. In addition, differences in the range and step size of the forest canopy height and extinction coefficient used to construct the LUT for the RVoG three-stage method can also affect the inversion results.
4.2. The Effect of Temporal Decorrelation
The RVoG three-stage method does not consider temporal decorrelation, which depends on the time interval between acquisitions, the baseline size, and the forest conditions when the SAR data are acquired. It has been shown that temporal decorrelation not only decreases the coherence coefficient but also increases the variability of the coherence phase in vegetated areas [36,37], so errors in the interferometric complex coherence can affect the accuracy of the ground phase and volume coherence. The inversion result of the RVoG phase coherence amplitude method combines the results of the phase difference method and the coherence amplitude method. For the phase difference part, the source of ground phase error is the same as in the RVoG model, while the coherence amplitude part is heavily influenced by temporal decorrelation [38,39,40], and its assumption that the extinction coefficient is 0 is not valid under practical conditions with large differences in forest structure. In future work, temporal decorrelation models (e.g., the RMoG model and the RVoG-VDT model) can be added to obtain temporal decorrelation factors and address this limitation.
4.3. Effect of Baseline Selection Method and Observation Geometry
The baseline selection method is also a source of inversion error; we used the PROD method to select the baselines. A related study [41] showed that selecting baseline combinations based only on the shape of the complex coherence distribution does not achieve the global optimum, so optimizing the baseline selection method is also worth considering. Although the machine learning method greatly improves the estimation accuracy, there is still an underestimation trend when the forest canopy height is greater than 50 m. Our analysis shows that the height of ambiguity Hoa (2π/kz) no longer increases with forest canopy height when the canopy height is greater than 50 m in both study areas. It has also been shown that the inversion results are more accurate when the product of forest canopy height and vertical wave number is less than the height of ambiguity (kz × hv < Hoa). Values of kz that are too large or too small can increase the decorrelation interference and lead to a relatively large bias in the inversion results [42], and the spatial baseline size is an important parameter determining kz. Chen’s study reported that a spatial baseline whose Hoa is 2 to 4 times the forest height reflects the forest canopy height more accurately [43]. As mentioned above, in mangrove forests with a homogeneous structure, the distribution of coherence points is closer to the RVoG model hypothesis, and the baseline selection results are more reasonable. In contrast, inland tropical forests have a complex structure and more disturbing factors, so the uncertainty of the baseline selection results is larger, which was also verified in the study of Denbina [26]. Improving the baseline selection method and optimizing the spatial baseline are promising ways to improve forest canopy height inversion accuracy in future research.
4.4. Uncertainty of Machine Learning Methods
The machine learning methods rely on a large amount of training data to combine the polarimetric interferometric information with the forest structure information (derived from the mechanism model) when constructing the inversion model, which does not itself impose preconditions. The forest canopy heights inverted with this “fusion model” were closer to the LiDAR-observed forest canopy heights. In this study, vertical height information and SAR penetration depth were used, and sensor platform parameters and baseline selection parameters were also considered. However, errors remain in some samples of the inversion results, which are mainly related to decorrelation and vegetation conditions. Our research object is tropical rainforest, and this forest structure may be more sensitive to errors arising from vegetation conditions and temporal decorrelation. Furthermore, the coherence optimization results are not accurate when the interferometric quality is poor, which increases the error of the independent variables. Future research will investigate compensation for decorrelation as a way to improve forest canopy height inversion accuracy.
5. Conclusions
Forest canopy height is an important parameter to characterize forest biomass and carbon stock. In remote sensing-based forest canopy height monitoring, interferometric, polarized synthetic aperture radar interferometry (PolInSAR) has been widely studied in the past two decades. It has proven to be an effective tool for forest canopy height estimation, which has been confirmed on the TanDEM-X satellite. Now ALOS-2, the SAOCOM are also gradually used in forest canopy height estimation studies. However, traditional mechanism models and machine learning methods can hardly meet realistic conditions completely. Therefore, exploring a series of efficient and accurate forest canopy height estimation methods to improve forest parameter estimation is an important problem that needs to be addressed urgently. Our study offers a novel approach, by combining machine learning and mechanism modeling to estimate forest canopy height, showing that it can effectively improve the accuracy of forest canopy height inversion. Inversion results using this “fusion model” method were substantially better than those derived from the mechanism model alone. The fusion model does not require incorporating factors such as extinction coefficient, ground-to-volume magnitude ratio, or baseline selection, meaning the method is more scalable than other approaches. Methodologically, due to the high correlation between forest canopy height and forest biomass, our proposed method can also be applied to the estimation of other forest parameters which is an issue to be explored in the future. Previous studies have required a large number of samples, either for model improvement or new algorithm proposals. 
Currently, GEDI and ICESat-2 have acquired a large amount of laser point data across the globe. With the support of this large number of a priori samples, forest canopy height estimation and forest carbon measurement at large regional scales will become more convenient with the application of the ALOS-2, TanDEM-L, and BIOMASS satellites and the NISAR program. In future work, we can further improve this method in terms of temporal decorrelation, baseline selection, inversion model, and study subjects.
H.L. designed the experiments, completed the data analysis, and wrote the paper; C.Y. provided important guidance for experimental design, data analysis, and writing the paper; F.X. provided much help in completing the experiments; B.Z. helped in data processing; S.C. helped in developing the design of the graphs. All authors have read and agreed to the published version of the manuscript.
UAVSAR data and LiDAR-RH100 from NASA’s Oak Ridge National Laboratory Biogeochemical Dynamics Distributed Active Archive Center (
Thanks to NASA for providing all the publicly available free datasets to support this work.
The authors declare no conflict of interest, and the manuscript has been approved by all authors for publication.
Figure 2. Workflow of forest canopy height inversion. Red dashed lines indicate models or variables.
Figure 3. RVoG model schematic. The ground elevation is z0, the volume height is hv, the scatterer is distributed randomly in the forest volume. F (z) is the radar reflectivity at height z, σ is the extinction coefficient, φ0 is the ground phase, and μ is the ground to volume magnitude ratio.
Figure 6. SAR penetration schematic in the forest. (a) indicates the position of the scattering phase center in the building scene, (b) indicates the position of the scattering phase center in the forest scene.
Figure 7. Schematic of the phase center height and penetration depth of the SAR signal in the forest. (a) denotes the scattering phase center height, (b) denotes the penetration depth.
Figure 9. Scatter plots of the mechanism models in the Lope and Pongara test areas. (a,b) show the inversion results of the RVoG three-stage method and the RVoG phase-coherence amplitude method for the Lope test area, respectively. (c,d) show the inversion results of the RVoG three-stage method and the RVoG phase-coherence amplitude method for the Pongara test area, respectively.
Figure 10. Variable importance in the Lope and Pongara test areas. (a) is the result of independent variable selection for the Lope test area, and (b) is the result of independent variable selection for the Pongara test area.
Figure 11. Scatter plots of predicted vs. observed forest canopy height for the machine learning “fusion models” in the Lope and Pongara test areas. (a,b) show the inversion results of the RF-RVoG-DEP method and the PLS-RVoG-DEP method for the Lope test area, respectively. (c,d) show the inversion results of the RF-RVoG-DEP method and the PLS-RVoG-DEP method for the Pongara test area, respectively.
Summary of UAVSAR data.
Test Area | Number of Tracks | Vertical Baseline | Range Resolution (m) | Azimuth Resolution (m) |
---|---|---|---|---|
Lope | 8 | 0, 20, 45, 105 | 3.33 | 4.8 |
Pongara | 5 | 0, 20, 40, 60, 80, 100, 120 | 3.33 | 4.8 |
Vertical height parameter.
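Variable Type | Name | Description | Expressions |
---|---|---|---|
Coherence phase center height and coherence separation | PDHsep | High coherence separation | |
Coherence phase center height and coherence separation | PDLsep | Low coherence separation | |
Coherence phase center height and coherence separation | PDHmab | High coherence magnitude | |
Coherence phase center height and coherence separation | PDLmab | Low coherence amplitude | |
Coherence phase center height and coherence separation | PDHarg | High coherence phases | |
Coherence phase center height and coherence separation | PDLarg | Low coherence phases | |
Coherence phase center height and coherence separation | Phi | Ground phase | / |
Coherence phase center height and coherence separation | Phimab | Surface coherence amplitude | |
Coherence phase center height and coherence separation | HeightPDH | High coherence phase center height | |
Coherence phase center height and coherence separation | HeightPDL | Low coherence phase center height | |
Penetration depth | Bh | Penetration depth | |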
Baseline selection parameters.
Variable Type | Name | Description | Expressions |
---|---|---|---|
Baseline selection parameters | sep | Coherence separation | |
Baseline selection parameters | mab | Coherence amplitude | |
Baseline selection parameters | cit | Product of coherence separation and coherence amplitude | |
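The three baseline selection parameters listed above (sep, mab, cit) can be sketched as follows. Because the expressions in the table were not recoverable, the definitions used here are assumptions: the separation is taken as the distance between the high and low optimized coherences in the complex plane, and the amplitude as the mean of their magnitudes. The helper names `baseline_criteria` and `select_baseline` and the coherence values are hypothetical.

```python
import cmath
from typing import List, Tuple


def baseline_criteria(gamma_high: complex, gamma_low: complex) -> Tuple[float, float, float]:
    """Return (sep, mab, cit) for one baseline.

    Assumed definitions (not taken from the paper):
      sep = |gamma_high - gamma_low|           coherence separation
      mab = (|gamma_high| + |gamma_low|) / 2   coherence amplitude
      cit = sep * mab                          product criterion
    """
    sep = abs(gamma_high - gamma_low)
    mab = 0.5 * (abs(gamma_high) + abs(gamma_low))
    return sep, mab, sep * mab


def select_baseline(pairs: List[Tuple[complex, complex]]) -> int:
    """Return the index of the baseline with the largest product criterion."""
    scores = [baseline_criteria(high, low)[2] for high, low in pairs]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical (high, low) coherence pairs for three baselines at one pixel,
# given as magnitude and phase (radians).
pairs = [
    (cmath.rect(0.90, 0.1), cmath.rect(0.85, 0.2)),  # high amplitude, low separation
    (cmath.rect(0.70, 1.2), cmath.rect(0.80, 0.2)),  # balanced
    (cmath.rect(0.30, 2.5), cmath.rect(0.35, 0.1)),  # high separation, low amplitude
]
print(select_baseline(pairs))
```

The product criterion favors baselines for which the coherence region is both well separated and only weakly decorrelated, rather than optimizing either property alone.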
Geometric parameters.
Variable Type | Name | Description | Expressions |
---|---|---|---|
Geometric parameters | Cosθ | Incident angle cosine | None |
Geometric parameters | Sinθ | Incident angle sine | None |
Geometric parameters | Inc | Incident angle | None |
Geometric parameters | Kz | Vertical wave number | |
Geometric parameters | Hoa | Height of ambiguity | |
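The relation between the last two geometric parameters can be illustrated with a short sketch. It uses the standard repeat-pass expressions, kz = 4π·B⊥ / (λ·R·sin θ) and HoA = 2π / |kz|, which are assumed here because the expressions in the table were not recoverable. The function names and all numerical inputs (perpendicular baseline, slant range, incidence angle, and an approximate L-band wavelength) are hypothetical.

```python
import math


def vertical_wavenumber(
    b_perp_m: float, slant_range_m: float, incidence_rad: float, wavelength_m: float
) -> float:
    """Repeat-pass vertical wavenumber in rad/m (assumed standard form):
    kz = 4 * pi * B_perp / (wavelength * R * sin(theta))."""
    return 4.0 * math.pi * b_perp_m / (
        wavelength_m * slant_range_m * math.sin(incidence_rad)
    )


def height_of_ambiguity(kz: float) -> float:
    """Height change that produces a 2*pi interferometric phase cycle."""
    if kz == 0:
        raise ValueError("kz must be non-zero")
    return 2.0 * math.pi / abs(kz)


# Hypothetical geometry, for illustration only.
kz = vertical_wavenumber(
    b_perp_m=20.0,
    slant_range_m=12_000.0,
    incidence_rad=math.radians(40.0),
    wavelength_m=0.238,  # approximate L-band wavelength (assumption)
)
print(f"kz = {kz:.4f} rad/m, HoA = {height_of_ambiguity(kz):.1f} m")
```

A longer baseline increases kz and shortens the height of ambiguity, which is why both quantities carry information about how well a given baseline suits a given canopy height.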
Training results of machine learning models.
Test Area | N | Model | R2 | RMSE (m) | BIAS (m) |
---|---|---|---|---|---|
Lope | 4239 | RF-RVoG-DEP | 0.967 | 2.959 | −0.022 |
Lope | 4239 | PLS-RVoG-DEP | 0.847 | 6.380 | −0.012 |
Pongara | 3068 | RF-RVoG-DEP | 0.979 | 2.226 | 0.013 |
Pongara | 3068 | PLS-RVoG-DEP | 0.853 | 5.861 | −0.014 |
Comparison of the validation results of different inversion methods.
Test Area | N | Model Type | Model | R2 | RMSE (m) | BIAS (m) |
---|---|---|---|---|---|---|
Lope | 2118 | Fusion Model | RF-RVoG-DEP | 0.900 | 5.154 | −0.061 |
Lope | 2118 | Fusion Model | PLS-RVoG-DEP | 0.850 | 6.320 | 0.002 |
Lope | 2118 | Mechanism Model | RVoG | 0.775 | 7.748 | 1.120 |
Lope | 2118 | Mechanism Model | RVoG-Sinc-Phase | 0.723 | 8.583 | 2.431 |
Pongara | 1534 | Fusion Model | RF-RVoG-DEP | 0.903 | 4.769 | 0.016 |
Pongara | 1534 | Fusion Model | PLS-RVoG-DEP | 0.869 | 5.534 | 0.038 |
Pongara | 1534 | Mechanism Model | RVoG | 0.752 | 7.628 | −4.188 |
Pongara | 1534 | Mechanism Model | RVoG-Sinc-Phase | 0.728 | 7.987 | −4.043 |
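The three accuracy measures reported in the tables above can be computed as in the following minimal sketch. Two conventions are assumed here rather than taken from the paper: R2 is computed as the coefficient of determination (1 − SS_res / SS_tot), and the bias is the mean of predicted minus observed height. The function name `validation_metrics` and the sample heights are hypothetical.

```python
import math
from typing import Sequence, Tuple


def validation_metrics(
    observed: Sequence[float], predicted: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (R2, RMSE, bias) for paired observed and predicted heights."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized samples with at least 2 values")
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")
    r2 = 1.0 - ss_res / ss_tot        # coefficient of determination (assumed)
    rmse = math.sqrt(ss_res / n)      # root mean square error, in metres
    bias = sum(residuals) / n         # mean of predicted minus observed (assumed sign)
    return r2, rmse, bias


# Hypothetical LiDAR-observed and model-predicted canopy heights (m).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
r2, rmse, bias = validation_metrics(observed, predicted)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f} m, bias = {bias:.3f} m")
```

Under this sign convention, a negative bias indicates that a model underestimates canopy height on average, while the RMSE also reflects the scatter around the observed values.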
References
1. Goetz, S.; Dubayah, R. Advances in remote sensing technology and implications for measuring and monitoring forest carbon stocks and change. Carbon Manag.; 2011; 2, pp. 231-244. [DOI: https://dx.doi.org/10.4155/cmt.11.18]
2. Zhao, P.; Lu, D.; Wang, G.; Wu, C.; Huang, Y.; Yu, S. Examining Spectral Reflectance Saturation in Landsat Imagery and Corresponding Solutions to Improve Forest Aboveground Biomass Estimation. Remote Sens.; 2016; 8, 469. [DOI: https://dx.doi.org/10.3390/rs8060469]
3. Hall, F.G.; Bergen, K.; Blair, J.B.; Dubayah, R.; Houghton, R.; Hurtt, G.; Kellndorfer, J.; Lefsky, M.; Ranson, J.; Saatchi, S. et al. Characterizing 3D vegetation structure from space: Mission requirements. Remote Sens. Environ.; 2011; 115, pp. 2753-2775. [DOI: https://dx.doi.org/10.1016/j.rse.2011.01.024]
4. Xu, M.; Xiang, H.; Yun, H.; Ni, X.; Chen, W.; Cao, C.-X. Retrieval of forest canopy height jointly using airborne LiDAR and ALOS PALSAR data. J. Appl. Remote Sens.; 2019; 14, 022203. [DOI: https://dx.doi.org/10.1117/1.JRS.14.022203]
5. Bao, Y.; Cao, C.; Chen, W.; Tian, R.; Dang, Y.; Li, L.; Li, G. Extraction of forest structural parameters based on the intensity information of high-density airborne light detection and ranging. J. Appl. Remote Sens.; 2012; 6, 063533. [DOI: https://dx.doi.org/10.1117/1.JRS.6.063533]
6. Zhang, L.; Duan, B.; Zou, B. Research on Inversion Models for Forest Height Estimation Using Polarimetric SAR Interferometry. ISPRS-Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci.; 2017; 42, pp. 659-663. [DOI: https://dx.doi.org/10.5194/isprs-archives-XLII-2-W7-659-2017]
7. Graham, L.C. Synthetic Interferometer Radar For Topographic Mapping. Proc. IEEE.; 1974; 62, pp. 763-768. [DOI: https://dx.doi.org/10.1109/PROC.1974.9516]
8. Garestier, F.; Le Toan, T. Forest Modeling For Height Inversion Using Single-Baseline InSAR/Pol-InSAR Data. IEEE Trans. Geosci. Remote Sens.; 2010; 48, pp. 1528-1539. [DOI: https://dx.doi.org/10.1109/TGRS.2009.2032538]
9. Soja, M.J.; Ulander, L.M.H. Digital canopy model estimation from TanDEM-X interferometry using high-resolution lidar DEM. Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium-IGARSS; Melbourne, Australia, 21–26 July 2013.
10. Sadeghi, Y.; St-Onge, B.; Leblon, B.; Simard, M.; Papathanassiou, K. Mapping forest canopy height using TanDEM-X DSM and airborne LiDAR DTM. Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium; Quebec City, QC, Canada, 13–18 July 2014.
11. Treuhaft, R.N.; Moghaddam, M.; van Zyl, J.J. Vegetation Characteristics And Underlying Topography From Interferometer Radar. Radio Sci.; 1996; 31, pp. 1449-1485. [DOI: https://dx.doi.org/10.1029/96RS01763]
12. Cloude, S.R.; Papathanassiou, K.P. Three-Stage Inversion Process For Polarimetric SAR Interferometry. IEE Proc.-Radar. Sonar. Navig.; 2003; 150, pp. 125-134. [DOI: https://dx.doi.org/10.1049/ip-rsn:20030449]
13. Papathanassiou, K.; Cloude, S.R. Single-baseline polarimetric SAR interferometry. IEEE Trans. Geosci. Remote Sens.; 2001; 39, pp. 2352-2363. [DOI: https://dx.doi.org/10.1109/36.964971]
14. Hajnsek, I.; Kugler, F.; Lee, S.K. Tropical-forest-parameter estimation by means of Pol-InSAR: The INDREX-II campaign. IEEE Trans. Geosci. Remote Sens.; 2009; 47, pp. 481-493. [DOI: https://dx.doi.org/10.1109/TGRS.2008.2009437]
15. Liao, Z.; He, B.; Quan, X.; van Dijk, A.I.; Qiu, S.; Yin, C. Biomass estimation in dense tropical forest using multiple information from single-baseline P-band PolInSAR data. Remote Sens. Environ.; 2018; 221, pp. 489-507. [DOI: https://dx.doi.org/10.1016/j.rse.2018.11.027]
16. Cloude, S.R.; Papathanassiou, K.P. Forest vertical structure estimation using coherence tomography. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium IGARSS; Boston, MA, USA, 7–11 July 2008.
17. Zahriban Hesari, M.; Shataee, S.; Maghsoudi, Y.; Mohammadi, J.; Fransson, J.E.; Persson, H.J. Forest Variable Estimations Using TanDEM-X Data in Hyrcanian Forests. Can. J. Remote Sens.; 2020; 46, pp. 166-176. [DOI: https://dx.doi.org/10.1080/07038992.2020.1763790]
18. Persson, H.J.; Fransson, J.E.S. Comparison between TanDEM-X-and ALS-based estimation of aboveground biomass and tree height in boreal forests. Scand. J. For. Res.; 2017; 32, pp. 306-319. [DOI: https://dx.doi.org/10.1080/02827581.2016.1220618]
19. Brigot, G.; Simard, M.; Colin-Koeniguer, E.; Boulch, A. Retrieval of Forest Vertical Structure from PolInSAR Data by Machine Learning Using LIDAR-Derived Features. Remote Sens.; 2019; 11, 381. [DOI: https://dx.doi.org/10.3390/rs11040381]
20. Fore, A.G.; Chapman, B.D.; Hawkins, B.P.; Hensley, S.; Jones, C.E.; Michel, T.R.; Muellerschoen, R.J. UAVSAR Polarimetric Calibration. IEEE Trans. Geosci. Remote Sens.; 2015; 53, pp. 3481-3491. [DOI: https://dx.doi.org/10.1109/TGRS.2014.2377637]
21. Armston, J.; Tang, H.; Hancock, S.; Marselis, S.; Duncanson, L.; Kellner, J.; Hofton, M.; Blair, J.B.; Fatoyinbo, T.; Dubayah, R.O. AfriSAR: Gridded Forest Biomass and Canopy Metrics Derived from LVIS, Gabon, 2016; ORNL DAAC: Oak Ridge, TN, USA, 2020.
22. Xie, Q.H.; Wang, C.C.; Zhu, J.J.; Fu, H.Q. Forest height inversion by combining S-RVOG model with terrain factor and PD coherence optimization. Acta Geod. Cartogr. Sin.; 2015; 44, pp. 686-693.
23. Kugler, F.; Lee, S.K.; Hajnsek, I.; Papathanassiou, K.P. Forest Height Estimation by Means of Pol-InSAR Data Inversion: The Role of the Vertical Wavenumber. IEEE Trans. Geosci. Remote Sens.; 2015; 53, pp. 5294-5311. [DOI: https://dx.doi.org/10.1109/TGRS.2015.2420996]
24. Denbina, M.; Simard, M. Kapok: An open source Python library for PolInSAR forest height estimation using UAVSAR data. Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS); Fort Worth, TX, USA, 23–28 July 2017.
25. Lee, S.K.; Kugler, F.; Papathanassiou, K.; Hajnsek, I. Multibaseline polarimetric SAR interferometry forest height inversion approaches. Proceedings of the ESA POLinSAR Workshop; Frascati, Italy, 24–28 January 2011.
26. Denbina, M.; Simard, M.; Hawkins, B. Forest Height Estimation Using Multibaseline PolInSAR and Sparse Lidar Data Fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.; 2018; 11, pp. 3415-3433. [DOI: https://dx.doi.org/10.1109/JSTARS.2018.2841388]
27. Luo, H.B.; Zhu, B.D.; Yue, C.R.; Wang, N. Forest Canopy Height Inversion Based On Airborne Multi-Baseline PolInSAR. J. Geomatics.; 2022; 48, pp. 1-7.
28. Dall, J. InSAR Elevation Bias Caused by Penetration Into Uniform Volumes. IEEE Trans. Geosci. Remote Sens.; 2007; 45, pp. 2319-2324. [DOI: https://dx.doi.org/10.1109/TGRS.2007.896613]
29. Schlund, M.; Baron, D.; Magdon, P.; Erasmi, S. Canopy penetration depth estimation with TanDEM-X and its compensation in temperate forests. ISPRS J. Photogramm. Remote Sens.; 2019; 147, pp. 232-241. [DOI: https://dx.doi.org/10.1016/j.isprsjprs.2018.11.021]
30. Kong, X.; Cao, Z.; An, Q.; Gao, Y.; Du, B. Quality-Related and Process-Related Fault Monitoring With Online Monitoring Dynamic Concurrent PLS. IEEE Access; 2018; 6, pp. 59074-59086. [DOI: https://dx.doi.org/10.1109/ACCESS.2018.2872790]
31. Hoeppner, J.M.; Skidmore, A.K.; Darvishzadeh, R.; Heurich, M.; Chang, H.C.; Gara, T.W. Mapping canopy chlorophyll content in a temperate forest using airborne hyperspectral data. Remote Sens.; 2020; 12, 3573. [DOI: https://dx.doi.org/10.3390/rs12213573]
32. Ali, A.M.; Darvishzadeh, R.; Skidmore, A.; Gara, T.W.; O’Connor, B.; Roeoesli, C.; Heurich, M.; Paganini, M. Comparing methods for mapping canopy chlorophyll content in a mixed mountain forest using Sentinel-2 data. Int. J. Appl. Earth Obs. Geoinf.; 2020; 87, 102037. [DOI: https://dx.doi.org/10.1016/j.jag.2019.102037]
33. Breiman, L. Random Forests. Mach. Learn.; 2001; 45, pp. 5-32. [DOI: https://dx.doi.org/10.1023/A:1010933404324]
34. Purohit, S.; Aggarwal, S.P.; Patel, N.R. Estimation of forest aboveground biomass using combination of Landsat 8 and Sentinel-1A data with random forest regression algorithm in Himalayan Foothills. Trop. Ecol.; 2021; 62, pp. 288-300. [DOI: https://dx.doi.org/10.1007/s42965-021-00140-x]
35. Huang, H.; Liu, C.; Wang, X. Constructing a Finer-Resolution Forest Height in China Using ICESat/GLAS, Landsat and ALOS PALSAR Data and Height Patterns of Natural Forests and Plantations. Remote Sens.; 2019; 11, 1740. [DOI: https://dx.doi.org/10.3390/rs11151740]
36. Lee, S.K.; Kugler, F.; Hajnsek, I.; Papathanassiou, K. The impact of temporal decorrelation over forest terrain in polarimetric SAR interferometry. Proceedings of the International Workshop on Applications of Polarimetry and Polarimetric Interferometry (Pol-InSAR); Frascati, Italy, 26–30 January 2009.
37. Lee, S.-K.; Kugler, F.; Papathanassiou, K.; Hajnsek, I. Quantification and compensation of temporal decorrelation effects in polarimetric SAR interferometry. Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium; Munich, Germany, 22–27 July 2012.
38. Zhou, Y.-S.; Hong, W.; Cao, F.; Wang, Y.-P.; Wu, Y.-R. Analysis of Temporal Decorrelation in Dual-Baseline Polinsar Vegetation Parameter Estimation. Proceedings of the IEEE International Geoscience and Remote Sensing Symposium; Boston, MA, USA, 7–11 July 2008.
39. Mette, T.; Kugler, F.; Papathanassiou, K.; Hajnsek, I. Forest and the random volume over ground-nature and effect of 3 possible error types. Proceedings of the European Conference on Synthetic Aperture Radar (EUSAR); Dresden, Germany, 16–18 May 2006.
40. Simard, M.; Denbina, M. An Assessment of Temporal Decorrelation Compensation Methods for Forest Canopy Height Estimation Using Airborne L-Band Same-Day Repeat-Pass Polarimetric SAR Interferometry. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.; 2018; 11, pp. 95-111. [DOI: https://dx.doi.org/10.1109/JSTARS.2017.2761338]
41. Lee, S.-K.; Kugler, F.; Papathanassiou, K.P.; Hajnsek, I. Quantifying temporal decorrelation over boreal forest at L-and P-band. Proceedings of the 7th European Conference on Synthetic Aperture Radar; Friedrichshafen, Germany, 2–5 June 2008.
42. Du, K.; Lin, H.; Wang, G.; Long, J.; Li, J.; Liu, Z. The Impact of Vertical Wavenumber on Forest Height Inversion by PolInSAR. Proceedings of the 2018 Fifth International Workshop on Earth Observation and Remote Sensing Applications (EORSA); Xi’an, China, 18–20 June 2018.
43. Chen, H.; Cloude, S.R.; Goodenough, D.G.; Hill, D.A.; Nesdoly, A. Radar Forest Height Estimation in Mountainous Terrain Using Tandem-X Coherence Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.; 2018; 11, pp. 3443-3452. [DOI: https://dx.doi.org/10.1109/JSTARS.2018.2866059]
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
The mapping of forest structure parameters in tropical rainforests plays an important role in biodiversity and carbon stock estimation. The current mechanism models based on PolInSAR for forest height inversion (e.g., the RVoG model) are physical process models, and realistic conditions for model parameterization are often difficult to establish in practical applications, resulting in large forest height estimation errors. As an alternative, machine learning approaches offer the benefit of model simplicity, but these tools provide limited capabilities for interpretation and generalization. To explore a forest height estimation method that combines the mechanism model and the empirical model, we utilized UAVSAR multi-baseline PolInSAR L-band data from the AfriSAR project and proposed a solution in which a mechanism model is combined with machine learning. Two mechanism models were used as controls: the RVoG three-stage method and the RVoG phase-coherence amplitude method. The vertical structure parameters of the forest obtained from the mechanism model were used as the independent variables of the machine learning model. Random forest (RF) and partial least squares (PLS) regression models were used to invert the forest canopy height. Results show that the inversion accuracy of the machine learning method, combined with the mechanism model, is significantly better than that of the single-mechanism model method. The most influential independent variables were penetration depth, volume coherence phase center height, coherence separation, and baseline selection. With the precondition that the cumulative contribution of the independent variables was greater than 90%, the number of independent variables in the two study areas was reduced from 19 to 4, and the accuracy of the RF-RVoG-DEP model was higher than that of the PLS-RVoG-DEP model.
For the Lope test area, the R2 of the RVoG phase-coherence amplitude method is 0.723, the RMSE is 8.583 m, and the bias is −2.431 m; the R2 of the RVoG three-stage method is 0.775, the RMSE is 7.748 m, and the bias is 1.120 m; the R2 of the PLS-RVoG-DEP model is 0.850, the RMSE is 6.320 m, and the bias is 0.002 m; and the R2 of the RF-RVoG-DEP model is 0.900, the RMSE is 5.154 m, and the bias is −0.061 m. The results for the Pongara test area are consistent with the pattern for the Lope test area. The combined “fusion model” offers a substantial improvement in forest height estimation over the traditional mechanism modeling method.


1 College of Forestry, Southwest Forestry University, Kunming 650224, China
2 Institute of International Rivers and Eco-Security, Yunnan University, Kunming 650500, China
3 College of Forestry, Southwest Forestry University, Kunming 650224, China; College of Forestry, Northeastern Forestry University, Harbin 150040, China