1. Introduction

With the development of space science, synthetic aperture radar (SAR) technology has been widely used for Earth observation in natural disaster monitoring, terrain interpretation, agricultural monitoring, and other applications. SAR imaging has an advantage over optical imaging because it can acquire high-spatial-resolution images in all weather conditions and at any time of day [1]. However, the quality of SAR images is degraded by speckle noise, which is generated by the backscattering of coherent electromagnetic waves. As a multiplicative noise, speckle distorts pixel values over wide regions, which poses serious challenges for SAR image processing [2].

SAR image classification is an essential image processing technique and is mainly used to recognize and detect topographical objects in SAR images. The purpose of image classification is to divide the image into a certain number of classes with similar features and to group the meaningful parts to support subsequent processing. In recent years, Markov random fields (MRFs) have been widely used for both optical and SAR images [3,4]. Unsupervised methods based on MRFs consider both the relations between features and the spatial context, which suppresses speckle noise efficiently. An MRF obtains the classification result by optimizing an energy function in the Bayesian framework. Deng et al. [5] proposed an anisotropic circular Gaussian MRF (ACG-MRF) for image classification, and Wu et al. [6] then extended the MRF from the pixel level to the region level by introducing the Wishart distance. Li et al. [7] designed a Gaussian mixture model MRF (GMM-MRF) to consider the relationships between spatial information and contextual information, and Song et al. [8] then built a mixture WGΓ-MRF (MWGΓ-MRF) to improve the modeling of textural information. Although MRF-based models have had considerable success in imposing contextual constraints, they still lack prior knowledge and expert experience. Furthermore, these unsupervised methods use only low-level features, which limits their accuracy.

Depending on how the classifier is designed, supervised SAR image classification falls into two main categories: hand-crafted feature extraction methods and deep learning methods [9]. The key idea of the former is to extract certain kinds of image features and then select a classifier. For hand-crafted feature extraction, image intensity and texture information are commonly used characteristics. For classification, various classifiers have been proposed for SAR images, such as the support vector machine (SVM) [10], random forest (RF) [11], and artificial neural network (ANN) [12]. Although these methods achieve a certain degree of classification accuracy and describe image features at a low or middle level, their results still suffer from speckle noise and are not robust across different regions.

Deep learning algorithms can extract discriminative features automatically and are strongly robust against speckle noise; in particular, convolutional neural networks (CNNs) have gained widespread acceptance in remote sensing applications [13]. In recent years, a succession of CNN-based models has been used to classify SAR images. Zhou et al. [14] first introduced a CNN method for SAR image classification and achieved better classification accuracy than traditional supervised methods based on hand-crafted features, and Wang et al. [15] then proposed a multiple CNN (MP-CNN) to fuse intensity and edge information for SAR image classification. The input of a CNN is a set of 2-D image patches, which provides contextual constraints at the pixel level [16]. However, the feature extraction process breaks up the inherent structure of the feature map within regions [17].

To maintain the inherent structure of the feature map within regions, this paper proposes a novel SAR image classification method that combines Markov random fields with deep learning. Subregions of the image are generated by the SLIC superpixel algorithm, and the corresponding labels are assigned by the CNN with an internal voting strategy. The features extracted by the CNN are introduced to enhance the description of region intensity and to complement the pairwise contextual relations between regions. The classification result is obtained by minimizing the energy function of the proposed method. Experiments have been conducted on simulated and real SAR images to evaluate the performance of the proposed algorithm. The experimental results demonstrate that the proposed algorithm outperforms the other CNN-based region-level SAR image classification algorithms.

2. Materials and Methods

The flowchart of the proposed approach is shown in Figure 1. The motivation of the proposed method is to exploit the complex spatial relationships among different features. First, the pixel-level category confidence-degree features are extracted by the CNN [18], and superpixels are generated by the SLIC algorithm. The construction of the region adjacency graph (RAG) and the voting strategy complete the feature extraction at the region level. Subsequently, all features are introduced into the proposed region-level MRF, and the final classification result is obtained by minimizing the total energy function.

2.1. Superpixel Construction

Superpixel construction provides the regions to be classified in the proposed method. A superpixel is a subregion of an image composed of adjacent pixels with similar features such as color, intensity, and texture. Compared with pixel-level segmentation algorithms, superpixels retain useful spatial structure information for further image processing and generally preserve boundary information. In this paper, the simple linear iterative clustering (SLIC) algorithm [19] is used to generate superpixels. The SLIC algorithm groups spatially close, homogeneous pixels into the same cluster by adaptive k-means clustering. As one of the most common methods for regional segmentation, SLIC performs well, is simple to implement, and is resistant to speckle noise because it accounts for regional consistency.

After SLIC segmentation, the value of every superpixel is obtained as the mean intensity of all pixels in the region. To facilitate the subsequent operations, the region adjacency graph (RAG) is introduced to represent the relations between regions [20]. Two regions are connected if they share a boundary. Figure 2 shows an example of an RAG for superpixels; the nodes denote the values of the superpixels, and the edges describe the pairwise spatial information.
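The RAG construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the superpixel map is already available as a 2-D grid of integer ids, and it assumes that "sharing a boundary" means 4-connected pixel adjacency. The function name `build_rag` and the toy data are hypothetical.

```python
from collections import defaultdict

def build_rag(labels, intensity):
    """Build a region adjacency graph from a superpixel id map.

    labels:    2-D list of ints, the superpixel id of each pixel
    intensity: 2-D list of floats with the same shape
    Returns (means, edges): the mean intensity of each superpixel (node
    value) and the set of undirected edges (a, b), a < b, between
    superpixels that touch under 4-connectivity (an assumption).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    edges = set()
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            sums[a] += intensity[r][c]
            counts[a] += 1
            # Look at the right and lower neighbours only; this visits
            # every 4-connected pixel pair exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    means = {k: sums[k] / counts[k] for k in sums}
    return means, edges

# Toy 3 x 3 image with three superpixels (ids 0, 1, 2).
labels = [[0, 0, 1],
          [0, 2, 1],
          [2, 2, 1]]
intensity = [[1.0, 1.0, 4.0],
             [1.0, 7.0, 4.0],
             [7.0, 7.0, 4.0]]
means, edges = build_rag(labels, intensity)
```

In the toy image every superpixel touches the other two, so the graph is a triangle whose node values are the three mean intensities.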

2.2. Initialization by Convolutional Neural Network

The convolutional neural network (CNN) is the most common tool in supervised image processing, and a traditional CNN for classification is structured as shown in Figure 3. The convolution layer can be regarded as a bank of filters that output various features from input patches, and the feature maps are constructed by sliding the filters over all positions. The pooling layer subsamples the feature maps to reduce overfitting and increase the robustness of the whole design. The fully connected layer reshapes the feature maps from two-dimensional to one-dimensional, and the softmax layer outputs the classification results.

In the softmax layer, the network outputs the pixel-level category labels and the corresponding probability distribution. For each pixel, the probability of every label is given, and these probabilities sum to 1. The pixel-level category labels are used to generate the label field for superpixels, and the probability distribution is regarded as a deep feature for constructing the probability field in the MRF.
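As a small illustration of the two softmax outputs used here (a probability vector that sums to 1, and the label with the highest probability), the following sketch applies the standard softmax formula to a made-up score vector. The scores and the helper name `softmax` are illustrative only and do not come from the trained network.

```python
import math

def softmax(scores):
    """Standard softmax; subtracting the maximum keeps exp() stable."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores of one pixel for three classes.
probs = softmax([2.0, 1.0, 0.1])
# Pixel-level label: index of the most probable class.
label = max(range(len(probs)), key=probs.__getitem__)
```

The vector `probs` is what feeds the probability field, while `label` is what feeds the label field.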

Superpixel segmentation clusters spatially adjacent pixels with similar features, so the category of a superpixel should agree with that of most of its pixels. Because the CNN provides only pixel-wise labels, a region-based majority voting strategy is needed to initialize the RAG labels. Assume that superpixel r contains M pixels and that the category of each pixel is given by the CNN. By computing the histogram of the label distribution in the region, the majority label is selected as the initial category of r. The superpixel probability distribution is calculated from the probabilities of the corresponding pixels.
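The voting step can be sketched as follows for a single superpixel. The sketch assumes the superpixel distribution is the per-pixel average that Section 2.4 describes, and it breaks ties in favour of the label encountered first, which is an assumption the text does not address. The function name `init_superpixel` is hypothetical.

```python
from collections import Counter

def init_superpixel(pixel_labels, pixel_probs):
    """Initial label and probability vector of one superpixel.

    pixel_labels: CNN labels of the M pixels in the superpixel
    pixel_probs:  one softmax probability vector per pixel
    """
    # Majority vote over the label histogram (ties: first label seen).
    label = Counter(pixel_labels).most_common(1)[0][0]
    m = len(pixel_probs)
    k = len(pixel_probs[0])
    # Average the per-pixel probabilities class by class.
    mean_probs = [sum(p[j] for p in pixel_probs) / m for j in range(k)]
    return label, mean_probs

# Four pixels, two classes: three pixels vote for class 1.
label, mean_probs = init_superpixel(
    [1, 1, 0, 1],
    [[0.2, 0.8], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]],
)
```

Because each per-pixel vector sums to 1, the averaged vector also sums to 1 and can be used directly as the superpixel distribution.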

2.3. Region-Level Markov Random Fields

The image classification problem can be formulated as a maximum a posteriori (MAP) estimate in the manner of the Bayesian framework.

(1) $\hat{x} = \arg \max_{x} P (x | y) = \arg \max_{x} P (y | x) P (x)$

where $x$ is the image label and $y$ is the image feature. The posterior probability factors into two parts, $P (y | x)$ and $P (x)$. $P (y | x)$ is the conditional probability of $y$ given $x$ and is referred to as the feature model, and $P (x)$ is the prior probability of $x$, which is referred to as the spatial context model.
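One standard way to see why this factorization leads to energy minimization, sketched here under the usual assumption (not stated explicitly above) that both factors take the Gibbs form $P \propto \exp (- U)$, is to take the negative logarithm of the posterior:

$\hat{x} = \arg \max_{x} P (y | x) P (x) = \arg \min_{x} \{ - \ln P (y | x) - \ln P (x) \} = \arg \min_{x} \{ U_{feature} (y, x) + U_{context} (x) \}$

The names $U_{feature}$ and $U_{context}$ are illustrative labels for the two energies, and the last equality holds up to additive constants that do not depend on $x$. The feature-model and spatial-context energies defined later in this section play these two roles.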

The traditional MRF model incurs a sharply growing computation cost as the image size increases, and block-level texture information cannot be fully extracted in a pixel-based framework [21]. To address these problems, a region-level MRF model is constructed. Compared with the pixel-based MRF model, the region-level MRF model has three main advantages. First, processing an image at the region level effectively reduces the complexity of the algorithm because the label field contains fewer elements. Second, latent semantic information can be captured, including in oversegmented regions [22]. Third, the process of generating regions suppresses the influence of oversegmentation, and some pixel-level misclassification is smoothed out within regions. In general, processing an image by regions can efficiently capture its topological structure and contextual information.

In the superpixel-based region-level MRF model for SAR image classification, the spatial context model describes the interactions between adjacent superpixels, and the feature model expresses the intensity distribution of each superpixel. The feature model for an SAR image is often taken to be a Gaussian distribution, and the energy function is written as:

(2) $UI_{1} = \sum_{R_{i} \in R} \sum_{s_{i} \in R_{i}} \left\{ \frac{1}{2} \ln (2 \pi \sigma_{i}^{2}) + \frac{(s_{i} - \mu_{i})^{2}}{2 \sigma_{i}^{2}} \right\}$

$\mu_{i}$ and $\sigma_{i}^{2}$ are the mean and variance of the intensity distribution of class $R_{i}$, and $s_{i}$ ranges over the superpixels that belong to class $R_{i}$. $R$ represents all the regions in the input SAR image. For two adjacent superpixels $s_{k}$ and $s_{l}$, the energy function of the spatial context model is as follows:

(3) $UI_{2} = \beta \sum_{\langle x_{k}, x_{l} \rangle \subset e} f (x_{k}, x_{l})$

(4) $f (x_{k}, x_{l}) = \begin{cases} 1, & x_{k} = x_{l} \\ 0, & x_{k} \neq x_{l} \end{cases}$

$x_{k}$ denotes the label of $s_{k}$, $f (x_{k}, x_{l})$ describes the interaction between two regions, $e$ denotes the set of all cliques in the input SAR image, and $\beta$ is a potential parameter that balances the contributions of the feature model and the spatial context model. The MRF energy function of the intensity field is then:

(5) $U I = U I_{1} + U I_{2}$
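To make Eqs. (2) to (5) concrete, the sketch below evaluates the intensity-field energy for a tiny three-superpixel graph. It is an illustration under stated assumptions: the class means and variances are passed in rather than estimated, the value and sign of `beta` are left to the caller, and all names (`gaussian_unary`, `f`, `intensity_energy`) and numbers are hypothetical.

```python
import math

def gaussian_unary(s, mu, var):
    """One term of Eq. (2) for a superpixel value s."""
    return 0.5 * math.log(2.0 * math.pi * var) + (s - mu) ** 2 / (2.0 * var)

def f(xk, xl):
    """Eq. (4): 1 if the two labels are equal, 0 otherwise."""
    return 1 if xk == xl else 0

def intensity_energy(values, labels, stats, edges, beta):
    """Eq. (5): UI = UI_1 + UI_2.

    values: superpixel id -> mean intensity
    labels: superpixel id -> class label
    stats:  class label -> (mean, variance), supplied by the caller
    edges:  adjacent superpixel pairs (k, l) from the RAG
    """
    ui1 = sum(gaussian_unary(values[s], *stats[labels[s]]) for s in values)
    ui2 = beta * sum(f(labels[k], labels[l]) for k, l in edges)
    return ui1 + ui2

values = {0: 1.0, 1: 4.0, 2: 7.0}
labels = {0: "a", 1: "b", 2: "b"}
stats = {"a": (1.0, 1.0), "b": (5.0, 2.0)}   # made-up class statistics
edges = [(0, 1), (0, 2), (1, 2)]
ui = intensity_energy(values, labels, stats, edges, beta=0.5)
```

Here only the edge (1, 2) joins equally labelled superpixels, so the pairwise part contributes exactly one `beta`.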

2.4. Construction of Probability Field

The traditional superpixel-based region-level MRF model considers only intensity characteristics, and the label field is driven solely by iterating on the energy function. The Gaussian distribution is simple to compute but is a rough fit for multiplicative noise, and the penalty between adjacent superpixels is crude. In this paper, a probability field is constructed to jointly guide the label field and improve the classification accuracy. The probability field is based on the probability output of the CNN. The superpixel probability distribution is obtained by averaging the probabilities of the pixels in the same superpixel.

As in the MRF model based on the intensity field, the energy function for the probability field has two parts. The unary part quantifies how likely a superpixel is to belong to each class, and its energy function is defined as:

(6) $UP_{1} = \sum_{s_{i} \in S} \left\{ - \ln \left\{ \frac{1}{N} \sum_{p \in s_{i}} P_{cnn} (l_{p} = Label_{i}) \right\} \right\}$

$N$ is the total number of pixels within the superpixel $s_{i}$, $p$ is a pixel in the superpixel $s_{i}$, $S$ is the set of all superpixels in the input SAR image, and $P_{cnn} (l_{p} = Label_{i})$ is the probability that pixel $p$ belongs to $Label_{i}$. $UP_{1}$ measures the confidence that the superpixels belong to their labels. Like the spatial context model, the binary part describes the relationship between regions. Its energy function is:

(7) $UP_{2} = \beta \sum_{\langle x_{k}, x_{l} \rangle \subset e} L (x_{k}, x_{l})$

(8) $L (x_{k}, x_{l}) = P_{s_{k}} \odot P_{s_{l}}$

$L (x_{k}, x_{l})$ is the inner product of the probability distributions of superpixels $k$ and $l$. $P_{s_{l}}$ is the probability distribution of superpixel $l$ over all labels and is written as:

(9) $P_{s_{l}} = \{P_{1}, P_{2}, P_{3}, P_{4}, \dots, P_{j}\}$

where $P_{1}, \dots, P_{j}$ are the probabilities of superpixel $l$ for each label, and $j$ is the total number of categories. The inner product can be computed directly from the output of the softmax layer in the CNN and is positively correlated with the similarity of adjacent superpixels. Specifically, the value of $L ()$ increases when the regions are similar to each other. $\beta$ is the balance coefficient and takes the same value as in the intensity field. The energy function for the probability field is as follows:

(10) $U P = U P_{1} + U P_{2}$
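The two parts of the probability field in Eqs. (6) to (10) can be sketched as below. This is an illustration only: class labels are assumed to be integer indices into the probability vectors, a small floor inside the logarithm is added to avoid `log(0)` (an assumption, not part of Eq. (6)), and the names `up1`, `inner`, and `up2` and all numbers are hypothetical.

```python
import math

def up1(pixel_probs_by_sp, labels):
    """Eq. (6): -ln of the mean CNN probability of each superpixel's label."""
    total = 0.0
    for s, probs in pixel_probs_by_sp.items():
        n = len(probs)
        mean_p = sum(p[labels[s]] for p in probs) / n
        total += -math.log(max(mean_p, 1e-12))  # floor is an assumption
    return total

def inner(pk, pl):
    """Eq. (8): inner product of two superpixel probability vectors."""
    return sum(a * b for a, b in zip(pk, pl))

def up2(sp_probs, edges, beta):
    """Eq. (7): beta times the sum of inner products over adjacent pairs."""
    return beta * sum(inner(sp_probs[k], sp_probs[l]) for k, l in edges)

# Two adjacent superpixels, two classes (labels are class indices).
pixel_probs_by_sp = {0: [[0.9, 0.1], [0.7, 0.3]], 1: [[0.2, 0.8]]}
labels = {0: 0, 1: 1}
sp_probs = {0: [0.8, 0.2], 1: [0.2, 0.8]}     # per-superpixel averages
up = up1(pixel_probs_by_sp, labels) + up2(sp_probs, [(0, 1)], beta=0.5)
```

Two identical one-hot distributions give an inner product of 1 and two disjoint ones give 0, which matches the statement that the value of $L ()$ grows with the similarity of the regions.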

We can obtain the total RMRF energy function for the two fields, which is shown below:

(11) $\begin{array}{rl} U_{total} & = UI + UP \\ & = \sum_{R_{i} \in R} \sum_{s_{i} \in R_{i}} \left\{ \frac{1}{2} \ln (2 \pi \sigma_{i}^{2}) + \frac{(s_{i} - \mu_{i})^{2}}{2 \sigma_{i}^{2}} \right\} + \sum_{s_{i} \in S} \left\{ - \ln \left\{ \frac{1}{N} \sum_{p \in s_{i}} P_{cnn} (l_{p} = Label_{i}) \right\} \right\} \\ & \quad + \beta \sum_{\langle x_{k}, x_{l} \rangle \subset e} \left\{ f (x_{k}, x_{l}) + L (x_{k}, x_{l}) \right\} \end{array}$

The motivation for constructing the probability field is to remedy the shortcomings of the MRF model by treating the superpixel probability distribution from the CNN as deep feature information within the framework of the probability field. The unary part gives a quantified value for the category to which the region belongs, and the binary part describes the spatial context relations between superpixels. The traditional region-level MRF model uses a binary rule for neighborhood relationships, in which the energy function equals 1 if the regions have the same label and 0 if they have different labels. The binary part of the probability field makes this rule soft and continuous. When the labels of adjacent regions differ, the energy function outputs a value that measures how different they are. When the labels are the same, the energy function likewise gives a graded measure of their similarity.

The initial RAG labels are generated by the CNN, and the label field is updated by minimizing the energy functions of both the intensity field and the probability field. The simulated annealing (SA) algorithm is used to obtain the final classification results.
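The text does not give the details of the annealing schedule, so the following is only a generic simulated annealing sketch over a label field, with a Metropolis acceptance rule and geometric cooling chosen as assumptions. The energy is a caller-supplied function, so the total energy of the two fields could be plugged in; here a made-up unary cost table (`toy_cost`) stands in for it. All names and parameter values are hypothetical.

```python
import math
import random

def simulated_annealing(labels, classes, energy, t0=1.0, cooling=0.95,
                        sweeps=50, seed=0):
    """Generic SA over node labels; returns the best labelling seen."""
    rng = random.Random(seed)
    current = dict(labels)
    e_cur = energy(current)
    best, e_best = dict(current), e_cur
    t = t0
    nodes = list(current)
    for _ in range(sweeps):
        for s in nodes:
            old = current[s]
            new = rng.choice(classes)
            if new == old:
                continue
            current[s] = new
            # Recomputing the full energy is simple but slow; a real
            # implementation would evaluate only the local change.
            e_new = energy(current)
            delta = e_new - e_cur
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                e_cur = e_new                     # accept the move
                if e_cur < e_best:
                    best, e_best = dict(current), e_cur
            else:
                current[s] = old                  # reject and restore
        t *= cooling                              # geometric cooling
    return best, e_best

# Stand-in energy: a made-up unary cost per superpixel and class.
toy_cost = {0: {"a": 0.1, "b": 2.0},
            1: {"a": 1.5, "b": 0.2},
            2: {"a": 1.0, "b": 0.3}}

def toy_energy(lab):
    return sum(toy_cost[s][lab[s]] for s in lab)

init = {0: "a", 1: "a", 2: "a"}
best, e_best = simulated_annealing(init, ["a", "b"], toy_energy)
```

Returning the best labelling seen, rather than the last one, guarantees that the result is never worse than the CNN-based initialization under the chosen energy.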

3. Experimental Study

In this section, experiments with synthetic and real SAR images are presented to evaluate the validity of the proposed classification method. The comparison methods are chosen from both pixel-level and region-level techniques. CNN and CNN-MRF [23] are the pixel-level methods and serve as baselines for the proposed method. The CNN + superpixel algorithm (CNN-SP) [24] and the region category confidence-degree-based Markov random field (RCC-MRF) method [25] are the region-level methods used to verify the performance of the probability field. First, the accuracy of the proposed method is explored using one synthetic SAR image and two real SAR images. Second, the capacity of each algorithm is assessed using a large, high-resolution TerraSAR-X image. Third, the robustness of each algorithm is examined by varying the number of superpixels.

3.1. Experimental Datasets

Four SAR datasets are employed to evaluate the performance of the proposed method: one synthetic SAR image, two real Radarsat-2 SAR images, and one TerraSAR-X image. The synthetic SAR image is used for a rough check of the classification capability of each algorithm; it contains eight categories and has a size of 486 × 486. The San Francisco Bay (SF-Bay) and Flevoland images are among the most common datasets for SAR image classification; both were captured by the Radarsat-2 satellite at a resolution of 10 m, and both contain five categories. Lillestroem is a large, high-resolution image acquired by the TerraSAR-X satellite. Information on the real image datasets [26,27] is given in Table 1.

The experimental images and the corresponding ground truth are shown in Figure 4. The categories of each image are described in the corresponding experimental section below.

3.2. Experimental Setup and Evaluation Criteria

To keep the experimental comparison fair, the CNN setup follows the design in RCC-MRF, and the corresponding parameters are listed in Table 2. In the CNN training process, the input image patch size is set to 27 × 27, 1000 pixels in each image are randomly chosen as training samples, and the remaining pixels are held out to evaluate the performance of the CNN. The number of superpixels in each image is also the same as in RCC-MRF: 3000 for the synthetic SAR image and 9000 each for the SF-Bay, Flevoland, and Lillestroem images. The overall accuracy (OA) and kappa coefficient ($κ$) are used to evaluate the classification results.
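The two evaluation criteria are not defined in the text; the sketch below assumes their standard definitions, with OA as the fraction of correctly classified samples and κ as Cohen's kappa computed from a confusion matrix. The function name `oa_and_kappa` and the toy matrix are illustrative and are not taken from the experiments.

```python
def oa_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    po = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: sum over classes of (row total * column total).
    pe = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n)
    ) / (total * total)
    kappa = (po - pe) / (1.0 - pe)
    return po, kappa

# Toy two-class example with 100 samples.
confusion = [[40, 10],
             [5, 45]]
oa, kappa = oa_and_kappa(confusion)
```

In this toy case 85 of 100 samples lie on the diagonal and the chance agreement is 0.5, so κ is noticeably lower than OA, which is why both values are reported together.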

3.3. Performance Analysis on Synthetic SAR Image

The synthetic SAR image has eight categories: four areas with curved boundaries on the upper left and four textured areas on the right side. Non-linear boundaries cause inaccurate boundary localization, and complex textures cause misclassification; both make accurate classification more difficult.

The accuracies of the five methods on the synthetic SAR image are shown in Table 3. The proposed method obtains the best classification results, with an OA of 95.12% and a $κ$ of 94.73%. The OA of RCC-MRF is 93.60%, and the OA of CNN-SP is 93.39%. CNN obtains the lowest OA, 90.74%.

The classification results are shown in Figure 5. In areas with complex texture, intra-class differences become large and differences between adjacent classes become small, which causes much misclassification. In the right half of the synthetic image, the CNN not only misplaces boundaries but also produces pixel-level misclassifications. The proposed method captures texture information through the structure of the superpixels, so it yields smooth interiors and regular edges in the textured areas. The curved boundary areas on the upper left are long and narrow, and the superpixels generated by SLIC do not fit the edges well. The oversegmentation of superpixels and speckle noise cause region-level misclassification in classes 1 to 4. CNN-MRF suffers much misclassification in the adjacent classes 6 and 7 because their textures are similar. The proposed method performs well in terms of regional consistency, and the initial misclassification is smoothed out by the structure of the superpixels.

3.4. Performance Analysis on Flevoland Image

The Flevoland image is a real dataset with five categories: Forest, Farmland 1, Farmland 2, Urban, and Water. To further examine the performance of the proposed method, the proposed method with 4000 superpixels is added to the comparison for real SAR image classification.

The accuracies of the algorithms on the Flevoland image are listed in Table 4. The proposed method outperforms the other algorithms and obtains an OA of 89.77% and a $κ$ of 86.03%. The OA of RCC-MRF is quite close to that of the proposed method at 89.44%. CNN-SP and CNN-MRF obtain OAs of 88.90% and 88.55%, respectively. As expected, CNN has the lowest OA, 85.52%.

The classification result maps for all algorithms are shown in Figure 6. By generating superpixels, a large number of pixel-level misclassifications in the class Forest are smoothed out. In the class Urban, the texture is complex and the pixel-level algorithms cannot recognize the terrain clearly, as shown in Figure 6a,b. In Figure 6e, establishing spatial context information between adjacent areas corrects region-level misclassification and gives better performance and accuracy than RCC-MRF. It should be mentioned that region-level misclassifications increase for the class Farmland 2. These errors are caused by the structure of the superpixels and the poor prior knowledge from the CNN, whose accuracy for this class is 35.81%. As a result, all region-level algorithms perform poorly on this class. Compared with CNN-SP and RCC-MRF, the proposed method still obtains a higher accuracy for Farmland 2 owing to the improvement from the probability field.

3.5. Performance Analysis on San Francisco Bay Image

The real SF-Bay dataset is used to evaluate the performance of the algorithms on five categories: Build-up 1, Build-up 2, Build-up 3, Water, and Vegetation.

The accuracies of the five algorithms on the SF-Bay image are shown in Table 5. The proposed method has the highest OA, 87.69%. The classification performance of RCC-MRF is better than those of the pixel-level algorithms and CNN-SP.

The classification result maps for all explored algorithms are shown in Figure 7. In Figure 7a,b, pixel-level misclassifications are scattered across every category, and the regions of Build-up 1, Build-up 2, and Build-up 3 contain most of the errors. Figure 7c–e shows that region-level smoothing improves the result maps, and many pixel-level errors vanish. In particular, for the category Build-up 3 at the bottom-right corner, the proposed method reaches the highest classification accuracy, 8.62 percentage points above RCC-MRF, which demonstrates the benefit of the probability field.

3.6. Performance Analysis on Lillestroem Image

The Lillestroem image is a real dataset captured by the TerraSAR-X satellite and has five categories: River, Forest, Grassland, Building, and Road. Compared with the Flevoland and SF-Bay images, the Lillestroem image has two main characteristics. First, it is quite large, with a size of 3580 × 2250, and its resolution is higher than that of the Radarsat-2 datasets, which poses a challenge for the algorithms. Second, the Building and Road areas are narrow and slender, which can cause superpixel segmentation to fail.

The classification results on the Lillestroem image are listed in Table 6. The proposed method achieves the highest overall accuracy among the explored algorithms. In the categories River, Forest, and Grassland, the proposed method outperforms the others. In the category Building, however, the proposed method fails to improve the accuracy, and the result is even worse for Road. The classification fails for two reasons: the initialization from the CNN is ineffective (the CNN accuracy is 22.02% for Building and 4.82% for Road), and the shape of the superpixels is not suitable for narrow land cover. In the classification result map of RCC-MRF, the Building label spreads into other regions, which indicates that its improvement comes from overdetection at the expense of accuracy in other regions. Although the proposed method does not reach the highest accuracy for Building, the corresponding label stays at the bottom-right corner, which indicates that overdetection does not occur. In addition, RCC-MRF produces a number of region-level misclassifications, which give it the worst performance for the class Forest. The results of the proposed method show some advantages that are visible to the naked eye: classification within regions remains consistent, and boundaries between regions are relatively smooth.

The classification result maps are shown in Figure 8. In Figure 8d, RCC-MRF exhibits much region-level misclassification, especially in the category Forest. Focusing on the same field of view in Figure 8e, region-level mistakes are suppressed by the proposed method, and edges between categories become distinct to some degree.

3.7. Demonstrating the Effect of the Probability Field

The RCC-MRF method generates an RCC term to complement the distribution of the Gaussian mixture model. However, it discards the vector form of the probabilities and does not consider the correlations between adjacent regions. To further verify the capacity of the proposed method, RCC-MRF and the proposed method are compared while the number of superpixels is increased from 2000 to 30,000. The comparison results on the SF-Bay image are shown in Figure 9. Both RCC-MRF and the proposed method reach their highest OA at 9000 superpixels. As the number of superpixels increases further, the accuracy of RCC-MRF drops sharply, whereas that of the proposed method changes only gently. The main computation cost for both methods lies in the superpixel generation process. When the number of superpixels is not suitable, the segmentation accuracy declines, which increases the classification errors. RCC-MRF is more sensitive to the number of superpixels, so choosing the right number requires repeated testing. The proposed method, by contrast, is fairly robust to the number of superpixels, which makes it easier to balance computation cost and classification accuracy.

4. Conclusions

In this paper, a novel classification method that combines Markov random fields with deep learning is proposed for SAR images. In this method, a probability field based on a neural network is proposed to describe the relationships between regions. By constructing unary and binary parts based on probabilities, the contextual information is fully considered. The energy function is built from both the intensity field and the probability field, which allows a better initialization of the MRF. Experiments on real SAR datasets with several evaluation indicators are performed, and the proposed framework is confirmed to provide better robustness and accuracy than existing methods.

Author Contributions

Conceptualization, X.Y. (Xiangyu Yang) and X.Y. (Xuezhi Yang); methodology, X.Y. (Xiangyu Yang); software, X.Y. (Xiangyu Yang) and J.W.; validation, X.Y. (Xiangyu Yang) and C.Z.; formal analysis, X.Y. (Xiangyu Yang) and C.Z.; writing-original draft preparation, X.Y. (Xiangyu Yang); writing-review and editing X.Y. (Xiangyu Yang), X.Y. (Xuezhi Yang), C.Z. and J.W. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Figures and Tables

Figure 1. Flowchart of the proposed approach. a, b, c and d are the four adjacent superpixel regions connecting based on the rule of shared boundaries. The grey arrow describes the process of constructing intensity field; the blue arrow describes the process of constructing label field; the purple arrow describes the process of constructing probability field.

Figure 2. Illustration of an RAG.

Figure 3. The structure of a convolutional neural network.

Figure 4. SAR images and the corresponding ground truth. (a) The synthetic SAR image. (b) Ground truth for the synthetic SAR image. (c) The Flevoland image. (d) Ground truth for the Flevoland image. (e) The SF-Bay image. (f) Ground truth for the SF-Bay image. (g) The Lillestroem image. (h) Ground truth for the Lillestroem image.

Figure 5. The classification results on the synthetic SAR image. (a) CNN. (b) CNN-MRF. (c) CNN-SP. (d) RCC-MRF. (e) Proposed method. (f) Ground truth.

Figure 6. The classification results on the Flevoland image. (a) CNN. (b) CNN-MRF. (c) CNN-SP. (d) RCC-MRF. (e) Proposed method. (f) Ground truth.

Figure 7. The classification results on the SF-Bay image. (a) CNN. (b) CNN-MRF. (c) CNN-SP. (d) RCC-MRF. (e) Proposed method. (f) Ground truth.

Figure 8. The classification results on the Lillestroem image. (a) CNN. (b) CNN-MRF. (c) CNN-SP. (d) RCC-MRF. (e) Proposed method. (f) Ground truth.

Figure 9. The classification results of RCC-MRF and the proposed method on the SF-Bay image.

Table 1

Geographic information of real images.

Image	San Francisco Bay	Flevoland	Lillestroem
Sensor	Radarsat-2 SAR	Radarsat-2 SAR	TerraSAR-X
Polarization	HH	HH	HH
Resolution (m)	10	10	0.38
Date	2015	2017	2013
Size	1101 × 1161	1000 × 1400	3580 × 2250
Coordinates	37°47′N, 122°28′W	52°22′N, 5°27′E	37°47′N, 118°54′E
Total Categories	five	five	five

Table 2

The parameters for the CNN.

Layer	Strategy	Kernel Size	Feature Maps
The first convolution layer	RELU	4 × 4 × 20	24 × 24 × 20
The first pooling layer	max-pooling		12 × 12 × 20
The second convolution layer	RELU	5 × 5 × 20	8 × 8 × 20
The second pooling layer	max-pooling		4 × 4 × 20
The fully connected layer	softmax		320 × 1
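The feature-map sizes in Table 2 can be checked with a little arithmetic. The sketch below assumes unpadded ("valid") convolutions with stride 1 and non-overlapping 2 × 2 max-pooling; these settings are not stated in the table but reproduce its numbers from the 27 × 27 input patch. The helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded convolution (assumed stride 1)."""
    return (size - kernel) // stride + 1

def pool_out(size, window=2):
    """Output width of non-overlapping pooling (assumed 2 x 2 window)."""
    return size // window

patch = 27                 # input patch width from the experimental setup
c1 = conv_out(patch, 4)    # first convolution layer, 4 x 4 kernels
p1 = pool_out(c1)          # first pooling layer
c2 = conv_out(p1, 5)       # second convolution layer, 5 x 5 kernels
p2 = pool_out(c2)          # second pooling layer
flat = p2 * p2 * 20        # 20 feature maps flattened for the last layer
```

Under these assumptions the widths are 24, 12, 8, and 4, and the flattened vector has 320 entries, in agreement with the table.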

Table 3

Classification accuracies on the synthetic SAR image.

Class	CNN	CNN-MRF	CNN-SP	RCC-MRF	Proposed
Class1	92.98%	97.95%	91.31%	91.20%	90.98%
Class2	97.43%	99.20%	95.79%	93.62%	96.30%
Class3	96.17%	97.86%	93.63%	93.02%	93.51%
Class4	96.43%	98.21%	96.26%	96.65%	96.66%
Class5	96.82%	91.84%	97.43%	96.47%	97.67%
Class6	70.86%	87.89%	87.50%	88.99%	93.96%
Class7	84.21%	78.13%	91.17%	93.23%	95.81%
Class8	94.91%	94.44%	97.02%	98.25%	98.94%
OA	90.74%	93.01%	93.39%	93.60%	95.12%
κ	89.25%	92.01%	92.42%	93.05%	94.73%
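
The OA and κ rows in these tables are conventionally derived from a confusion matrix. The sketch below shows the standard definitions of overall accuracy and Cohen's kappa; the function name and the 2 × 2 matrix are hypothetical, since the confusion matrices themselves are not given here.

```python
from typing import Sequence, Tuple


def overall_accuracy_and_kappa(confusion: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (OA, kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    Both values are returned as fractions in [0, 1] (kappa may be negative).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class example (not data from the paper).
toy = [[45, 5],
       [10, 40]]
oa, kappa = overall_accuracy_and_kappa(toy)
print(f"OA = {oa:.2%}, kappa = {kappa:.2%}")
```

The tables report κ as a percentage, so the fraction returned here would be multiplied by 100 for comparison.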

Table 4

Classification accuracies on the Flevoland image.

Class	CNN	CNN-MRF	CNN-SP	RCC-MRF	Proposed
Forest	88.76%	95.18%	96.86%	97.83%	97.59%
Farmland1	97.36%	97.36%	97.36%	98.13%	97.31%
Farmland2	35.81%	38.82%	26.43%	24.22%	31.73%
Urban	82.59%	84.54%	94.96%	99.07%	95.92%
Water	94.58%	97.26%	99.24%	99.59%	99.29%
OA	85.52%	88.55%	88.90%	89.56%	89.77%
κ	80.40%	84.41%	85.97%	85.90%	86.03%

Table 5

Classification accuracies on the SF-Bay image.

Class	CNN	CNN-MRF	CNN-SP	RCC-MRF	Proposed
Build-up 1	83.52%	89.09%	91.44%	92.32%	94.40%
Build-up 2	78.56%	85.65%	84.29%	90.00%	87.92%
Water	91.62%	93.34%	94.67%	95.56%	95.58%
Vegetation	81.85%	89.30%	85.97%	88.34%	88.20%
Build-up 3	52.42%	42.35%	62.36%	60.06%	68.68%
OA	78.69%	82.19%	84.62%	86.45%	87.69%
κ	73.00%	77.27%	80.50%	82.78%	84.38%

Table 6

Classification accuracies on the Lillestroem image.

Class	CNN	CNN-MRF	CNN-SP	RCC-MRF	Proposed
River	37.88%	39.85%	43.84%	48.96%	53.96%
Forest	77.09%	81.97%	81.04%	75.09%	85.08%
Grassland	66.30%	75.93%	70.52%	73.59%	77.36%
Building	22.02%	0.00%	8.59%	29.53%	22.10%
Road	4.82%	1.73%	0.00%	0.00%	0.00%
OA	59.97%	64.51%	63.30%	64.76%	70.08%
κ	41.06%	46.56%	47.20%	48.29%	56.63%

References

1. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A tutorial on synthetic aperture radar. IEEE Geosci. Remote Sens. Mag.; 2013; 1, pp. 6-43.

2. Simard, M.; DeGrandi, G.; Thomson, K.P.; Benie, G.B. Analysis of speckle noise contribution on wavelet decomposition of SAR images. IEEE Trans. Geosci. Remote Sens.; 1998; 36, pp. 1953-1962. [DOI: https://dx.doi.org/10.1109/36.729367]

3. Melgani, F.; Serpico, S.B. A Markov random field approach to spatiotemporal contextual image classification. IEEE Trans. Geosci. Remote Sens.; 2003; 41, pp. 2478-2487. [DOI: https://dx.doi.org/10.1109/TGRS.2003.817269]

4. Trianni, G.; Gamba, P. Boundary-adaptive MRF classification of optical very high resolution images. Proceedings of the 2007 IEEE International Geoscience and Remote Sensing Symposium; Barcelona, Spain, 23–28 July 2007; IEEE: Piscataway, NJ, USA, 2007.

5. Deng, H.; Clausi, D.A. Gaussian MRF rotation-invariant features for image classification. IEEE Trans. Pattern Anal. Mach. Intell.; 2004; 26, pp. 951-955. [DOI: https://dx.doi.org/10.1109/TPAMI.2004.30]

6. Wu, Y.; Ji, K.; Yu, W.; Su, Y. Region-Based Classification of Polarimetric SAR Images Using Wishart MRF. IEEE Geosci. Remote Sens. Lett.; 2008; 5, pp. 668-672. [DOI: https://dx.doi.org/10.1109/LGRS.2008.2002263]

7. Li, W.; Prasad, S.; Fowler, J.E. Hyperspectral Image Classification Using Gaussian Mixture Models and Markov Random Fields. IEEE Geosci. Remote Sens. Lett.; 2014; 11, pp. 153-157. [DOI: https://dx.doi.org/10.1109/LGRS.2013.2250905]

8. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Boston, MA, USA, 7–12 June 2015; IEEE: Piscataway, NJ, USA, 2015.

9. Postadjian, T.; Le Bris, A.; Mallet, C.; Sahbi, H. Superpixel partitioning of very high resolution satellite images for large-scale classification perspectives with deep convolutional neural networks. Proceedings of the IGARSS 2018–2018 IEEE International Geoscience and Remote Sensing Symposium; Valencia, Spain, 22–27 July 2018; IEEE: Piscataway, NJ, USA, 2018.

10. Chapelle, O.; Haffner, P.; Vapnik, V.N. Support vector machines for histogram-based image classification. IEEE Trans. Neural Netw.; 1999; 10, pp. 1055-1064.

11. McNairn, H.; Kross, A.; Lapen, D.; Caves, R.; Shang, J. Early season monitoring of corn and soybeans with TerraSAR-X and RADARSAT-2. Int. J. Appl. Earth Obs. Geoinf.; 2014; 28, pp. 252-259.

12. Ressel, R.; Frost, A.; Lehner, S. A neural network-based classification for sea ice types on X-band SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.; 2015; 8, pp. 3672-3680.

13. Zhang, Z.; Wang, H.; Xu, F.; Jin, Y.-Q. Complex-valued convolutional neural network and its application in polarimetric SAR image classification. IEEE Trans. Geosci. Remote Sens.; 2017; 55, pp. 7177-7188. [DOI: https://dx.doi.org/10.1109/TGRS.2017.2743222]

14. Zhou, Y.; Wang, H.; Xu, F.; Jin, Y.-Q. Polarimetric SAR Image Classification Using Deep Convolutional Neural Networks. IEEE Geosci. Remote Sens. Lett.; 2016; 13, pp. 1935-1939. [DOI: https://dx.doi.org/10.1109/LGRS.2016.2618840]

15. Wang, N.; Wang, Y.; Liu, H.; Zuo, Q.; He, J. Feature-Fused SAR Target Discrimination Using Multiple Convolutional Neural Networks. IEEE Geosci. Remote Sens. Lett.; 2017; 14, pp. 1695-1699. [DOI: https://dx.doi.org/10.1109/LGRS.2017.2729159]

16. Zhao, W.; Du, S. Spectral–spatial feature extraction for hyperspectral image classification: A dimension reduction and deep learning approach. IEEE Trans. Geosci. Remote Sens.; 2016; 54, pp. 4544-4554. [DOI: https://dx.doi.org/10.1109/TGRS.2016.2543748]

17. Shang, R.; He, J.; Wang, J.; Xu, K.; Jiao, L.; Stolkin, R. Dense connection and depthwise separable convolution based CNN for polarimetric SAR image classification. Knowl.-Based Syst.; 2020; 194, 105542. [DOI: https://dx.doi.org/10.1016/j.knosys.2020.105542]

18. Ding, Y.; Zhang, Z.; Zhao, X.; Hong, D.; Cai, W.; Yu, C.; Yang, N.; Cai, W. Multi-feature fusion: Graph neural network and CNN combining for hyperspectral image classification. Neurocomputing; 2022; 501, pp. 246-257. [DOI: https://dx.doi.org/10.1016/j.neucom.2022.06.031]

19. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell.; 2012; 34, pp. 2274-2282. [DOI: https://dx.doi.org/10.1109/TPAMI.2012.120] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/22641706]

20. Sarkar, A.; Biswas, M.K.; Sharma, K.M.S. A simple unsupervised MRF model based image segmentation approach. IEEE Trans. Image Process.; 2000; 9, pp. 801-812.

21. Yang, X.; Clausi, D.A. SAR sea ice image segmentation using an edge-preserving region-based MRF. Proceedings of the 2009 16th IEEE International Conference on Image Processing (ICIP); Cairo, Egypt, 7–10 November 2009; IEEE: Piscataway, NJ, USA, 2009.

22. Zheng, C.; Hu, Y.; Wang, L.; Qin, Q. Region-based MRF model with optimized initial regions for image segmentation. Proceedings of the 2011 International Conference on Remote Sensing, Environment and Transportation Engineering; Nanjing, China, 24–26 June 2011; IEEE: Piscataway, NJ, USA, 2011.

23. Bi, H.; Yao, J.; Wei, Z.; Hong, D.; Chanussot, J. PolSAR Image Classification Based on Robust Low-Rank Feature Extraction and Markov Random Field. IEEE Geosci. Remote Sens. Lett.; 2020; 19, pp. 1-5. [DOI: https://dx.doi.org/10.1109/LGRS.2020.3034700]

24. Duan, Y.; Liu, F.; Jiao, L.; Zhao, P.; Zhang, L. SAR image segmentation based on convolutional-wavelet neural network and Markov random field. Pattern Recognit.; 2017; 64, pp. 255-267. [DOI: https://dx.doi.org/10.1016/j.patcog.2016.11.015]

25. Zhang, A.; Yang, X.; Fang, S.; Ai, J. Region level SAR image classification using deep features and spatial constraints. ISPRS J. Photogramm. Remote Sens.; 2020; 163, pp. 36-48. [DOI: https://dx.doi.org/10.1016/j.isprsjprs.2020.03.001]

26. Samat, A.; Gamba, P.; Du, P.; Luo, J. Active extreme learning machines for quad-polarimetric SAR imagery classification. Int. J. Appl. Earth Obs. Geoinf.; 2015; 35, pp. 305-319. [DOI: https://dx.doi.org/10.1016/j.jag.2014.09.019]

27. Geng, J.; Fan, J.; Wang, H.; Ma, X.; Li, B.; Chen, F. High-Resolution SAR Image Classification via Deep Convolutional Autoencoders. IEEE Geosci. Remote Sens. Lett.; 2015; 12, pp. 2351-2355. [DOI: https://dx.doi.org/10.1109/LGRS.2015.2478256]

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Abstract

Classification algorithms integrated with convolutional neural networks (CNN) display high accuracies in synthetic aperture radar (SAR) image classification. However, their consideration of spatial information is not comprehensive and effective, which causes poor performance in edges and complex regions. This paper proposes a Markov random field (MRF)-based algorithm for SAR image classification which fully considers the spatial constraints between superpixel regions. Firstly, the initialization of region labels is obtained by the CNN. Secondly, a probability field is constructed to improve the distribution of spatial relationships between adjacent superpixels. Thirdly, a novel region-level MRF is employed to classify the superpixels, which combines the intensity field and probability field in one framework. In our algorithm, the generation of superpixels reduces the misclassification at the pixel level, and region-level misclassification is rectified by the improvement of spatial description. Experimental results on simulated and real SAR images confirm the efficacy of the proposed algorithm for classification.
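
To make the region-level idea concrete, the following is a minimal sketch of label refinement over a superpixel adjacency graph. It is not the energy function of the proposed method, which combines an intensity field and a probability field; instead it assumes a generic Potts smoothness prior optimized with iterated conditional modes (ICM). The function name `icm_region_labels`, the weight `beta`, and the toy probabilities are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def icm_region_labels(
    probs: Sequence[Sequence[float]],   # per-region class probabilities, e.g. CNN softmax
    neighbors: Dict[int, List[int]],    # adjacency between superpixel regions
    beta: float = 1.0,                  # weight of the (assumed) Potts smoothness term
    max_iters: int = 10,
) -> List[int]:
    """Refine region labels by ICM with a unary log-probability and a Potts prior."""
    n_classes = len(probs[0])
    eps = 1e-12  # avoids log(0)

    # Initialization: most probable class per region (the CNN-only result).
    labels = [max(range(n_classes), key=lambda k: p[k]) for p in probs]

    for _ in range(max_iters):
        changed = False
        for r, p in enumerate(probs):
            def energy(k: int) -> float:
                unary = -math.log(p[k] + eps)
                # Potts prior: pay beta for each neighbor with a different label.
                pairwise = beta * sum(1 for q in neighbors.get(r, []) if labels[q] != k)
                return unary + pairwise

            best = min(range(n_classes), key=energy)
            if best != labels[r]:
                labels[r] = best
                changed = True
        if not changed:
            break
    return labels


# Hypothetical chain of four regions: 0 - 1 - 2 - 3.
toy_probs = [[0.9, 0.1], [0.45, 0.55], [0.9, 0.1], [0.8, 0.2]]
toy_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
refined = icm_region_labels(toy_probs, toy_neighbors, beta=1.0)
unrefined = icm_region_labels(toy_probs, toy_neighbors, beta=0.0)
print(unrefined, refined)
```

In this toy case, region 1 is weakly assigned to class 1 by its probabilities alone, and the spatial term flips it to agree with its two confidently labeled neighbors. Setting `beta` to zero recovers the initial per-region decision.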

Details

Title

SAR Image Classification Using Markov Random Fields with Deep Learning

Author

Yang, Xiangyu¹; Yang, Xuezhi¹; Zhang, Chunju²; Wang, Jun³

¹ Anhui Province Key Laboratory of Industry Safety and Emergency Technology, School of Computer and Information, Hefei University of Technology, Hefei 230009, China
² College of Civil Engineering, Hefei University of Technology, Hefei 230009, China
³ School of Mechanical Engineering, Quzhou University, Quzhou 324000, China

First page

617

Publication year

2023

Publication date

2023

Publisher

MDPI AG

e-ISSN

2072-4292

Source type

Scholarly Journal

Language of publication

English

DOI

https://doi.org/10.3390/rs15030617

ProQuest document ID

2774971058
