1. Introduction
Retinal vessels play an important role in many fields, especially clinical medicine, where images of retinal vessels are easy to acquire and are particularly valuable for diagnosing certain diseases. Retinal vessel morphology and density provide timely diagnostic cues to the doctor: diseases such as diabetes and cardiovascular disease [1,2] alter the retinal vasculature and can cause visual impairment in mild cases and blindness in severe ones, so the precise segmentation of retinal vessels is crucial. In the early years, retinal vessel segmentation was performed manually, a relatively backward approach that is not only time-consuming and labor-intensive, resulting in low efficiency, but also yields unsatisfactory segmentation results. With the development of medical imaging, more and more image segmentation methods have been applied to retinal vessels and have achieved relatively good results. Accurate retinal vessel segmentation is therefore particularly important for blindness prevention [3].
In the early stages of medical image segmentation, the development of Convolutional Neural Networks (CNNs) [4] brought good results. CNNs learn features automatically through multi-layer structures and can learn features at multiple levels. Long et al. [5] proposed the FCN architecture, which classifies images at the pixel level. FCN replaces the fully connected layers in CNNs with convolutional layers, solving segmentation at the semantic level by classifying each pixel. On this basis, Ronneberger et al. [6] proposed a u-shaped encoder–decoder network called U-Net. Skip connections between the encoder and decoder capture better context information and global features, and four pooling layers enable multi-scale recognition of image features. Because of U-Net's excellent performance, most medical image segmentation networks are improvements of it. Zhou et al. [7] proposed U-Net++, which addresses the inefficiency of finding U-Net's optimal depth through large numbers of experiments and the restriction of feature fusion to the same scale; it determines the optimal depth through deep supervision, flexibly aggregates features at more scales in the skip connections, and improves efficiency through pruning, achieving a better segmentation effect. Alom et al. [8] proposed the R2U-Net model for medical image segmentation, adding recurrent residual convolutional blocks to U-Net, which lets the network deepen while avoiding vanishing gradients and capturing features better, thereby improving segmentation accuracy. The Attention U-Net proposed by Oktay et al. [9] incorporates an attention mechanism into the network to make the skip connections more selective, improving attention to the segmented region and boosting performance without increasing the amount of computation. To fuse features more effectively, Huang et al. [10] proposed UNet 3+, a multi-scale deeply supervised architecture that improves location awareness and enhances boundary segmentation with fewer parameters. The TransUNet architecture proposed by Chen et al. [11] applies the Transformer [12] to the U-Net encoder; the Transformer extracts global information better, compensating for CNNs' limited ability to capture it and thus improving feature extraction.
For retinal vessel segmentation, many improvements to U-Net and related networks have also been made. Jin et al. [13] proposed the DUNet structure, which replaces the original convolution layers with deformable convolutions, combines low-level and high-level features, and adapts to the size and shape of blood vessels to realize accurate segmentation. Hu et al. [14] proposed the S-UNet structure to avoid the loss of feature information caused by downsampling in U-Net; by connecting Mi-U-Net networks in series, it prevents the overfitting caused by small data volumes. Wang et al. [15] proposed FRNet, which introduces a FAF structure to efficiently combine features of different depths and reduce information loss. Yang et al. [16] proposed an improvement of U-Net that adds a second decoder: the two decoders segment thin and thick vessels, respectively, and a fusion network merges their outputs, achieving accurate segmentation of retinal vessels. Dong et al. [17] proposed a cascaded residual attention U-Net for retinal vessel segmentation, named CRAUNet, which uses a DropBlock-like regularization to greatly reduce overfitting and an MFCA module to explore and merge helpful information instead of using a direct skip connection. Yang et al. [18] proposed DCU-Net, which builds its feature extraction module from deformable convolutions and uses a residual channel attention module to improve transfer efficiency. Yang et al. [19] proposed RADCU-Net, a method based on residual attention and a dual-supervision cascaded U-Net, which enhances efficiency while improving the accuracy of retinal vessel segmentation.
Based on the above analysis, an improved retinal vessel segmentation method built on the U-Net network structure is proposed. It is shown to be effective on both the DRIVE and CHASE_DB1 datasets. The main contributions of this paper are as follows:
(1). A new network structure based on U-Net is proposed, which can segment retinal blood vessels accurately and efficiently.
(2). To address the partial feature loss caused by repeated convolutions, ResNest blocks replace the original convolution layers of the encoder in the main network, strengthening feature extraction so that image feature information is captured better.
(3). A novel DFB network structure is proposed to solve the mismatch between low-level and high-level features caused by the skip connections. It enables better fusion of low-level and high-level image features, achieving accurate vessel segmentation.
2. Materials and Methods
Aiming at the problem of retinal vessel segmentation, an improved segmentation network based on U-Net is proposed. This section describes in detail the framework and modules of the proposed network.
2.1. Network Structure
Figure 1 shows the overall structure of the proposed improved segmentation network based on U-Net. It is composed of two parts: the main U-shaped network and the multi-scale fusion block. To solve the problem of local feature information loss caused by repeated convolutions during feature extraction, the original convolution modules of the encoder are replaced by ResNest Blocks, enabling the encoder to extract image feature information better. To reduce the loss of image features from encoding to decoding caused by the skip connections, a DFB structure is proposed within the encoder–decoder framework to optimize the original skip connections and achieve effective multi-scale feature representation.
2.2. Feature Extraction
Accurately extracting the characteristics of retinal blood vessels is essential if the patient's condition is to be understood from vessel morphology. We believe that the loss of image feature information in the U-Net structure is mainly caused by the convolutions in the encoder, so our changes focus there. Since deep convolutional neural networks have achieved good results in image processing, the ResNest [20] module is used to replace the convolutional blocks of the encoder to extract image features better.
ResNet [21], one of the most successful CNN architectures, is widely used in computer vision. The line of work runs from ResNet to ResNeXt [22] and then to ResNest; as the most successful refinement of ResNet, ResNest performs well on downstream computer vision tasks. Its computational efficiency matches that of ResNet while offering a better speed–accuracy trade-off. ResNest also performs well relative to other networks of similar model complexity without introducing additional computational cost and can serve as a backbone for other tasks. The network structure is shown in Figure 2.
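To make the split-attention idea behind ResNest concrete, the following is a minimal PyTorch sketch of a ResNest-style block, assuming radix 2 and omitting the cardinality grouping, downsampling, and residual shortcut of the full design [20]; the names and default values here are illustrative, not the exact encoder configuration used in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SplitAttention(nn.Module):
    """Simplified split-attention block (radix branches, cardinality 1).

    Illustrative only: the full ResNest block adds cardinality groups,
    a residual shortcut, and specific norm placement (Zhang et al. [20]).
    Assumes `channels` is divisible by `radix`.
    """
    def __init__(self, channels, radix=2, reduction=4):
        super().__init__()
        self.radix = radix
        inter = max(channels // reduction, 8)
        # One grouped 3x3 conv emits radix * channels maps, split later.
        self.conv = nn.Conv2d(channels, channels * radix, 3,
                              padding=1, groups=radix, bias=False)
        self.bn = nn.BatchNorm2d(channels * radix)
        self.fc1 = nn.Conv2d(channels, inter, 1)
        self.fc2 = nn.Conv2d(inter, channels * radix, 1)

    def forward(self, x):
        b, c = x.shape[:2]
        splits = F.relu(self.bn(self.conv(x)))                   # (B, r*C, H, W)
        splits = splits.view(b, self.radix, c, *splits.shape[2:])
        gap = splits.sum(dim=1).mean(dim=(2, 3), keepdim=True)   # global context
        attn = self.fc2(F.relu(self.fc1(gap)))                   # (B, r*C, 1, 1)
        attn = attn.view(b, self.radix, c, 1, 1).softmax(dim=1)  # r-softmax
        return (attn * splits).sum(dim=1)                        # fused (B, C, H, W)
```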
2.3. Depthwise FCA Block
Because of the semantic gap at the skip connections of U-Net, contextual feature information is not processed properly, causing a mismatch when low-level and high-level image features are fused. Therefore, the network structure shown in Figure 3 is proposed to optimize the existing skip connection and realize multi-scale feature fusion.
The proposed structure uses depthwise separable convolution [23], a lightweight operation whose parameter count and computational cost are low compared with conventional convolution. In this paper, four depthwise separable convolutions are connected in parallel. Because overly large convolution kernels waste resources and computation, the kernel sizes of the four parallel branches are set to 1, 5, 9, and 13. After the parallel depthwise separable convolutions, stitching and clipping operations are performed, and the result is finally fed into the FCA Block [24] structure (see Figure 4) to better process the feature information of low-level images. The FCA Block is a frequency channel attention network that highlights the important channels of a multi-channel feature map and expresses feature information better. It also compensates for the insufficient feature information of existing channel attention methods by generalizing the global average pooling of channel attention to a more general two-dimensional discrete cosine transform (DCT) form. In this way, more frequency components are introduced to make full use of the information, and the image features produced by the encoder are fused into the decoder more effectively.
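The PyTorch sketch below illustrates this pipeline under stated simplifications: the four parallel branches and the concatenation are as described above, the "clipping" step is interpreted as a 1 × 1 reduction back to the input channel width, and the FCA step is approximated by its lowest DCT frequency, which reduces to global average pooling (the full FCA Block [24] weights channels with several 2D-DCT components). Class names and the reduction ratio are illustrative.

```python
import torch
import torch.nn as nn

class DepthwiseSeparable(nn.Module):
    """Depthwise convolution followed by a 1x1 pointwise convolution [23]."""
    def __init__(self, ch, k):
        super().__init__()
        self.dw = nn.Conv2d(ch, ch, k, padding=k // 2, groups=ch, bias=False)
        self.pw = nn.Conv2d(ch, ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(ch)

    def forward(self, x):
        return torch.relu(self.bn(self.pw(self.dw(x))))

class DFBSketch(nn.Module):
    """Sketch of the Depthwise FCA Block: parallel depthwise-separable
    convolutions (kernels 1, 5, 9, 13), concatenation, 1x1 reduction,
    then channel attention (lowest-frequency stand-in for FCA [24])."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.branches = nn.ModuleList(
            [DepthwiseSeparable(ch, k) for k in (1, 5, 9, 13)])
        self.reduce = nn.Conv2d(4 * ch, ch, 1, bias=False)  # "clipping" step
        self.fc = nn.Sequential(                            # channel attention
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch), nn.Sigmoid())

    def forward(self, x):
        y = self.reduce(torch.cat([b(x) for b in self.branches], dim=1))
        w = self.fc(y.mean(dim=(2, 3)))        # per-channel weights in (0, 1)
        return y * w[:, :, None, None]         # reweighted feature map
```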
2.4. Datasets
To fully assess the performance of the proposed network structure, the method is evaluated on two common datasets: DRIVE [25] and CHASE_DB1 [26] (see Table 1).
DRIVE: This dataset contains 40 fundus photographs in total, 20 used as the training set and 20 as the test set. Each image is 584 × 565 pixels with 3 channels and includes a circular 45° field of view (FOV) for performance evaluation.
CHASE_DB1: This dataset contains 28 fundus photographs of 14 children, 20 used as the training set and 8 as the test set. Each image is 999 × 960 pixels with 3 channels and includes a circular 30° field of view (FOV) for performance evaluation.
Figure 5 shows sample images from both datasets.
2.5. Image Preprocessing
Image preprocessing is an important step for training the proposed model well. Since the DRIVE and CHASE_DB1 datasets have different resolutions, the input images are standardized to 480 × 480 pixels.
Each channel of the retinal vessel image is then normalized; that is, the per-channel mean is subtracted and the result is divided by the standard deviation. On this basis, random flipping and random cropping are used to augment the data so that the proposed model trains better (see Figure 6). To demonstrate that the preprocessing is effective, we performed a quantitative comparison (see Table 2); the models used in the later comparison experiments are listed in Table 3.
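As one possible realization of this pipeline, the sketch below uses torchvision transforms; the per-channel statistics are placeholders that would in practice be computed from the training images, and the same geometric transforms would also need to be applied to the label masks.

```python
import torchvision.transforms as T

# Hypothetical per-channel statistics; compute these from the training set.
MEAN, STD = (0.5, 0.5, 0.5), (0.25, 0.25, 0.25)

train_transform = T.Compose([
    T.Resize((480, 480)),            # unify DRIVE / CHASE_DB1 resolutions
    T.RandomHorizontalFlip(p=0.5),   # random flip augmentation
    T.RandomCrop(480, padding=16),   # random crop augmentation
    T.ToTensor(),
    T.Normalize(MEAN, STD),          # subtract mean, divide by std
])
```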
3. Results
To better demonstrate the effect of the proposed U-Net-based improved network on retinal blood vessel image segmentation, comparative experiments and ablation experiments are used to prove its effectiveness.
This section first describes the relevant evaluation indicators, then tests the basic segmentation network structure and compares it with the proposed method.
3.1. Evaluation Indicators
In order to better highlight the effectiveness of the proposed model, some evaluation indicators, including accuracy, F1-score, sensitivity, specificity, and precision, were used to evaluate the segmentation ability of retinal vascular images.
Acc (Accuracy): This reflects the proportion of correctly classified blood vessels and background pixels to the total number of pixels (see Equation (1)).
$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}\tag{1}$$
Among them, TP and TN denote correctly segmented retinal vessel and background pixels, respectively, while FP and FN denote incorrectly segmented retinal vessel and background pixels, respectively.
F1-score: This measures the accuracy of a binary classification model, taking both precision and recall into account; it is the harmonic mean of precision and recall (see Equation (2)).
$$\mathrm{F1} = \frac{2 \times \mathrm{Pr} \times \mathrm{Se}}{\mathrm{Pr} + \mathrm{Se}}\tag{2}$$
Se (Sensitivity): Also known as true positive rate (TPR), this represents the proportion of retinal vessels correctly identified (see Equation (3)).
$$\mathrm{Se} = \frac{TP}{TP + FN}\tag{3}$$
Sp (Specificity): Also known as true negative rate (TNR), this represents the proportion of pixel points with correct background pixel classification to the total number of background pixels (see Equation (4)).
$$\mathrm{Sp} = \frac{TN}{TN + FP}\tag{4}$$
Pr (Precision): This represents the proportion of the number of correctly segmented retinal vessel pixels to the total number of segmented retinal vessel pixels (see Equation (5)).
$$\mathrm{Pr} = \frac{TP}{TP + FP}\tag{5}$$
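Equations (1)–(5) translate directly into code; the helper below is a straightforward transcription from pixel-level confusion-matrix counts (the function name is ours).

```python
def segmentation_metrics(tp: int, tn: int, fp: int, fn: int):
    """Compute the five evaluation indicators of Equations (1)-(5)."""
    acc = (tp + tn) / (tp + tn + fp + fn)   # accuracy, Equation (1)
    se = tp / (tp + fn)                     # sensitivity (TPR), Equation (3)
    sp = tn / (tn + fp)                     # specificity (TNR), Equation (4)
    pr = tp / (tp + fp)                     # precision, Equation (5)
    f1 = 2 * pr * se / (pr + se)            # harmonic mean, Equation (2)
    return acc, f1, se, sp, pr
```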
3.2. Experimental Setup
The proposed model is implemented in the PyTorch framework. The model was trained for 200 epochs using the SGD optimizer with a learning rate of 1 × 10−2, momentum of 0.9, and weight decay of 1 × 10−4. The batch size was set to 4. In addition, to speed up network training and testing, an Nvidia GeForce RTX5000 TI GPU was used for the above experiments (see Figure 7).
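These optimizer settings correspond to the following PyTorch configuration; the stand-in model is a placeholder for the proposed network.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 2, kernel_size=1)  # stand-in; substitute the real network

# Hyperparameters as reported in Section 3.2.
optimizer = torch.optim.SGD(model.parameters(),
                            lr=1e-2, momentum=0.9, weight_decay=1e-4)
EPOCHS, BATCH_SIZE = 200, 4
```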
For the loss function, the sum of cross-entropy loss and Dice loss was selected. Cross-entropy loss, a common loss function in semantic segmentation, not only captures differences in predicted probabilities but also measures the performance of different classifiers in finer detail.
3.2.1. Cross-Entropy Loss
Cross-entropy is an important concept in information theory whose main purpose is to measure the difference between two probability distributions. For image segmentation, the cross-entropy loss is the average cross-entropy over all pixels. It is defined as:
$$L_{CE} = -\frac{1}{|\Omega|}\sum_{i \in \Omega}\sum_{k=1}^{K} y_{i,k}\,\log \hat{y}_{i,k}\tag{6}$$

where $\Omega$ represents the pixel region of height $a$ and width $b$, so that $\Omega = \{1, \dots, a\} \times \{1, \dots, b\}$ and $|\Omega| = ab$; $K$ is the number of classes; $y_{i,k}$ is the ground-truth label of pixel $i$ for class $k$; and $\hat{y}_{i,k}$ is the corresponding predicted probability (see Equation (6)).
3.2.2. Dice Loss
$$L_{Dice} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}\tag{7}$$

where $X$ denotes the set of pixels labeled as vessel in the ground-truth segmentation and $Y$ denotes the set of pixels predicted as vessel by the model (see Equation (7)).
3.3. Ablation Experiments
To prove the effectiveness of each component, ablation experiments are used to study the contribution of each module to the segmented images. First, the basic U-Net network is tested; then, performance is analyzed after adding the ResNest Block and DFB structures. The results of the ablation experiments are shown in Table 4.
The ablation results show that, compared with the basic network model, our proposed model improves on all indicators, especially F1-score and Pr. In the table, √ means the module has been added, and × means it has not.
On the DRIVE dataset, F1-score improved from 0.8278 to 0.8687, and Pr improved from 0.7287 to 0.7863. When the DFB module was added, the F1-score, Sp, and Pr indicators all improved, which demonstrates the effectiveness of the proposed DFB module.
On the CHASE_DB1 dataset, the performance indicators likewise show that the proposed model works well. The key indicators F1-score and Pr improve the most, and the other indicators also outperform the basic network model.
From the ablation experiments on the two datasets, we can see that adding the DFB module improves all performance indicators except Acc and Se. Although Acc and Se decrease slightly after the DFB module is added, the overall performance of the proposed model improves, which shows its validity for retinal image segmentation.
3.4. Comparison Test with Other Models
To better highlight the performance of the proposed model, we conducted a series of experiments on the two public datasets, showing the retinal vessel segmentation results and the corresponding indicators. The results were compared with more advanced models through quantitative and qualitative analyses of the relevant indicators.
As shown in Table 5, on the DRIVE dataset the proposed improved network based on U-Net reaches an F1-score of 0.8687, an Sp of 0.9703, and a Pr of 0.7863, all higher than the scores of the other models. Compared with the best competing results, its Acc is 0.0080 lower and its Se is 0.0842 lower.
On the CHASE_DB1 dataset, as shown in Table 6, the proposed improved network based on U-Net again obtains relatively good results: an F1-score of 0.8358, an Sp of 0.9720, and a Pr of 0.7330, which are the highest among the compared models. Compared with the best competing results, its Acc is 0.0060 lower and its Se is 0.1207 lower.
Clearly, the proposed network achieves a strong segmentation effect on both datasets. In the comparison experiments, the F1-score, Sp, and Pr of the improved model based on U-Net are the highest on both the DRIVE and CHASE_DB1 datasets, while the changes in Acc and Se are less favorable. However, the Acc of the proposed model is only marginally lower than that of other advanced models, so the overall improvement remains effective.
The Se result is not ideal and even decreases slightly in the ablation and comparison experiments above, which may be caused by an excessive proportion of noisy background pixels. High background noise can lead to vessel pixels being missegmented as background, increasing FN in Equation (3) and thus decreasing Se. In addition, other noise introduced during acquisition may be classified as background, reducing the measured proportion of correctly identified retinal vessel pixels, which may also contribute to the slight decrease in the Se indicator.
To verify that this inference about the decreased Se indicator is correct, we apply morphological opening and closing operations to the images to remove background-pixel noise and analyze the influence of noise on the Se indicator.
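A minimal OpenCV sketch of this denoising step is shown below; the kernel shape and size are assumptions, as the paper does not report them.

```python
import cv2

def denoise_mask(mask, ksize=3):
    """Morphological opening then closing on a binary mask (uint8, 0/255).

    Opening removes isolated background speckles; closing fills small
    gaps. The 3x3 elliptical kernel is an illustrative choice.
    """
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (ksize, ksize))
    opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    return cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)
```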
The analysis of the Se indicator in Table 7 shows that Se rises after denoising, which supports the inference above. On the whole, the proposed model has better segmentation performance.
We did the same for Acc and found that the denoised Acc metric also increased. The results are presented in Table 8.
3.5. Qualitative Analysis
To visualize the effect of retinal vessel segmentation, the performance of the proposed model was analyzed qualitatively. Samples were selected from the DRIVE and CHASE_DB1 datasets as experimental objects, and the resulting segmentation maps are shown in Figure 8.
Figure 8 clearly shows the retinal vessel segmentation produced by the proposed model. Because the light intensity varies in the original images, segmentation in darker regions of the retinal vessels is not ideal, and some regions are over-segmented. Dense vascular areas also lead to poorer segmentation of small, thin vessels. Nevertheless, based on the qualitative and quantitative results, the proposed model delivers strong segmentation performance for retinal blood vessels.
For comparison, we also show predicted images from other models (Figure 9).
4. Discussion
Retinal vessel segmentation remains a great challenge, and segmenting retinal vessels accurately is very meaningful for clinical diagnosis. The difficulties include handling the light intensity of retinal vessel images and their contrast with the background, as well as the varying thickness and density of the vessels, in addition to limits on the accuracy of current image segmentation methods and techniques.
Although various variants of the U-Net network have achieved good results in medical image segmentation in recent years, they still suffer from information loss caused by repeated convolutions when the encoder extracts feature maps and from the mismatch between high-level and low-level features caused by the skip connections. To solve these problems, the convolutional layers of the encoder are replaced by the ResNest Block structure, strengthening feature extraction as much as possible and reducing the loss of feature information. On this basis, a new DFB module is proposed, which strengthens the matching of high-level and low-level features and optimizes the original skip connections.
To better highlight the segmentation performance of the proposed model, this paper adopts quantitative and qualitative analyses and uses ablation experiments to prove its effectiveness. The quantitative comparison with more advanced models evaluates the proposed model on five important indicators; it performs better on the three key indicators F1-score, Sp, and Pr. The qualitative analysis shows that the segmented images are of good quality, and the ablation experiments verify the improvements contributed by the proposed modules.
However, retinal vessel image segmentation will continue to face many challenges, such as uneven image brightness and variations in vessel density and thickness, which remain the most difficult problems in the field. In future work, we hope to make up for the shortcomings in Acc and Se and to continue optimizing the proposed network model to achieve more accurate segmentation of retinal vessels.
5. Conclusions
In this paper, an improved model based on the U-Net network structure is proposed, which shows a good segmentation effect for retinal image segmentation. Comparisons with more advanced network structures reflect the effectiveness of the proposed model. The contributions of this article are as follows:
(1). In order to segment retinal vessels accurately, an improved structure based on U-Net is proposed, which helps patients understand their condition in time.
(2). For the problem of information loss caused by repeated convolutions, ResNest replaces the original convolution operations of the encoder in the main network, extracting retinal vascular features better and minimizing information loss.
(3). To solve the mismatch between high-level and low-level features caused by the skip connections, a novel DFB network structure is proposed, which achieves better contextual feature fusion and accurate vessel segmentation.
In future research, we will improve each indicator of retinal blood vessel segmentation and achieve more accurate results.
Author Contributions: Methodology, N.W.; software, K.L.; validation, G.Z., Z.Z. and P.W.; formal analysis, N.W. and G.Z.; investigation, N.W.; resources, N.W. and G.Z.; data curation, N.W.; writing—original draft preparation, N.W.; writing—review and editing, N.W., K.L. and G.Z.; visualization, K.L. and G.Z.; supervision, Z.Z. and P.W. All authors have read and agreed to the published version of the manuscript.
Not applicable.
Not applicable.
DRIVE dataset source:
The authors declare no conflict of interest.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Figure 5. (a) DRIVE dataset and (b) CHASE_DB1 dataset. From left to right: the original image, the corresponding ground truth, and the corresponding mask.
Figure 6. Examples of image preprocessing. (a) Original image. (b) Image graying and image normalization.
Figure 8. Retinal vessel segmentation sample images. (a) The original images; (b) the segmentation prediction maps of the proposed model; (c) the ground truth.
Figure 9. Predicted images from other models. The top row is the DRIVE dataset, and the bottom row is the CHASE_DB1 dataset.
Overview of the retinal vessel segmentation datasets.
Dataset | Number of Images | Training | Testing | Pixels | Epochs
---|---|---|---|---|---
DRIVE | 40 | 20 | 20 | 584 × 565 | 200
CHASE_DB1 | 28 | 20 | 8 | 999 × 960 | 200
Effect of preprocessing on experimental data.
Datasets | Preprocessing or Not | Acc | F1-Score | Se | Sp | Pr |
---|---|---|---|---|---|---|
DRIVE | No | 0.9395 | 0.8595 | 0.7531 | 0.9672 | 0.7734
DRIVE | Yes | 0.9403 | 0.8687 | 0.7380 | 0.9703 | 0.7863
CHASE_DB1 | No | 0.9429 | 0.8213 | 0.7231 | 0.9675 | 0.7135
CHASE_DB1 | Yes | 0.9504 | 0.8358 | 0.7413 | 0.9720 | 0.7330
Methods compared by the experiment.
Datasets | Comparison Methods
---|---
DRIVE and CHASE_DB1 | Iternet, SA_Unet, Res-Unet++, D-Unet, Att-Unet, Unet, Unet++, DCSAU_Net
Results of ablation experiments.
Datasets | ResNest Block | DFB | Acc | F1-Score | Se | Sp | Pr |
---|---|---|---|---|---|---|---|
DRIVE | × | × | 0.9334 | 0.8278 | 0.7663 | 0.9580 | 0.7287
DRIVE | √ | × | 0.9406 | 0.8649 | 0.7706 | 0.9669 | 0.7823
DRIVE | √ | √ | 0.9403 | 0.8687 | 0.7380 | 0.9703 | 0.7863
CHASE_DB1 | × | × | 0.9376 | 0.7770 | 0.7292 | 0.9594 | 0.6529
CHASE_DB1 | √ | × | 0.9510 | 0.8168 | 0.8063 | 0.9658 | 0.7076
CHASE_DB1 | √ | √ | 0.9504 | 0.8358 | 0.7413 | 0.9720 | 0.7330
Comparison of performance indicators on DRIVE dataset.
Method | Acc | F1-Score | Se (TPR) | Sp (TNR) | Pr |
---|---|---|---|---|---|
Iternet [27] | 0.9483 | 0.8660 | 0.8222 | 0.9668 | 0.7843
SA_Unet [28] | 0.9350 | 0.8342 | 0.7491 | 0.9616 | 0.7366
Res-Unet++ [29] | 0.9409 | 0.8575 | 0.7696 | 0.9662 | 0.7708
D-Unet [30] | 0.9422 | 0.8570 | 0.7906 | 0.9649 | 0.7708
Att-Unet | 0.9438 | 0.8475 | 0.8268 | 0.9611 | 0.7579
Unet | 0.9334 | 0.8278 | 0.7663 | 0.9580 | 0.7287
Unet++ | 0.9470 | 0.8641 | 0.8189 | 0.9660 | 0.7817
DCSAU_Net [31] | 0.9406 | 0.8584 | 0.7647 | 0.9666 | 0.7719
DFB-Unet | 0.9403 | 0.8687 | 0.7380 | 0.9703 | 0.7863
Comparison of performance indicators on the CHASE_DB1 dataset.
Method | Acc | F1-Score | Se (TPR) | Sp (TNR) | Pr |
---|---|---|---|---|---|
DCSAU_Net | 0.9466 | 0.8093 | 0.7812 | 0.9641 | 0.6973 |
Iternet | 0.9547 | 0.8293 | 0.8620 | 0.9648 | 0.7272 |
Res-Unet++ | 0.9456 | 0.8113 | 0.7508 | 0.9660 | 0.6993 |
D-Unet | 0.9527 | 0.8195 | 0.8538 | 0.9633 | 0.7130 |
Att-Unet | 0.9542 | 0.8250 | 0.8480 | 0.9660 | 0.7199 |
SA_Unet | 0.9362 | 0.7732 | 0.7543 | 0.9559 | 0.6491 |
Unet | 0.9376 | 0.7770 | 0.7292 | 0.9594 | 0.6529 |
Unet++ | 0.9564 | 0.8317 | 0.8518 | 0.9672 | 0.7296 |
DFB-Unet | 0.9504 | 0.8358 | 0.7413 | 0.9720 | 0.7330 |
Analysis of Se performance indicator by removing noise.
Datasets | Se before Noise Removal | Se after Noise Removal |
---|---|---|
DRIVE | 0.7380 | 0.7746 |
CHASE_DB1 | 0.7413 | 0.7846 |
Analysis of Acc performance indicator by removing noise.
Datasets | Acc before Noise Removal | Acc after Noise Removal |
---|---|---|
DRIVE | 0.9403 | 0.9426 |
CHASE_DB1 | 0.9504 | 0.9508 |
References
1. Feldman-Billard, S.; Larger, É.; Massin, P. Early worsening of diabetic retinopathy after rapid improvement of blood glucose control in patients with diabetes. Diabetes Metab.; 2018; 44, pp. 4-14. [DOI: https://dx.doi.org/10.1016/j.diabet.2017.10.014] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/29217386]
2. Cho, K.H.; Kim, C.K.; Woo, S.J.; Park, K.H.; Park, S.J. Cerebral small vessel disease in branch retinal artery occlusion. Investig. Ophthalmol. Vis. Sci.; 2016; 57, pp. 5818-5824. [DOI: https://dx.doi.org/10.1167/iovs.16-20106] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/27802487]
3. Bourne, R.R.; Stevens, G.A.; White, R.A.; Smith, J.L.; Flaxman, S.R.; Price, H.; Jonas, J.B.; Keeffe, J.; Leasher, J.; Naidoo, K. et al. Causes of vision loss worldwide, 1990–2010: A systematic analysis. Lancet Glob. Health; 2013; 1, pp. e339-e349. [DOI: https://dx.doi.org/10.1016/S2214-109X(13)70113-X]
4. O’Shea, K.; Nash, R. An Introduction to Convolutional Neural Networks. arXiv; 2015; arXiv: 1511.08458
5. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. arXiv; 2017; arXiv: 1411.4038
6. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2015; pp. 234-241.
7. Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. Unet++: A nested u-net architecture for medical image segmentation. Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support; Springer: Cham, Switzerland, 2018; pp. 3-11.
8. Alom, M.Z.; Hasan, M.; Yakopcic, C.; Taha, T.M.; Asari, V.K. Recurrent residual convolutional neural network based on u-net (r2u-net) for medical image segmentation. arXiv; 2018; arXiv: 1802.06955
9. Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B. et al. Attention u-net: Learning where to look for the pancreas. arXiv; 2018; arXiv: 1804.03999
10. Huang, H.; Lin, L.; Tong, R.; Hu, H.; Zhang, Q.; Iwamoto, Y.; Han, X.; Chen, Y.-W.; Wu, J. Unet 3+: A full-scale connected unet for medical image segmentation. Proceedings of the ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); Barcelona, Spain, 4–8 May 2020; pp. 1055-1059.
11. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. Transunet: Transformers make strong encoders for medical image segmentation. arXiv; 2021; arXiv: 2102.04306
12. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Proceedings of the 31st Conference on Neural Information Processing System; Long Beach, CA, USA, 4–9 December 2017.
13. Jin, Q.; Meng, Z.; Pham, T.D.; Chen, Q.; Wei, L.; Su, R. DUNet: A deformable network for retinal vessel segmentation. Knowl.-Based Syst.; 2019; 178, pp. 149-162. [DOI: https://dx.doi.org/10.1016/j.knosys.2019.04.025]
14. Hu, J.; Wang, H.; Gao, S.; Bao, M.; Liu, T.; Wang, Y.; Zhang, J. S-unet: A bridge-style u-net framework with a saliency mechanism for retinal vessel segmentation. IEEE Access; 2019; 7, pp. 174167-174177. [DOI: https://dx.doi.org/10.1109/ACCESS.2019.2940476]
15. Wang, D.; Hu, G.; Lyu, C. Frnet: An end-to-end feature refinement neural network for medical image segmentation. Vis. Comput.; 2021; 37, pp. 1101-1112. [DOI: https://dx.doi.org/10.1007/s00371-020-01855-z]
16. Yang, L.; Wang, H.; Zeng, Q.; Liu, Y.; Bian, G. A hybrid deep segmentation network for fundus vessels via deep-learning framework. Neurocomputing; 2021; 448, pp. 168-178. [DOI: https://dx.doi.org/10.1016/j.neucom.2021.03.085]
17. Dong, F.; Wu, D.; Guo, C.; Zhang, S.; Yang, B.; Gong, X. CRAUNet: A cascaded residual attention U-Net for retinal vessel segmentation. Comput. Biol. Med.; 2022; 105651. [DOI: https://dx.doi.org/10.1016/j.compbiomed.2022.105651] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/35635903]
18. Yang, X.; Li, Z.; Guo, Y.; Zhou, D. DCU-net: A deformable convolutional neural network based on cascade U-net for retinal vessel segmentation. Multimed. Tools Appl.; 2022; 81, pp. 15593-15607. [DOI: https://dx.doi.org/10.1007/s11042-022-12418-w]
19. Yang, Y.; Wan, W.; Huang, S.; Zhong, X.; Kong, X. RADCU-Net: Residual attention and dual-supervision cascaded U-Net for retinal blood vessel segmentation. Int. J. Mach. Learn. Cybern.; 2022; pp. 1-16. [DOI: https://dx.doi.org/10.1007/s13042-022-01715-3]
20. Zhang, H.; Wu, C.; Zhang, Z.; Zhu, Y.; Lin, H.; Zhang, Z.; Sun, Y.; He, T.; Mueller, J.; Manmatha, R. Resnest: Split-attention networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; New Orleans, LA, USA, 19–20 June 2022; pp. 2736-2746.
21. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Las Vegas, NV, USA, 27–30 June 2016; pp. 770-778.
22. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA, 21–26 July 2017; pp. 1492-1500.
23. Chollet, F. Xception: Deep learning with depthwise separable convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA, 21–26 July 2017; pp. 1251-1258.
24. Qin, Z.; Zhang, P.; Wu, F.; Li, X. Fcanet: Frequency channel attention networks. Proceedings of the IEEE/CVF International Conference on Computer Vision; Nashville, TN, USA, 20–25 June 2021; pp. 783-792.
25. Staal, J.; Abràmoff, M.D.; Niemeijer, M.; Viergever, M.A.; Van Ginneken, B. Ridge-based vessel segmentation in color images of the retina. IEEE Trans. Med. Imaging; 2004; 23, pp. 501-509. [DOI: https://dx.doi.org/10.1109/TMI.2004.825627] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/15084075]
26. Fraz, M.M.; Remagnino, P.; Hoppe, A.; Uyyanonvara, B.; Rudnicka, A.R.; Owen, C.G.; Barman, S.A. An ensemble classification-based approach applied to retinal blood vessel segmentation. IEEE Trans. Biomed. Eng.; 2012; 59, pp. 2538-2548. [DOI: https://dx.doi.org/10.1109/TBME.2012.2205687] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/22736688]
27. Li, L.; Verma, M.; Nakashima, Y.; Nagahara, H.; Kawasaki, R. Iternet: Retinal image segmentation utilizing structural redundancy in vessel networks. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision; Seattle, WA, USA, 13–19 June 2020; pp. 3656-3665.
28. Sun, J.; Darbehani, F.; Zaidi, M.; Wang, B. Saunet: Shape attentive u-net for interpretable medical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham, Switzerland, 2020; pp. 797-806.
29. Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; De Lange, T.; Halvorsen, P.; Johansen, H.D. Resunet++: An advanced architecture for medical image segmentation. Proceedings of the 2019 IEEE International Symposium on Multimedia (ISM); San Diego, CA, USA, 9–11 December 2019; pp. 225-2255.
30. Zhou, Y.; Huang, W.; Dong, P.; Xia, Y.; Wang, S. D-UNet: A dimension-fusion U shape network for chronic stroke lesion segmentation. IEEE/ACM Trans. Comput. Biol. Bioinform.; 2019; 18, pp. 940-950. [DOI: https://dx.doi.org/10.1109/TCBB.2019.2939522] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/31502985]
31. Xu, Q.; Duan, W.; He, N. DCSAU-Net: A Deeper and More Compact Split-Attention U-Net for Medical Image Segmentation. arXiv; 2022; arXiv: 2202.00972
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Retinal vessel segmentation remains a challenging task, yet the morphology of the retinal vessels reflects a person's health and is essential for clinical diagnosis. Accurate segmentation of the retinal vessel shape can therefore reveal a patient's physical condition in a timely manner and help prevent blindness. Traditional retinal vessel segmentation is performed manually, which is time-consuming and laborious. With the development of convolutional neural networks, U-shaped networks (U-Nets) and their variants show good performance in image segmentation. However, U-Net is prone to feature loss in the encoder convolution layers and to mismatched processing of contextual features in the skip connections. We therefore propose an improved retinal vessel segmentation method based on U-Net to segment retinal vessels accurately. To extract more features in the encoder, the convolutional layers are replaced with the ResNest network structure, enhancing image feature extraction. In addition, a Depthwise FCA Block (DFB) module is proposed to handle the mismatched processing of local contextual features by the skip connections. Experiments on two public retinal vessel segmentation datasets, DRIVE and CHASE_DB1, comparing our method against a large number of networks, confirm the effectiveness of the proposed method. Our method outperforms most segmentation networks, demonstrating its clinical value.
1 School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China
2 School of Information Science and Electrical Engineering, Shandong Jiaotong University, Jinan 250357, China; Institute of Automation, Shandong Academy of Sciences, Jinan 250013, China